Day 4

Detailed paper information

Paper title Semi-supervised Sentinel-2 tree species detection
Authors
  1. Daniele Fantin Science and Technology AS Speaker
  2. Martijn Vermeer Science and Technology AS
  3. David Völgyes Science and Technology AS
Form of presentation Poster
Topics
  • C1. AI and Data Analytics
    • C1.04 AI4EO applications for Land and Water
Abstract text A complete and up-to-date forest inventory is essential for forest management. Forest inventories typically store information about forest stands: roughly uniform areas within the forest that are managed as a single unit. One of the most important parameters of a forest stand is the volumetric tree species distribution. In Norway, three main tree species are used for production: Norway spruce, Scots pine and birch. Currently, the tree species distribution per stand is determined manually by a forestry expert, mostly through visual interpretation of aerial imagery and in some cases lidar data. Tree species mapping is therefore expensive, error-prone and time-consuming, and as a result forest inventories are often incomplete and/or outdated.

Deep learning (DL) is becoming ubiquitous in state-of-the-art land cover classification. Previous approaches to tree species detection in Norway relied on classic machine learning, were evaluated on small areas, or did not consider label noise and limited data. S&T already uses CNNs to segment aerial imagery and derive tree species; however, several challenges remain.
First, aerial imagery in Norway is only available approximately every five years. Although aerial imagery provides a very high spatial resolution of around 0.2 m, its spectral and temporal resolution is limited. Sentinel-2 (S2) could complement aerial imagery by providing higher spectral and temporal resolution. Birch stands in particular could potentially be distinguished by tracking spectral change throughout the year.
Another major challenge is the availability and quality of reference data. Although data is available for different municipalities across the country, there are large areas without labeled data. Furthermore, existing labels are imperfect and contain some degree of noise. Limited quantity and quality of reference data is a general challenge when applying deep learning to earth observation.
Noise-robust and semi-supervised training schemes could address the limited quality and quantity of reference data. Recent developments in semi-supervised learning in other fields, such as image classification and natural language processing, show very promising results. However, the usefulness of these approaches has not yet been fully explored in earth observation.

This project builds upon previous efforts and tries to address the challenges described above. The main objective is to improve automated tree species classification from remotely sensed data over Norwegian production forests by exploiting advanced DL techniques. Secondary objectives are: 1) exploiting S2 for improved birch detection; 2) investigating noise detection and noise-robust techniques for handling reference labels of limited quality; 3) investigating semi-supervised techniques for handling a limited quantity of reference labels.

The main approach will be to train several relatively standard CNN baseline models and to compare improved models against these baselines in order to evaluate the impact of each technique. The study focuses on three main topics:
1) Sentinel-2: The incorporation of S2 as a data source in addition to aerial imagery. This will be done by fusing S2 and aerial imagery and training a model on the combined dataset. Fusing will be done either by resampling both sources to the same grid or by designing a custom CNN in which S2 data enters the network at a deeper stage, after several pooling layers.
2) Noise detection and noise-robust training: Multiple models will be trained with different amounts of artificial noise added to the training data, using both a standard and a noise-robust training scheme. Comparing the standard scheme with the noise-robust scheme makes it possible to evaluate the effectiveness of noise-robust training. In addition, area under the margin (AUM) ranking will be used to identify mislabeled data.
3) Semi-supervised: First, multiple models will be trained on training sets of reduced size, e.g. reduced by 20%, 40%, 60%, etc. Second, the unused training data will be added to the training scheme as unlabeled samples, with the aim of recovering some of the accuracy lost through the reduced number of labels. In this way the effectiveness of the semi-supervised approach can be evaluated. One particular semi-supervised approach that will be evaluated is the consistency loss.
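
As a rough illustration of items 2) and 3), the sketch below shows one possible form of the three building blocks: injecting artificial label noise, scoring a sample by its area under the margin, and computing a consistency loss on unlabeled data. It is not the project's implementation. The function names, the symmetric noise model, the mean-squared-error form of the consistency loss and the example logits are all assumptions made for illustration.

```python
import math
import random


def flip_labels(labels, noise_rate, num_classes, seed=0):
    """Symmetric label noise (assumed noise model): each label is replaced,
    with probability noise_rate, by a different class chosen uniformly."""
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            y = rng.choice([c for c in range(num_classes) if c != y])
        noisy.append(y)
    return noisy


def margin(logits, label):
    """Logit of the assigned class minus the largest other logit."""
    others = [z for c, z in enumerate(logits) if c != label]
    return logits[label] - max(others)


def area_under_margin(logit_history, label):
    """AUM of one sample: its margin averaged over training epochs.
    Low or negative values suggest the label may be wrong."""
    return sum(margin(z, label) for z in logit_history) / len(logit_history)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def consistency_loss(logits_a, logits_b):
    """Mean squared difference between the class probabilities predicted
    for two perturbed views of the same unlabeled sample (assumed form)."""
    p, q = softmax(logits_a), softmax(logits_b)
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)


# Hypothetical logits for one sample over three epochs; the class order
# 0 = spruce, 1 = pine, 2 = birch is an assumption for this example.
history = [[2.0, 0.5, 0.1], [2.5, 0.4, 0.2], [3.0, 0.2, 0.1]]
aum_if_spruce = area_under_margin(history, label=0)  # positive margin
aum_if_birch = area_under_margin(history, label=2)   # negative margin
```

In this toy example the sample scores a positive AUM when labeled spruce and a negative AUM when labeled birch, so ranking samples by AUM would place the birch-labeled version among the candidates for mislabeled data. The consistency loss needs no label at all, which is what allows the unused training data to contribute as unlabeled samples.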

The direct impact of this study will be improved tree species detection over Norway. More importantly, however, the study aims to contribute to the more general challenge of dealing with reference data of limited quantity and quality within DL for earth observation.

The final results of the project will be published in a peer-reviewed scientific journal. The project kicked off in October 2021 and will last one year.