|Paper title|Updating the Walloon land cover map by operational application of a supervised deep learning segmentation model|
|Form of presentation|Poster|
This abstract highlights how a novel approach based on a deep learning segmentation model was developed and implemented to generate land cover maps by fusing multiple data sources. The solution was designed with particular emphasis on robustness, architectural simplicity, and limited dependencies.
To deal with regional environmental, climatic, and territorial management challenges, authorities need a precise and frequently updated representation of the fast-changing urban-rural landscape. In 2018, the WALOUS project was launched by the Public Service of Wallonia, Belgium, to develop reproducible methodologies for mapping Land Cover (LC) and Land Use (LU) (Beaumont et al. 2021) in the Walloon region. The first edition of the project was led by a consortium of universities and research centres and lasted three years. In 2020, the resulting LC and LU maps for 2018, based on an object-based classification approach (Bassine et al. 2020), replaced the outdated 2007 map (Baltus et al. 2007) and allowed the regional authorities to meet the requirements of the European INSPIRE Directive. However, although end-users had suggested that regional authorities should be able to update these maps on a yearly basis, in line with the aerial imagery acquisition strategy (Beaumont et al. 2019), the Walloon administration quickly realized that it did not have the resources to understand and reproduce the method, because of its complexity and a relatively concise handover. A new edition of the WALOUS project started in 2021 to bridge those gaps. AEROSPACELAB, a private Belgian company, was selected for WALOUS's second edition on the strength of its proposal to simplify and automate the LC map generation process with a supervised deep learning segmentation model.
An LC map assigns to each pixel of a georeferenced raster a class describing its artificial or natural cover. The task for the model is therefore to predict the class of each pixel, resulting in a semantically segmented map. Several approaches have been suggested in the literature for this task. They can be grouped into three main categories, each with its own strengths and weaknesses:
• Pixel-based classification
These models classify each pixel independently of its neighbors. This lack of cohesion between the classifications of neighboring pixels can result in speckle, or a “salt and pepper” effect (Belgiu et al. 2018). Another drawback of this approach is its long inference time.
• Object-based classification
The classification is done for a group of pixels simultaneously, hence reducing the speckle effect and the inference time. However, the question of how to group pixels into homogeneous objects must then be addressed: a clustering algorithm based on spatial, temporal, and spectral criteria has to be defined to avoid both over-segmentation and under-segmentation.
• Deep Learning segmentation
Deep learning segmentation models do not require as much feature engineering. The segmentation and classification of the pixels are done simultaneously, ensuring strong cohesion in the resulting predictions. However, these models tend to produce smoother object boundaries than the sharper ones from the other approaches. This can be a drawback when segmenting artificial objects, which often have clear boundaries; it is less of a concern for natural classes, whose transitions are less clearly defined.
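As an illustration of the speckle issue mentioned for pixel-based classifiers (this is a generic sketch, not part of the WALOUS pipeline), a minimal majority filter shows how isolated misclassified pixels can be smoothed away after per-pixel classification:

```python
from collections import Counter

def mode_filter(labels, window=3):
    """Replace each interior pixel's class by the majority class in its
    window x window neighborhood (border pixels are kept as-is).
    Illustrates post-hoc smoothing of a per-pixel classification."""
    h, w = len(labels), len(labels[0])
    r = window // 2
    out = [row[:] for row in labels]
    for y in range(r, h - r):
        for x in range(r, w - r):
            neigh = [labels[yy][xx]
                     for yy in range(y - r, y + r + 1)
                     for xx in range(x - r, x + r + 1)]
            out[y][x] = Counter(neigh).most_common(1)[0][0]
    return out

# A lone "water" pixel inside a grass patch is removed:
grid = [["grass"] * 3 for _ in range(3)]
grid[1][1] = "water"
print(mode_filter(grid)[1][1])  # grass
```

Such filtering reduces speckle at the cost of erasing genuinely small objects, which is one reason cohesion built into the model itself is preferable.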
The solution implemented for WALOUS’s second edition revolves around a deep learning segmentation model based on the DEEPLAB V3+ architecture (Chen et al. 2017; Chen et al. 2018). This architecture was selected to facilitate the segmentation of objects at different scales: lakes, forests, and buildings can all be observed at different scales in aerial imagery. Segmenting objects that exist at multiple scales can be challenging for a model whose fields-of-view are not dimensioned appropriately. DEEPLAB V3+’s main distinguishing features, atrous convolutions and atrous spatial pyramid pooling, alleviate this problem without much impact on inference time: atrous convolutions widen the fields-of-view without increasing the kernel’s dimensions. Two technical adjustments were made to tailor the architecture to the task: on the one hand, the segmentation head was adjusted to output the 11 classes representing the different ground covers; on the other hand, the input layer was altered to accept the 5 data sources. Figure 2 offers a high-level overview of the overall architecture of the solution.
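The effect of atrous convolutions on the field-of-view can be made concrete with the standard effective-kernel-size formula, k_eff = k + (k − 1)(d − 1): the number of learned weights stays the same while the spatial extent grows with the dilation rate d. A minimal sketch (the rates 6, 12, and 18 are the ones commonly used in DEEPLAB V3+'s atrous spatial pyramid pooling, not values confirmed by the abstract):

```python
def effective_kernel_size(kernel: int, dilation: int) -> int:
    """Spatial extent covered by a dilated (atrous) convolution.
    The number of weights stays kernel*kernel; only the spacing
    between the sampled pixels grows with the dilation rate."""
    return kernel + (kernel - 1) * (dilation - 1)

# ASPP combines several dilation rates on 3x3 kernels:
for rate in (1, 6, 12, 18):
    print(rate, effective_kernel_size(3, rate))
# 1 -> 3, 6 -> 13, 12 -> 25, 18 -> 37
```

A 3x3 kernel at dilation 18 thus covers a 37x37 window with only nine weights, which is how the model captures both small buildings and large forests without prohibitive cost.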
Data fusion was a key aspect of this solution as the model was trained on various sources with different spatial resolutions:
• high-resolution aerial imagery with 4 spectral bands (Red, Blue, Green, and Near-Infrared) and a ground sample distance (GSD) of 0.25m;
• digital terrain model obtained via LiDAR technology; and
• digital surface model derived from the aforementioned high-resolution aerial imagery by photogrammetry.
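How sources of different spatial resolutions can be fused into one multi-channel model input can be sketched as follows; the resampling rule (nearest-neighbor) and the resolution ratio are illustrative assumptions, not details taken from the abstract:

```python
def upsample_nearest(band, factor):
    """Nearest-neighbor upsampling of a 2-D raster (list of lists)."""
    return [[band[y // factor][x // factor]
             for x in range(len(band[0]) * factor)]
            for y in range(len(band) * factor)]

def stack_inputs(rgbn, dtm, dsm, elev_factor):
    """Fuse sources into one multi-channel input: the 4-band aerial
    imagery is kept at native resolution, the (assumed coarser)
    elevation models are upsampled to match it, then all channels
    are stacked per pixel."""
    dtm_up = upsample_nearest(dtm, elev_factor)
    dsm_up = upsample_nearest(dsm, elev_factor)
    h, w = len(dtm_up), len(dtm_up[0])
    return [[[*(b[y][x] for b in rgbn), dtm_up[y][x], dsm_up[y][x]]
             for x in range(w)] for y in range(h)]
```

In practice this alignment is done with a geospatial library rather than by hand, but the principle is the same: resample everything to the finest grid, then concatenate along the channel axis.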
The model was first pre-trained using the LC map from WALOUS’s previous edition (artificially augmented), and then fine-tuned on a set of highly detailed and accurate LC tiles that were manually labelled.
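The two-phase strategy, pre-training on abundant but noisier labels followed by fine-tuning on few but accurate ones, can be summarized as a schedule; all hyper-parameter values below are hypothetical, since the abstract gives none:

```python
# Hypothetical two-phase schedule; epochs and learning rates are
# illustrative, not values reported by WALOUS.
PHASES = [
    {"name": "pre-training",
     "labels": "previous-edition LC map (artificially augmented)",
     "epochs": 50, "lr": 1e-3},
    {"name": "fine-tuning",
     "labels": "manually labelled, highly accurate tiles",
     "epochs": 10, "lr": 1e-4},  # smaller LR to preserve learned weights
]

def run_schedule(phases, train_fn):
    """Run each phase in order; train_fn configures the optimiser
    from the phase dict and trains on the indicated labels."""
    for phase in phases:
        train_fn(phase)
```

The lower learning rate in the second phase is the usual way to adapt a pre-trained network without destroying what it learned from the weak labels.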
As many model architectures and data sources were considered, the model was implemented with the open-source DETECTRON2 framework (Wu et al. 2019), which allows for rapid prototyping. Among the initial prototypes, a POINTREND extension (Kirillov et al. 2020) was studied to improve segmentation at objects’ boundaries, and a ConvLSTM was implemented to segment satellite imagery with high temporal and spectral resolutions, such as Sentinel-2 (Rußwurm et al. 2018). The latter facilitates the discrimination of classes that have similar spectral signatures in a single high (spatial) resolution image but clearly distinguishable signatures when sampled over a year (e.g., softwood versus hardwood, or grass cover versus agricultural parcel).
The final model segments Wallonia into 11 classes, ranging from natural covers (grass cover, agricultural parcel, softwood, hardwood, and water) to artificial ones (artificial cover, artificial construction, and railway). It achieves an overall accuracy of 92.29% on a test set of 1,710 photo-interpreted points. Figure 2 gives an overview of the various predictions (GSD: 0.25 m) made by the model. Moreover, besides updating the LC map, the solution also compares the new predictions with the previous LC map and derives a change map highlighting, for each pixel, the LC transitions that may have occurred between the two studied years.
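The change-map derivation amounts to a pixel-wise comparison of the two LC rasters; a minimal sketch (the class names and the None/pair encoding are illustrative, not the encoding used by WALOUS):

```python
def change_map(lc_old, lc_new):
    """Derive a per-pixel change map between two LC rasters:
    None where the class is unchanged, an (old, new) transition
    pair where it changed."""
    return [[None if a == b else (a, b)
             for a, b in zip(row_old, row_new)]
            for row_old, row_new in zip(lc_old, lc_new)]

old = [["grass", "grass"], ["hardwood", "water"]]
new = [["grass", "artificial cover"], ["hardwood", "water"]]
print(change_map(old, new))
# [[None, ('grass', 'artificial cover')], [None, None]]
```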
In conclusion, the newly implemented algorithm generated the new 2019 and 2020 LC maps, resampled at 1 m/pixel; these were published in early 2022. Although it relies on fewer data sources and requires less feature engineering than the object-based classification model implemented for the first edition of the WALOUS project, the new approach shows similar performance. Its reduced complexity played a favorable role in its appropriation by the local authorities. Finally, the public administration will be trained to run the AI algorithm on each new annual aerial imagery acquisition.
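Resampling a categorical map from the 0.25 m prediction GSD to the published 1 m GSD can be done, for instance, by a per-block majority vote over 4x4 blocks; the abstract does not state which resampling rule was actually used, so this is only one plausible sketch:

```python
from collections import Counter

def downsample_majority(labels, factor=4):
    """Resample a categorical raster by taking, for each
    factor x factor block, the most frequent class. Going from
    a 0.25 m to a 1 m GSD corresponds to factor=4."""
    h, w = len(labels) // factor, len(labels[0]) // factor
    out = []
    for by in range(h):
        row = []
        for bx in range(w):
            block = [labels[by * factor + dy][bx * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(Counter(block).most_common(1)[0][0])
        out.append(row)
    return out
```

Majority voting is a common choice for categorical rasters because averaging class labels, as one would for continuous data, is meaningless.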
Baltus, C.; Lejeune, P.; and Feltz, C., Mise en œuvre du projet de cartographie numérique de l’Occupation du Sol en Wallonie (PCNOSW), Faculté Universitaire des Sciences Agronomiques de Gembloux, 2007, unpublished
Beaumont, B.; Stephenne, N.; Wyard, C.; and Hallot, E.; Users’ Consultation Process in Building a Land Cover and Land Use Database for the Official Walloon Georeferential. 2019 Joint Urban Remote Sensing Event (JURSE), Vannes, France, 1–4. doi:10.1109/JURSE.2019.8808943
Beaumont, B.; Grippa, T.; Lennert, M.; Radoux, J.; Bassine, C.; Defourny, P.; Wolff, E., An Open Source Mapping Scheme For Developing Wallonia's INSPIRE Compliant Land Cover And Land Use Datasets. 2021.
Bassine, C.; Radoux, J.; Beaumont, B.; Grippa, T.; Lennert, M.; Champagne, C.; De Vroey, M.; Martinet, A.; Bouchez, O.; Deffense, N.; Hallot, E.; Wolff, E.; Defourny, P. First 1-M Resolution Land Cover Map Labeling the Overlap in the 3rd Dimension: The 2018 Map for Wallonia. Data 2020, 5, 117. https://doi.org/10.3390/data5040117
Chen, L.-C.; Papandreou, G.; Schroff, F.; Adam, H., Rethinking Atrous Convolution for Semantic Image Segmentation. Cornell University / Computer Vision and Pattern Recognition. December 5, 2017.
Chen, L.-C., Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H., Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. ECCV. 2018
Wu, Y.; Kirillov, A.; Massa, F.; Lo, W.-Y.; Girshick, R., Detectron2. https://github.com/facebookresearch/detectron2. 2019.
Kirillov, A.; Wu, Y.; He, K.; Girshick, R., PointRend: Image Segmentation as Rendering. February 16, 2020.
Rußwurm, M.; Korner, M., Multi-Temporal Land Cover Classification with Sequential Recurrent Encoders. International Journal of Geo-Information. March 21, 2018.
Belgiu, M.; Csillik, O., Sentinel-2 cropland mapping using pixel-based and object-based time-weighted dynamic time warping analysis. Remote Sensing of Environment. 2018, pp. 509-523.