|Paper title||Studies on the validation of machine learning classification results from multitemporal, multispectral Sentinel-2 data using the example of agricultural crop classification in Brandenburg (Germany).|
|Form of presentation||Poster|
Because of its large acreage and its importance in the production of food, feed and raw materials, agricultural land is an appropriate target for remote sensing (RS) applications. Additionally, agricultural production is affected by a variety of spatially and temporally varying environmental factors (e.g., diseases and water content), which must be monitored to ensure stable, renewable production of high-quality food, raw materials, and bioenergy. Environmental changes and increasingly frequent extreme weather events are also putting a strain on production conditions. Therefore, the application-oriented provision of information is a key prerequisite for farmers to react flexibly and quickly to changing environmental conditions.
Against this background, technologies are being adapted and developed that enable the rapid identification and classification of objects and phenomena. In agriculture, this often involves identifying agricultural crops and their growth development in order to plan and effectively implement suitable agronomic measures.
For this purpose, a processing chain was developed whose core routine for analyzing multitemporal data of the Sentinel-2 satellite is based on machine learning methods (Random Forest, XGBoost, Neural Network, SVM). As the validation basis for developing our method, the land parcel shape data of the land survey and geo-spatial information office of the federal state of Brandenburg were used as ground truth data. These data are based on farmers’ reports in agricultural subsidy applications (Common Agricultural Policy, CAP, of the European Union, EU) for the agricultural areas of 2018. The remote sensing data comprised 343 scenes with their metadata and covered the whole federal state of Brandenburg.
The results of our investigations can be summarized as follows:
1. The testing methodology has shown that dividing the study area into training areas and test areas is a solid way to validate the model. Simple training on the entire data set is insufficient to build a model that can classify crops in new regions of the federal state of Brandenburg.
2. Natural influencing factors such as phenological growth stages, regional environmental conditions, the number of cloud-free observations collected in each region, and the complex spectral variety in each region make it challenging to train a model that generalises well from the training data.
3. Furthermore, the testing methodology provides a framework that applies to models such as Random Forest, XGBoost, Neural Network and SVM, as well as to any other classification system.
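
The region-wise validation summarized in the points above can be illustrated with a minimal sketch. It is not the processing chain used in this study: the region names, the toy parcels, and the helpers `region_holdout_splits`, `evaluate_by_region` and `MajorityClassifier` are all hypothetical, and holding out exactly one region at a time is an assumption, since the abstract does not specify how training and test areas were drawn. The placeholder classifier only stands in for any model that offers `fit` and `predict`.

```python
from collections import Counter, defaultdict


def region_holdout_splits(regions):
    """Yield (held_out_region, train_idx, test_idx), holding out one region at a time."""
    by_region = defaultdict(list)
    for i, region in enumerate(regions):
        by_region[region].append(i)
    for held_out in sorted(by_region):
        test_idx = by_region[held_out]
        # Training indices come only from the other regions, so no parcel
        # of the held-out region is seen during training.
        train_idx = [i for region, idx in by_region.items()
                     if region != held_out for i in idx]
        yield held_out, train_idx, test_idx


class MajorityClassifier:
    """Hypothetical placeholder: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def evaluate_by_region(make_model, X, y, regions):
    """Train on all regions but one and report accuracy on the held-out region."""
    scores = {}
    for held_out, train_idx, test_idx in region_holdout_splits(regions):
        model = make_model().fit([X[i] for i in train_idx],
                                 [y[i] for i in train_idx])
        predicted = model.predict([X[i] for i in test_idx])
        truth = [y[i] for i in test_idx]
        hits = sum(p == t for p, t in zip(predicted, truth))
        scores[held_out] = hits / len(truth)
    return scores


# Toy data (invented for illustration): one feature per parcel, a crop label, a region ID.
X = [[0.1], [0.2], [0.3], [0.8], [0.9], [0.4], [0.7]]
y = ["wheat", "wheat", "wheat", "maize", "maize", "wheat", "maize"]
regions = ["north", "north", "north", "south", "south", "east", "east"]

scores = evaluate_by_region(MajorityClassifier, X, y, regions)
for region, accuracy in scores.items():
    print(region, accuracy)
```

Because the evaluation loop relies only on `fit` and `predict`, any classifier with that interface could be passed in place of the placeholder. Libraries such as scikit-learn offer comparable group-based splitters (for example `LeaveOneGroupOut`), which may be preferable in practice.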
The core of our results is an integrated testing methodology that validates the generalizability of trained machine learning models and allows conclusions about how well crops can be identified in previously unseen regions.