|Paper title||Crop Classification Synthetic Training Data Generation With Use Of Generative Adversarial Network|
|Form of presentation||Poster|
Crop classification still has a large number of unsolved problems that require new methods and instruments to achieve maximum mapping reliability. Some of these problems can be partly solved with state-of-the-art computer vision techniques, which make it possible to build very accurate land cover and crop type maps. Such methods already show good performance on small experimental sites all around the world. However, applying them at the regional or even country level is still very challenging, and the main challenge is not even the large amount of computational resources required. Modern convolutional deep learning methods require training data in a special format: usually manually and fully labeled squares of fixed size.
In forming a data collection for the crop classification task, the biggest problem is that crop features cannot be accurately photointerpreted for training data labelling. Real-life crop profiles have very high feature variability, so analysis of NDVI sequences, or visual analysis of true color or any other combination of satellite bands, is neither accurate nor reliable. The only way to form a good training or validation dataset is therefore a ground survey along the roads of the territory of interest. This creates another problem: the real-life distribution of crop types on land is not uniform, so the resulting datasets are very unbalanced in machine learning terms. Commonly, most ground truth samples represent majority crop classes, while minority classes are represented by only a few samples. In pixel-based classification this problem can be fixed in various ways. The most common method is to read a fixed number of pixels from the satellite data for each field, so that the distribution of pixels across classes is uniform and balanced. Another way is to extend the number of pixels for minority classes by simulating new values from the available ones, adding random noise to control dispersion and avoid overfitting. However, such approaches do not work with convolutional neural networks. If a ground truth data collection contains thousands of fields, millions of pixels can be obtained from moderate or high resolution satellite data for pixel-based classification. The same fields, however, cover only a few hundred fully labeled squares for the segmentation task. This is why, for crop classification, the development of robust ground truth data simulation methods is very promising.
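The two pixel-based balancing strategies described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the Gaussian form of the noise, the `noise_std` value, and the toy 12-date feature arrays are all assumptions made for the example.

```python
import numpy as np

def sample_fixed_per_field(field_pixels, n_per_field, rng):
    """Draw the same number of pixels from one field (hypothetical helper).

    field_pixels has shape (n_pixels, n_features); sampling is done with
    replacement only when the field holds fewer pixels than requested.
    """
    n_available = len(field_pixels)
    idx = rng.choice(n_available, size=n_per_field,
                     replace=n_available < n_per_field)
    return field_pixels[idx]

def oversample_with_noise(pixels, n_target, noise_std, rng):
    """Extend a minority class to n_target samples (hypothetical helper).

    New samples are copies of randomly chosen existing pixels with
    Gaussian noise added; Gaussian noise is an assumption of this sketch.
    """
    n_new = n_target - len(pixels)
    if n_new <= 0:
        return pixels
    base = pixels[rng.integers(0, len(pixels), size=n_new)]
    synthetic = base + rng.normal(0.0, noise_std, size=base.shape)
    return np.concatenate([pixels, synthetic], axis=0)

rng = np.random.default_rng(0)
# Toy data: 12 features per pixel (e.g. one value per acquisition date).
majority = rng.normal(0.6, 0.1, size=(1000, 12))
minority = rng.normal(0.3, 0.1, size=(40, 12))

field_sample = sample_fixed_per_field(majority, 50, rng)
minority_balanced = oversample_with_noise(minority, len(majority), 0.02, rng)
print(field_sample.shape, minority_balanced.shape)
```

Both helpers operate on independent pixel vectors, which is why they do not carry over to segmentation networks that need spatially coherent, fully labeled squares.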
In this work we present a new method of synthetic training data generation for the crop classification task, based on a deep Generative Adversarial Network (GAN). The method uses a computer vision approach: image-to-image translation. We trained the GAN on the available ground truth data to generate time series of the VV and VH polarization bands of Sentinel-1 from segmentation masks of 256x256 size. The resulting model makes it possible to simulate realistic images with different distributions of minority classes. Combining data simulated by this method with real data gave us better recognition of minority classes on crop classification maps built with U-Net, a deep convolutional neural network architecture.
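As a rough illustration of the input side of such an image-to-image setup, the sketch below encodes a 256x256 segmentation mask as a one-hot array, a common way to condition a generator on class labels, and reports the share of each class in the mask. The abstract does not describe the generator or its input encoding, so the one-hot encoding, `NUM_CLASSES`, the function names, and the toy field layout are assumptions made for this example only.

```python
import numpy as np

NUM_CLASSES = 5  # hypothetical number of crop classes
SIZE = 256       # mask size stated in the text

def one_hot_mask(mask, num_classes):
    """Turn an (H, W) integer label mask into a (num_classes, H, W) array."""
    classes = np.arange(num_classes)[:, None, None]
    return (classes == mask[None, :, :]).astype(np.float32)

def class_fractions(mask, num_classes):
    """Share of pixels belonging to each class in a mask."""
    counts = np.bincount(mask.ravel(), minlength=num_classes)
    return counts / mask.size

# Toy mask with rectangular "fields"; class 3 plays the minority crop.
mask = np.zeros((SIZE, SIZE), dtype=np.int64)
mask[:128, :128] = 1
mask[:128, 128:] = 2
mask[128:, :64] = 3

condition = one_hot_mask(mask, NUM_CLASSES)
fractions = class_fractions(mask, NUM_CLASSES)
print(condition.shape)
print(fractions)
```

Enlarging or adding regions of a minority class in the mask changes `fractions`, which is one way masks with different minority class distributions could be prepared before being passed to a trained generator.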