Day 4

Detailed paper information

Paper title Software package for generating synthetic SAR interferograms as training datasets for machine learning algorithms
  1. István Bozsó Institute of Earth Physics and Space Science (ELKH EPSS) Speaker
  2. Tamás Bozóki Institute of Earth Physics and Space Science (ELKH EPSS)
  3. András Horváth Pázmány Péter Catholic University
  4. Lukács Kuslits Institute of Earth Physics and Space Science (ELKH EPSS)
  5. Máté Timkó Institute of Earth Physics and Space Science (ELKH EPSS)
Form of presentation Poster
  • Open Earth Forum
    • C5.03 Open Source, data science and toolboxes in EO: Current status & evolution
Abstract text In the last decade, advances in CPU and GPU performance, the availability of large datasets, and the proliferation of machine learning (ML) algorithms and software libraries have made the daily use of ML not merely possible but routine in many areas.
Unsupervised and supervised classification, precursors to more sophisticated ML algorithms, have been used extensively in many scientific areas; they have allowed researchers to recognize patterns, reduce subjective bias in categorization and handle large datasets. Classification algorithms have been widely used in remote sensing to efficiently identify areas with similar surface coverage and scattering characteristics (urban, agricultural, forest, flooded areas, etc.). Indeed, remote sensing is a prime target for developing ML algorithms, as the volume, diversity (more frequency channels, multiple satellites) and availability of freely accessible datasets are increasing year by year.
The advent of the Sentinel satellites of the Copernicus Earth observation programme started a new era in satellite remote sensing. The data produced by the Sentinel satellites, a vast database of remotely sensed images surpassing in volume any previous satellite image database, are available to the public. This has allowed remote sensing specialists and geoscientists to train and apply ML models on the data provided by Copernicus, in order to solve a wide range of processing challenges and classification problems that arise when dealing with such volumes of data.
Synthetic Aperture Radar (SAR) is a relatively novel remote sensing technology that allows the observation of the surface of the Earth in the microwave spectrum. ESA has been a pioneer in using satellite-mounted SAR antennas as a means of microwave Earth observation (ERS-1 and ERS-2, Envisat), and the twin Sentinel-1 A and B satellites continue that tradition as dedicated SAR satellites in the Copernicus fleet.
SAR remote sensing has many advantages over "classical" remote sensing, which operates in and around the visible range of the electromagnetic (EM) spectrum. It is an active remote sensing technique, so it does not depend on external EM wave sources (e.g. the Sun), and the emitted microwaves are not absorbed by cloud cover and other atmospheric phenomena. Furthermore, it is a coherent sensing technique, meaning that both the amplitude and the phase of the reflected EM wave are captured. Phase information can be used to create so-called interferograms by subtracting the phase values of a primary SAR image from those of a secondary one.
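The phase subtraction described above can be sketched in a few lines. This is a minimal illustration that assumes two already co-registered complex SAR images held as NumPy arrays; the function name `form_interferogram` and the toy input data are hypothetical and are not taken from the presented package.

```python
import numpy as np


def form_interferogram(primary, secondary):
    """Return the wrapped interferometric phase in radians.

    Multiplying the secondary image by the complex conjugate of the
    primary image subtracts the primary phase from the secondary phase.
    """
    return np.angle(secondary * np.conj(primary))


# Toy data (assumption): the secondary image has the same phase as the
# primary one plus a constant shift of 0.5 rad, and a different amplitude.
rng = np.random.default_rng(0)
shape = (4, 5)
phase_primary = rng.uniform(-np.pi, np.pi, shape)
phase_shift = 0.5

primary = np.exp(1j * phase_primary)
secondary = 2.0 * np.exp(1j * (phase_primary + phase_shift))

ifg = form_interferogram(primary, secondary)
```

In this toy case every pixel of `ifg` equals the constant shift, because the amplitudes do not influence the phase difference.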
The phase difference stored in an interferogram, the interferometric phase, is the sum of many components, such as the difference between the satellite positions when the two images were taken, surface topography, changes in atmospheric and ionospheric conditions, the satellite line-of-sight (LOS) component of surface deformation and other factors. By subtracting all components other than the deformation component, it is possible to estimate the surface deformation map of the imaged area. A critical step in processing the interferogram is the so-called phase unwrapping, which restores the integer multiples of 2π that are missing from the temporal and spatial variations of the phase, since the measured phase itself is periodic (wrapped phase).
Phase unwrapping is a non-linear and non-trivial problem. Its success depends on the quality of the input interferograms and on the configuration of the selected preprocessing steps (filtering, masking out incoherent areas, excluding interferograms from processing).
Many software packages implement some form of phase unwrapping algorithm, and they have been used successfully in many surface deformation studies (volcano deformation monitoring, detection of surface deformation caused by earthquakes, displacements caused by mining activities, etc.). Despite these successes, phase unwrapping remains a challenge in the field of SAR interferometry (InSAR).
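A one-dimensional toy example may help to illustrate wrapping and unwrapping. The sketch below assumes a noise-free, densely sampled phase ramp and uses NumPy's `np.unwrap`, which is a simple 1D method and not one of the 2D algorithms used by InSAR processing packages; the helper `wrap` is a hypothetical name.

```python
import numpy as np


def wrap(phase):
    """Wrap phase values (radians) into the interval [-pi, pi)."""
    return (phase + np.pi) % (2 * np.pi) - np.pi


# A smooth phase ramp spanning three full cycles (0 to 6*pi).
x = np.linspace(0.0, 1.0, 200)
true_phase = 6 * np.pi * x

# What an interferogram would store: the phase modulo 2*pi.
wrapped = wrap(true_phase)

# np.unwrap adds multiples of 2*pi wherever neighbouring samples jump
# by more than pi, which works here because the ramp is sampled densely.
unwrapped = np.unwrap(wrapped)
```

With noise, steep phase gradients or incoherent gaps, the jump between neighbouring samples can no longer be attributed unambiguously to wrapping, which is one reason the 2D problem is difficult.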
In order to train an ML algorithm, a training dataset is necessary, which provides the expected outputs for selected inputs. During training, a subset of the dataset is used for the actual training of the model (e.g. a deep neural network) and the rest is used to validate the trained model.
ML can be a powerful tool, and many interferogram processing steps (removal of atmospheric phase, phase unwrapping, detection of deformation) could benefit from incorporating it in some form. However, modern ML algorithms require a vast amount of data, and the manual acquisition and labeling of datasets is a cumbersome and tedious task.
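The split into training and validation subsets can be sketched as follows. This is a generic illustration, assuming that inputs and expected outputs are stored as NumPy arrays with one sample per row; the function `split_dataset`, its default 20 % validation fraction and the toy data are assumptions made for the example.

```python
import numpy as np


def split_dataset(inputs, targets, validation_fraction=0.2, seed=0):
    """Randomly split paired inputs/targets into training and validation sets."""
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must contain the same number of samples")

    # Shuffle the sample indices reproducibly, then cut them in two.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(inputs))
    n_val = int(round(validation_fraction * len(inputs)))
    val_idx, train_idx = order[:n_val], order[n_val:]

    return (inputs[train_idx], targets[train_idx]), (inputs[val_idx], targets[val_idx])


# Toy data (assumption): ten samples whose expected output is twice the input.
inputs = np.arange(10).reshape(10, 1)
targets = inputs * 2

(train_x, train_y), (val_x, val_y) = split_dataset(inputs, targets)
```

The same index arrays are applied to inputs and targets, so each input stays paired with its expected output after shuffling.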
Although a substantial amount of interferometric data can be derived from Sentinel-1 A and B SAR images, the (pre)processing and creation of interferograms remains a computationally costly operation. The issue of creating a training dataset of interferograms that can be utilized in various ML frameworks is still unresolved. A perhaps bigger problem is the lack of expected “output” values that are paired with input interferograms (e.g. atmospheric phase delay, unwrapped phase values).
Training on synthetic data is a current trend in ML and, applied along with transfer learning and domain adaptation, this approach has achieved breakthroughs in various applications. The authors set out to create a software package / library that can be reliably used to generate synthetic interferograms. The package is written in the Python programming language, utilizing its vast ecosystem of scientific libraries. The choice of programming language also allows easy integration with existing ML frameworks available in Python. Different parts of the interferogram generation, such as atmospheric delay and noise generation, as well as the deformation model and its parameters, can be individually configured and replaced by user-defined algorithms, making the code open to extension.
The creation of synthetic interferograms can also be utilized in the education and training of future InSAR specialists. By tweaking the configuration of interferogram generation, aspiring specialists are able to estimate how a change in different parameters (e.g. strength of atmospheric noise, satellite geometry) changes the interferometric phase and the outcome of phase unwrapping.
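The idea of replaceable generation components can be illustrated with a short sketch. It is not the API of the presented package: every function name, parameter and model below (a Gaussian deformation bowl, a linear atmospheric ramp, white noise) is a hypothetical stand-in chosen only to show how components could be passed in as plain callables.

```python
import numpy as np


def wrap(phase):
    """Wrap phase values (radians) into the interval [-pi, pi)."""
    return (phase + np.pi) % (2 * np.pi) - np.pi


def gaussian_deformation(shape, amplitude=10.0, width=0.2):
    """Toy LOS deformation phase: a Gaussian bowl centred in the image."""
    x, y = np.meshgrid(np.linspace(-1, 1, shape[1]), np.linspace(-1, 1, shape[0]))
    return amplitude * np.exp(-(x**2 + y**2) / (2 * width**2))


def planar_atmosphere(shape, slope=2.0):
    """Toy atmospheric delay: a linear phase ramp across the columns."""
    return np.broadcast_to(slope * np.linspace(0, 1, shape[1]), shape).copy()


def white_noise(shape, sigma=0.1, rng=None):
    """Toy phase noise: independent Gaussian noise per pixel."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.normal(0.0, sigma, shape)


def generate_sample(shape, deformation=gaussian_deformation,
                    atmosphere=planar_atmosphere, noise=white_noise, seed=0):
    """Build one training sample: wrapped input plus its expected outputs.

    Each component is a callable taking the image shape; the noise
    callable must additionally accept an `rng` keyword argument.
    """
    rng = np.random.default_rng(seed)
    defo = deformation(shape)
    atmo = atmosphere(shape)
    unwrapped = defo + atmo + noise(shape, rng=rng)
    return {"wrapped": wrap(unwrapped), "unwrapped": unwrapped,
            "deformation": defo, "atmosphere": atmo}


sample = generate_sample((64, 64))

# Swapping in a user-defined component, here a noise-free variant.
clean = generate_sample((8, 8), noise=lambda shape, rng=None: np.zeros(shape))
```

Because the unwrapped phase and its individual components are returned together with the wrapped phase, each synthetic interferogram comes with the paired expected outputs that are hard to obtain for real data.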