Day 4

Detailed paper information

Paper title A Physics-based ML approach for soil moisture estimation with simulated SAR data
  1. Lorenzo Giuliano Papale Tor Vergata University of Rome Speaker
  2. Giovanni Schiavon Tor Vergata University of Rome
  3. Fabio Del Frate Tor Vergata University of Rome
  4. Leila Guerriero Università di Roma Tor Vergata
Form of presentation Poster
  • C1. AI and Data Analytics
    • C1.07 ML4Earth: Machine Learning for Earth Sciences
Abstract text In recent years, Artificial Intelligence (AI), and Machine Learning (ML) algorithms in particular, has proven to be a valuable instrument for Earth Observation (EO) applications designed to retrieve information from Remote Sensing (RS) data. ML-based techniques have advanced EO applications so notably that the acronym AI4EO (Artificial Intelligence for Earth Observation) has caught on in recent studies, publications and initiatives. The vast amount of available data has driven a shift away from traditional geospatial data analysis approaches, and ML techniques are now often used to transform data into valuable information representing real-world phenomena. Nevertheless, the lack or shortage of labelled data and ground truth is one of the most critical obstacles to applying supervised ML algorithms. The feasibility of generating labelled data varies with the type of EO application. For object detection and land cover applications, EO data users can label data directly through manual or automatic mapping. Labelling geophysical parameters is far more challenging, because in-situ measurements are in most cases limited and hard to retrieve.
Moreover, data-driven approaches such as ML models carry a risk: it becomes difficult to understand the intrinsic relations between the input variables and the physical meaning behind the mapping that takes place inside Artificial Neural Networks (ANNs). To avoid such a “black-box” approach, the proposed work combines electromagnetic data modelling with the design and development of ML models.
In this regard, over the last 30-40 years, scientists and researchers have proposed and developed several electromagnetic models based on radiative transfer theory that are suitable for generating large datasets for AI applications. In particular, electromagnetic models make it possible to build a dataset by simulating radar acquisitions for different sensor configurations (e.g., signal frequency, polarization, and incidence angle), which would be more laborious and time-consuming to obtain with real data (i.e., satellite measurements).
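As a minimal sketch of what sweeping sensor configurations can look like, the snippet below enumerates every combination of frequency, polarization and incidence angle. The specific values and the helper name `sensor_configurations` are hypothetical placeholders chosen for illustration; they are not the settings used in this study.

```python
from itertools import product

# Hypothetical placeholder values, NOT the configurations used in the study.
frequencies_ghz = [1.4, 5.4, 9.6]
polarizations = ["HH", "VV", "HV"]
incidence_angles_deg = [20, 30, 40]


def sensor_configurations(freqs, pols, angles):
    """Return every (frequency, polarization, incidence angle) combination."""
    return [
        {"frequency_ghz": f, "polarization": p, "incidence_deg": a}
        for f, p, a in product(freqs, pols, angles)
    ]


configs = sensor_configurations(frequencies_ghz, polarizations, incidence_angles_deg)
print(len(configs), "configurations to simulate")
```

Each dictionary would then be passed to whatever electromagnetic simulator is available, which is why the grid is kept separate from the simulation itself.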
In particular, the Tor Vergata model, developed by Ferrazzoli et al. [1], has been employed to simulate the radar backscatter coefficients for different signal frequencies and polarizations. It is based on radiative transfer theory applied to discrete dielectric scatterers of simple shapes: cylinders (to model trunks, branches and stalks) and disks (to model leaves). It applies the “matrix doubling” algorithm [2], which models scattering interactions of any order (including attenuation and propagation mechanisms) between the soil and the vegetation cover.
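To give a feel for the doubling idea, here is a generic textbook-style sketch, not the Tor Vergata implementation. It assumes a thin, symmetric layer described by a reflection matrix R and a transmission matrix T over a few angular streams, and it ignores polarization and the soil boundary. The matrices and the function names `double_layer` and `thick_layer` are hypothetical.

```python
import numpy as np


def double_layer(R, T):
    """Combine two identical, symmetric layers into one layer of twice the thickness."""
    identity = np.eye(R.shape[0])
    # (I - R R)^-1 sums the geometric series of repeated bounces between the two layers,
    # which is how interactions of every order are accounted for.
    bounces = np.linalg.inv(identity - R @ R)
    R2 = R + T @ bounces @ R @ T
    T2 = T @ bounces @ T
    return R2, T2


def thick_layer(R, T, n_doublings):
    """Apply the doubling step repeatedly: thickness grows by a factor 2**n_doublings."""
    for _ in range(n_doublings):
        R, T = double_layer(R, T)
    return R, T


# Hypothetical thin-layer matrices for two angular streams (slightly absorbing).
R_thin = np.array([[0.02, 0.01], [0.01, 0.02]])
T_thin = np.array([[0.95, 0.01], [0.01, 0.95]])

R_thick, T_thick = thick_layer(R_thin, T_thin, n_doublings=5)
print("reflection:\n", R_thick)
print("transmission:\n", T_thick)
```

Starting from a layer thin enough that single scattering dominates, a handful of doublings reaches a realistic canopy thickness while keeping the multiple-scattering terms.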
Having been validated against several experimental datasets, the Tor Vergata model has made it possible in this work to simulate a vast amount of reference data with different values of vegetation- and soil-related variables (crop biomass, plant structure and soil moisture/roughness) and sensor configuration variables such as frequency, polarization and incidence angle. The result of those simulations is an extensive dataset (comprising the various soil-vegetation-sensor combinations), which has been used to train different ML models. The aim of this work is to analyse directly the information content of the radar measurements through an extended saliency analysis of the topological links composing the artificial neural networks, in order to extract the most significant input features (i.e., the backscatter simulations at different frequencies) for soil moisture retrieval. In addition, a quality assessment of diverse ML model architectures and hyper-parameter selections is provided to evaluate model performance and the dataset generation procedure.
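The idea of ranking input features by their relevance can be illustrated with a simpler stand-in technique. The sketch below uses permutation feature importance on a linear least-squares surrogate, which is not the network saliency analysis described above. The data are a synthetic toy with an arbitrary made-up relation between three hypothetical input channels and the target; they carry no physical meaning and are not results from this work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic toy data: three hypothetical input channels (stand-ins for simulated backscatter).
n_samples = 2000
X = rng.normal(size=(n_samples, 3))
# Arbitrary toy relation: strong dependence on channel 0, weak on channel 1, none on channel 2.
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=n_samples)

# Linear least-squares surrogate with an intercept term.
design = np.column_stack([X, np.ones(n_samples)])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)


def mse(features):
    """Mean squared error of the fitted surrogate on the given feature matrix."""
    pred = np.column_stack([features, np.ones(len(features))]) @ coef
    return float(np.mean((y - pred) ** 2))


def permutation_importance(features, n_repeats=10):
    """Average increase in MSE when one column at a time is shuffled."""
    baseline = mse(features)
    scores = np.zeros(features.shape[1])
    for j in range(features.shape[1]):
        for _ in range(n_repeats):
            shuffled = features.copy()
            shuffled[:, j] = rng.permutation(shuffled[:, j])
            scores[j] += mse(shuffled) - baseline
    return scores / n_repeats


importance = permutation_importance(X)
ranking = np.argsort(importance)[::-1]  # most important channel first
print("importance per channel:", importance)
print("ranking:", ranking)
```

The same loop works with any trained model in place of the linear surrogate, since it only needs predictions on shuffled inputs.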
Finally, it will also be shown how the information obtained from the feature importance extraction procedure can be applied to actual satellite measurements, by assessing the sensitivity of the different radar wavelengths for each plant height. At the same time, this work intends to demonstrate that ML models can reproduce the expected physical relations in the different study cases, adopting a physics-based approach rather than a “black-box” strategy.


[1] Ferrazzoli, P., Guerriero, L., & Solimini, D. (1991). Numerical model of microwave backscattering and emission from terrain covered with vegetation. Appl. Comput. Electromagn. Soc. J., 6, 175-191.

[2] Bracaglia, M., Ferrazzoli, P., & Guerriero, L. (1995). A fully polarimetric multiple scattering model for crops. Remote Sensing of Environment, 54(3), 170-179.