|Paper title||ESA Agriculture Virtual Laboratory (AVL): an online community open science tool for Earth Observation agricultural scientists|
|Form of presentation||Poster|
In September 2020, ESA launched a new Virtual Lab focusing on Agriculture (AVL). Virtual labs are platform services for scientists to share data resources and create an enhanced research environment. AVL is designed to be an online community open science tool to share results, knowledge and resources. Agriculture scientists can access and share Earth Observation (EO) data, high-level products, in-situ data, as well as open-source code (algorithms, models, tools) to carry out scientific studies and projects.
The technical system behind the AVL comprises two main building blocks: the “Thematic processing subsystem”, powered by TAO (Tool Augmentation by user enhancements and Orchestration), an orchestration and integration framework for remote sensing processing, and the “Exploitation subsystem”, powered by xcube and Sentinel Hub, software for the generation, management, exploitation, and service provisioning of analysis-ready data cubes.
The “Thematic processing subsystem” is a collection of self-contained applications or systems (i.e., packaged in Docker containers) that produce value-added EO products such as biophysical variables, crop masks, and crop types. It integrates commonly used toolboxes (e.g., SNAP, Orfeo Toolbox, GDAL, Sen2-Agri, Sen4CAP) into a single environment, enabling end users to define their own processing workflows and to easily integrate additional processing modules.
The “Exploitation subsystem” ingests data streams including the ones provided by the Thematic processing subsystem and makes them available as analysis-ready data cubes. Data streams may be gridded, like EO sensor data or model data, or feature data, like time series of points or shapes. The latter are stored in geoDB, a database for various data types with geographical context. The “Exploitation subsystem” provides users with individual workspaces and offers different interfaces, specifically the data cube toolbox Cate, a Jupyter Lab environment, and the interface to the thematic processing subsystem.
The AVL system is implemented following an agile approach so that it can accommodate new requirements, particularly those from users in the agriculture science community. With respect to the onboarding of users, the project is structured into three phases. First, a couple of well-defined user stories provide the requirements for the implementation of the first use cases via iterative development cycles. These use cases are executed in partnership with Champion Users, who are leading scientists belonging to the community and/or international stakeholders (JECAM, GEOGLAM, CGIAR, GEWEX, FAO, GEO).
The first use case is about the portability of classification models in space (i.e., from one region to another) and over time (i.e., from one year to another), which would be one of the best options for dealing with the scarcity of in-situ data. Different methodologies exist to transfer classification models: identifying and using invariant features that are valid in both the source and target domains; aligning the time series between the two domains (for instance using time warping); and training the classification model using either (i) data from the source and target domains together or (ii) data from the source domain only, then adapting the model to the target domain by fine-tuning it on the available target training data. These options are evaluated over two test sites, in Belgium and France, and two years, 2019 and 2020, based on Sentinel-1 and Sentinel-2 data and in-situ data from the French and Belgian Land Parcel Identification System datasets. This first use case can then be expanded to more sites, involving the JECAM community.
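To illustrate the time-series alignment option mentioned above, the following is a minimal sketch of classic dynamic time warping (DTW) in plain Python. It is not the method used in the AVL use case, whose implementation is not described here; the function name `dtw_distance` and the NDVI-like profiles are hypothetical and only serve to show how warping can absorb a temporal shift between a source and a target site.

```python
from math import inf


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical vegetation index profiles: the target season starts one step later.
source_profile = [0.2, 0.3, 0.6, 0.8, 0.7, 0.4, 0.2]
target_profile = [0.2, 0.2, 0.3, 0.6, 0.8, 0.7, 0.4]

pointwise = sum(abs(x - y) for x, y in zip(source_profile, target_profile))
warped = dtw_distance(source_profile, target_profile)
print(f"pointwise distance: {pointwise:.2f}, DTW distance: {warped:.2f}")
```

In this toy case the DTW distance is much smaller than the point-by-point distance, because the warping path matches the shifted growth curve rather than comparing observations acquired at the same time step.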
The second use case is about the monitoring of sustainable agricultural practices, supporting the evolution that agriculture needs in order to become more compatible with the expectations of society at large and with the Green Deal ambitions at the European level. Within this use case, crop-specific monitoring at field level is carried out throughout the year to track a selection of sustainable agricultural practices: winter cover crop and biomass indication, harvest/destruction detection, bare soil period detection, and evapotranspiration retrieval as an indicator of water stress.
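As an illustration of what field-level bare soil period detection could look like, the sketch below flags periods in which a vegetation index stays below a threshold. The approach, the function name `bare_soil_periods`, the NDVI threshold, the minimum duration, and the observations are all assumptions made for this example; the detection method actually used in the AVL use case is not specified here.

```python
from datetime import date


def bare_soil_periods(observations, ndvi_threshold=0.25, min_days=20):
    """Return (start, end) date pairs where NDVI stays below the threshold.

    observations: iterable of (date, ndvi) pairs for a single field.
    The threshold and minimum duration are illustrative values only.
    """
    periods = []
    start = last = None
    for day, ndvi in sorted(observations):
        if ndvi < ndvi_threshold:
            if start is None:
                start = day  # first low-NDVI observation opens a period
            last = day
        elif start is not None:
            # Vegetation is back: close the period if it lasted long enough.
            if (last - start).days >= min_days:
                periods.append((start, last))
            start = last = None
    # Close a period that is still open at the end of the series.
    if start is not None and (last - start).days >= min_days:
        periods.append((start, last))
    return periods


# Hypothetical field-level NDVI time series.
field_ndvi = [
    (date(2019, 8, 1), 0.70),
    (date(2019, 8, 15), 0.20),
    (date(2019, 9, 1), 0.15),
    (date(2019, 9, 20), 0.18),
    (date(2019, 10, 10), 0.45),
    (date(2019, 11, 1), 0.22),
    (date(2019, 11, 10), 0.60),
]

periods = bare_soil_periods(field_ndvi)
for start, end in periods:
    print(f"bare soil from {start} to {end}")
```

The minimum duration filters out isolated low values, such as a single cloudy or noisy observation, so that only sustained low-vegetation periods are reported.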
The third use case will be either about the estimation and forecast of crop yield or an inter-comparison exercise of crop maps within the GEOGLAM initiative.
The second development phase will involve Early Adopters as the first external scientists using and testing the AVL. While the advanced science use cases cover hot topics in agriculture science to maximize the impact of the AVL in the community, the Early Adopters' studies will demonstrate that the AVL can be useful for a variety of applications, offering a large and diverse set of input EO and non-EO data and providing a unique and innovative collaborative framework to access, process and visualize these data. Their feedback will support the transition to the third, operational phase, which will open the AVL to the wider scientific community.
One of the keys to the AVL's success will be the data offer: satellite data, in-situ data, thematic products and auxiliary data. A comprehensive user survey was conducted in the first months of the project to identify the users’ priorities, and maximizing the offer of relevant data for the agriculture science community will remain a focus throughout the project. Furthermore, as an Open Science project, the AVL will promote and foster collaboration between scientists and the sharing of data, products, results and source code (joint publications, inter-comparison exercises, benchmarking, etc.). Specific activities will also be carried out to build a strong AVL user community and facilitate Open Science, such as regular webinars, a dedicated forum, and the organization of hackathons or competitions.