|Paper title||Supporting data-intensive algorithm development approaches through a globally representative hyperspectral in situ dataset from inland and coastal waters: A community-initiative|
|Form of presentation||Poster|
Large and globally representative in situ datasets are critical for the development of globally validated bio-optical algorithms to support comprehensive water quality monitoring and change detection using satellite Earth observation technologies. Such datasets are particularly scarce and geographically fragmented for inland and coastal waters. This is at odds with the importance of these waters for supporting human livelihoods, biodiversity, and cultural and recreational values. These shortcomings create two challenges. The first and major challenge is to collate the existing datasets and assess their compatibility with respect to the methodologies used and the quality control procedures applied. The second challenge is to identify biases and gaps in the global dataset, in order to better direct future data collection efforts.
Our ongoing effort is to improve the availability of such datasets by providing open access to a large global collection of hyperspectral remote sensing reflectance spectra and concurrently measured Secchi depth, chlorophyll-a (Chla), total suspended solids (TSS), and absorption by colored dissolved organic matter (a_cdom). This dataset represents an expansion of data originally collated for a collaborative NASA-ESA-led exercise to assess the performance of atmospheric correction processors over inland and coastal waters (ACIX-Aqua). Its suitability for the development of globally applicable algorithms has been demonstrated by its use for developing novel approaches for the retrieval of Chla and TSS concentrations from a range of satellite sensors.
Our dataset contains relevant entries from the commonly used SeaWiFS Bio-optical Archive and Storage System (SeaBASS) and Lake Bio-optical Measurements and Matchup Data for Remote Sensing (LIMNADES) data archives and, in return, contributes thousands of new entries to these and other repositories. It encompasses data from inland and coastal waters distributed across five continents and a comprehensive range of optical water types. Our accompanying biogeographical data analysis contributes to a value-added dataset to aid in the identification of underrepresented geographical locations and optical water types, useful for targeting future data collection efforts.
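As a rough illustration of how underrepresented categories might be flagged, the minimal sketch below counts records per category and reports those whose share falls below a threshold. The field name `water_type`, the category labels, the record counts, and the 5% threshold are all hypothetical and are not taken from our dataset or analysis.

```python
from collections import Counter


def underrepresented(records, key, categories, min_share=0.05):
    """Return the categories whose share of records is below min_share.

    Categories with no records at all are also reported, since Counter
    returns 0 for missing keys.
    """
    if not records:
        return list(categories)
    counts = Counter(r[key] for r in records)
    total = len(records)
    return [c for c in categories if counts[c] / total < min_share]


# Hypothetical records; labels and counts are illustrative only.
records = (
    [{"water_type": "clear"}] * 60
    + [{"water_type": "turbid"}] * 30
    + [{"water_type": "eutrophic"}] * 8
    + [{"water_type": "humic"}] * 2
)
categories = ["clear", "turbid", "eutrophic", "humic", "hypereutrophic"]

print(underrepresented(records, "water_type", categories))
```

The same function could be applied to a geographical field (for example, a hypothetical `continent` key) to list regions that future sampling campaigns might prioritize.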
To make the dataset easy to use and to support uncertainty analysis and algorithm development, we included metadata covering the viewing geometry and environmental conditions, together with hundreds of matched scene IDs for a number of multispectral satellite sensors (e.g., roughly 450 clear-sky match-ups for Landsat 8’s Operational Land Imager (OLI)). These match-ups make it easier to validate algorithm performance in practical applications.
In curating this dataset, we had to overcome considerable challenges. Some were technical, such as the variable measurement ranges of instruments; others arose because the data originated from a community initiative of multinational researchers working on projects with a diverse range of objectives. Substantial data harmonization efforts were needed to align different instrumentation, field methodologies, and processing routines.
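One common way to handle differing spectral ranges, shown here only as a sketch and not necessarily the procedure applied to this dataset, is to linearly interpolate each spectrum onto a shared wavelength grid and leave grid points outside the instrument's measured range empty. The function name `resample`, the wavelengths, and the values (in arbitrary units) are hypothetical.

```python
from bisect import bisect_right


def resample(wavelengths, values, grid):
    """Linearly interpolate a spectrum onto a common wavelength grid.

    wavelengths must be sorted in ascending order. Grid points outside
    the measured range are returned as None rather than extrapolated.
    """
    out = []
    for w in grid:
        if w < wavelengths[0] or w > wavelengths[-1]:
            out.append(None)
            continue
        # Index of the last measured wavelength that is <= w.
        i = bisect_right(wavelengths, w) - 1
        if i == len(wavelengths) - 1:
            out.append(values[i])
            continue
        w0, w1 = wavelengths[i], wavelengths[i + 1]
        t = (w - w0) / (w1 - w0)
        out.append(values[i] + t * (values[i + 1] - values[i]))
    return out


# Hypothetical spectrum measured at 400-420 nm, arbitrary units.
measured_nm = [400, 410, 420]
measured = [2.0, 4.0, 8.0]
common_grid = [395, 400, 405, 415, 420, 425]

print(resample(measured_nm, measured, common_grid))
```

Returning `None` outside the measured range, instead of extrapolating, keeps gaps caused by instrument limits visible to downstream users.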
We conclude that our effort was a worthwhile undertaking, as demonstrated by a series of novel contributions and the publication of eight peer-reviewed research articles (at the time of writing). We expect that open access to this dataset will support the development of increasingly data-intensive algorithms for the retrieval of water quality indicators, including those for next-generation hyperspectral satellite sensors, e.g., sensors from the upcoming Surface Biology and Geology (SBG), Environmental Mapping and Analysis Program (EnMAP), PRecursore IperSpettrale della Missione Applicativa (PRISMA) Second Generation (PSG), Copernicus Hyperspectral Imaging Mission for the Environment (CHIME), and FLuorescence EXplorer (FLEX) missions. We believe that this will stimulate the discussion of a framework for the future collection of fiducial reference data towards global representativeness.