Approximately 66% of the global surface is covered by clouds [Wilson & Jetz 2016]. While some areas are very rarely covered by clouds, others show cloud occurrence above 75%, which reduces the opportunity for optical sensors to measure a clear surface signal. Thus, an accurate cloud screening algorithm is needed for most downstream applications. To provide reliable and consistent cloud masking results, an independent validation source that fulfills a set of requirements is needed. While a few studies have quantitatively inter-compared some of the state-of-the-art cloud detection methods using reference datasets, until very recently no study had compared the reference datasets themselves.
Only a few reference datasets are available for cloud mask validation. Before the Cloud Masking Intercomparison eXercise (CMIX), conducted within the Committee on Earth Observation Satellites (CEOS) Working Group on Calibration & Validation (WGCV), no independent analysis of the quality and usability of Sentinel-2 and Landsat 8 reference datasets [Baetens & Hagolle 2018, Hollstein et al. 2016, Paperin et al. 2021a, Paperin et al. 2021b, Skakun et al. 2020, U.S. Geological Survey 2016] had been made, nor had the datasets been compared with each other. Results from CMIX revealed that all datasets have shortcomings. One major shortcoming of all reference datasets is the manual, and thus to a certain extent subjective, interaction needed to generate the cloud masks, which potentially introduces a bias and leads to reference datasets with very limited temporal coverage.
In 2020, Skakun et al. showed the usefulness of sky images from ground-based cameras for satellite-based cloud mask validation and presented an inexpensive approach for generating such data using a Raspberry Pi-based system. While that approach still relied on manual interaction, the work presented here aims to develop automated procedures for the generation of reference datasets. Within the ESA Quality Assurance Framework for Earth Observation (QA4EO), and in cooperation with University of Maryland/NASA, a stereo pair of Raspberry Pi-based sky cameras was installed at La Sapienza University in Rome. In addition, a new ceilometer, the RAP (Raymetrics Aerosol Profiler), was installed within the QA4EO framework to validate the sky camera-based cloud heights. Together with an additional pair of cameras placed at the Goddard Space Flight Center in Greenbelt, MD, USA, the Rome site is used as a testbed to develop algorithms for automated reference dataset generation. The goal of the work presented here was to analyze the general requirements for a reference dataset for the validation of satellite-based cloud masks, to evaluate the suitability of the sky camera approach, and, if necessary, to propose modifications to improve the measurement setup.
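To illustrate how a stereo pair of sky cameras can yield a cloud height that is comparable to a ceilometer reading, the following is a minimal sketch of ray triangulation. It is not the procedure used at the Rome or Greenbelt sites. The function names, the local east/north/up coordinate frame, and the convention that azimuth is measured clockwise from north are all assumptions made for this example, and it presumes that the same cloud feature has already been matched in both images.

```python
import math

def direction(zenith_rad, azimuth_rad):
    """Unit vector (east, north, up) for a viewing direction.

    Assumption: azimuth is measured clockwise from north.
    """
    s = math.sin(zenith_rad)
    return (s * math.sin(azimuth_rad), s * math.cos(azimuth_rad), math.cos(zenith_rad))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate(p1, d1, p2, d2):
    """Midpoint of the shortest segment between two viewing rays.

    p1, p2: camera positions (east, north, up) in metres.
    d1, d2: viewing directions toward the same cloud feature.
    Returns the estimated feature position, or None if the rays are
    (nearly) parallel and no stable solution exists.
    """
    w = tuple(a - b for a, b in zip(p1, p2))
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        return None
    # Ray parameters that minimise the distance between the two rays.
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    q1 = tuple(p + t1 * v for p, v in zip(p1, d1))
    q2 = tuple(p + t2 * v for p, v in zip(p2, d2))
    return tuple((x + y) / 2 for x, y in zip(q1, q2))
```

The third component of the returned position is the estimated height of the feature above the cameras' reference level, which is the quantity one would compare with a ceilometer measurement. In practice the accuracy depends on the camera baseline, the angular calibration, and the quality of the feature matching, none of which are modelled here.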
The strength of the tested sky camera approach is its nearly continuous measurement at very short time intervals, which allows the data to be used as a validation source for a great number of different optical satellite sensors. However, some challenges remain in the measurement setup, such as the comparison of clouds observed from different viewing points and the appropriate matching of satellite and camera images in a way that accounts for the effects of lens distortion. While the approach is still under development, if proven robust, the inexpensive setup would allow for an expansion towards a global network of sky cameras, potentially providing a unique multi-temporal, near real-time validation source.
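The matching problem mentioned above can be made concrete with a small sketch that maps a sky camera pixel to a viewing direction and then to a horizontal position at a given cloud height. This is an illustration under strong assumptions, not the method under development: it assumes an ideal equidistant fisheye projection (image radius proportional to zenith angle), an image oriented with its top toward north and its right toward east, a flat local surface, and a known cloud height. The function names and the `pixels_per_radian` parameter are hypothetical, and a real lens would need a calibrated distortion model.

```python
import math

def pixel_to_angles(u, v, cx, cy, pixels_per_radian):
    """Map an image pixel to (zenith, azimuth) in radians.

    Assumes an ideal equidistant fisheye (radius = pixels_per_radian * zenith)
    centred on (cx, cy), with image top = north and image right = east.
    """
    dx, dy = u - cx, cy - v  # dy is positive toward the image top
    zenith = math.hypot(dx, dy) / pixels_per_radian
    azimuth = math.atan2(dx, dy) % (2 * math.pi)  # clockwise from north
    return zenith, azimuth

def cloud_ground_offset(zenith, azimuth, cloud_height_m):
    """Horizontal (east, north) offset in metres from the camera to the
    point below a cloud seen in the given direction at the given height."""
    if zenith >= math.pi / 2:
        raise ValueError("direction at or below the horizon")
    horizontal = cloud_height_m * math.tan(zenith)
    return horizontal * math.sin(azimuth), horizontal * math.cos(azimuth)
```

Because the horizontal offset grows with the tangent of the zenith angle, small errors in the lens model or in the assumed cloud height translate into large position errors near the horizon, which is one reason the lens distortion has to be handled carefully when camera and satellite pixels are matched.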
[Hollstein et al. 2016] Hollstein, André, Karl Segl, Luis Guanter, Maximilian Brell, and Marta Enesco. 2016. "Ready-to-Use Methods for the Detection of Clouds, Cirrus, Snow, Shadow, Water and Clear Sky Pixels in Sentinel-2 MSI Images." Remote Sensing 8, no. 8: 666. https://doi.org/10.3390/rs8080666
[Baetens & Hagolle 2018] Baetens, Louis, & Hagolle, Olivier. (2018). Sentinel-2 reference cloud masks generated by an active learning method [Data set]. Zenodo. https://doi.org/10.5281/zenodo.1460961
[Paperin et al. 2021a] Paperin, Michael, Stelzer, Kerstin, Lebreton, Carole, Brockmann, Carsten, & Wevers, Jan. (2021). PixBox Landsat 8 pixel collection for CMIX (Version 1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.5040271
[Paperin et al. 2021b] Paperin, Michael, Wevers, Jan, Stelzer, Kerstin, & Brockmann, Carsten. (2021). PixBox Sentinel-2 pixel collection for CMIX (Version 1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.5036991
[Skakun et al. 2020] Skakun, Sergii; Vermote, Eric; Santamaria Artigas, Andres Eduardo; Rountree, William; Roger, Jean-Claude (2020), "Data for: An experimental sky-image-derived cloud validation dataset for Sentinel-2 and Landsat 8 satellites over NASA GSFC", Mendeley Data, V1, doi: 10.17632/r7tnvx7d9g.1
[Skakun et al. 2021] Skakun, S., Vermote, E.F., Artigas, A.E.S., Rountree, W.H., Roger, J.-C., 2021. An experimental sky-image-derived cloud validation dataset for Sentinel-2 and Landsat 8 satellites over NASA GSFC. International Journal of Applied Earth Observation and Geoinformation 95, 102253.
[U.S. Geological Survey 2016] U.S. Geological Survey, 2016. L8 Biome Cloud Validation Masks. U.S. Geological Survey, data release. doi:10.5066/F7251GDH.
[Wilson & Jetz 2016] Wilson AM, Jetz W (2016) Remotely Sensed High-Resolution Global Cloud Dynamics for Predicting Ecosystem and Biodiversity Distributions. PLoS Biol 14(3): e1002415. doi:10.1371/journal.pbio.1002415. Data available on-line at http://www.earthenv.org/.