Detailed paper information

Paper title: The CoMet toolkit – Uncertainties made easy
Authors
  1. Pieter De Vis, National Physical Laboratory (Speaker)
  2. Samuel E. Hunt, National Physical Laboratory
Form of presentation: Poster
Topics
  • Open Earth Forum
    • C5.03 Open Source, data science and toolboxes in EO: Current status & evolution
Abstract text
Environmental observations from satellites and in-situ measurement networks are core to understanding climate change. Such datasets need uncertainty information associated with them to ensure their credible and reliable interpretation. However, this uncertainty information can be rather complex, with many sources of error affecting the final products. Often, multiple measurements are combined throughout the processing chain (e.g. when performing temporal or spatial averages). In such cases, it is key to understand the error covariances in the data (e.g., random uncertainties do not combine in the same way as systematic uncertainties). This is where approaches from metrology (the science of measurement) can assist the Earth observation (EO) community in developing a quantitative characterisation of uncertainty in EO data. There have been numerous projects aimed at developing (e.g. QA4ECV, FIDUCEO, GAIA-CLIM, QA4EO, MetEOC, EDAP) and applying (e.g. FRM4VEG, FRM4OC, FDR4ALT, FDR4ATMOS) a metrological framework for EO data.
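To make the covariance point concrete, the minimal NumPy sketch below (illustrative only, not part of CoMet) averages N = 10 measurements that share a standard uncertainty u: fully random errors average down to u/sqrt(N), while a fully systematic error, common to all N measurements, is untouched by averaging.

  import numpy as np

  rng = np.random.default_rng(0)
  trials, N, u = 100_000, 10, 1.0

  # Independent (random) error for each of the N measurements, per trial.
  random_err = rng.normal(0.0, u, size=(trials, N))
  # One shared (systematic) error per trial, common to all N measurements.
  systematic_err = rng.normal(0.0, u, size=trials)

  print(random_err.mean(axis=1).std())  # ~ u / sqrt(N) = 0.32
  print(systematic_err.std())           # mean of N identical errors: still ~ u = 1.0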

Presented here is the CoMet toolkit ("Community tools for Metrology"), which has been developed to enable easy handling and processing of dataset error-covariance information. The toolkit aims to abstract away some of the complexity of dealing with covariance information. This lowers the barrier for newcomers and, at the same time, allows more efficient analysis by experts (as the core uncertainty propagation does not have to be reimplemented every time). The CoMet toolkit currently consists of two Python modules, which are described in detail below.

The first module, obsarray, extends the widely used xarray package to interface with measurement error-covariance information encoded in datasets. Although storing full error-covariance matrices for large observation datasets is not practical, these matrices are often structured enough to allow a simple parameterisation. obsarray uses a parameterisation of the error-covariance information, first developed in the FIDUCEO project, that is stored as attributes of the uncertainty variables. In this way, datasets can be written and read with this information preserved.
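As a flavour of what this looks like, the sketch below attaches a FIDUCEO-style error-correlation parameterisation to an uncertainty variable using plain xarray. The attribute names (err_corr_1_dim, err_corr_1_form) follow the pattern used in the obsarray documentation but are quoted here from memory; consult the obsarray documentation for the exact convention, and note that obsarray itself provides templates that build these attributes for you.

  import numpy as np
  import xarray as xr

  temperature = xr.DataArray(np.array([281.2, 281.9, 282.4]), dims=["time"])

  # Random (uncorrelated-in-time) uncertainty component on temperature.
  # The "random" form needs no extra parameters, so none are given here.
  u_ran = xr.DataArray(
      np.array([0.3, 0.3, 0.3]),
      dims=["time"],
      attrs={
          "err_corr_1_dim": "time",     # dimension the parameterisation applies to
          "err_corr_1_form": "random",  # fully uncorrelated between times
      },
  )

  ds = xr.Dataset({"temperature": temperature, "u_ran_temperature": u_ran})
  ds.to_netcdf("temps_with_unc.nc")  # the attributes survive the write/read round trip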

Once this information is captured, the uncertainties can be propagated from the input quantities to uncertainties on the measurand (the processed data) using standard metrological approaches. The second CoMet Python module, punpy (standing for 'Propagating Uncertainties in Python'), aims to make this simple for users. punpy allows users to propagate obsarray dataset uncertainties through any given measurement function, using either the Monte Carlo (MC) method or the law of propagation of uncertainty, as defined in the Guide to the Expression of Uncertainty in Measurement (GUM). In this way, dataset uncertainties can be propagated through any measurement function that can be written as a Python function – including simple analytical measurement functions as well as full numerical processing chains (which might, for example, include external radiative transfer simulations), as long as these can be wrapped inside a Python function. Both methods have been validated against analytical calculations as well as against other tools such as the NIST Uncertainty Machine.
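The pattern below is a minimal sketch of MC propagation with punpy, following the examples in its documentation; the measurement function, values and uncertainties are invented for illustration.

  import numpy as np
  import punpy

  def measurement_function(counts, gain):
      # Toy measurement function: radiance from raw counts and a gain.
      return gain * counts

  counts = np.array([340.0, 350.0, 345.0])
  gain = np.full(3, 0.12)
  u_counts = 0.01 * counts  # 1% random uncertainty on the counts
  u_gain = 0.005 * gain     # 0.5% random uncertainty on the gain

  prop = punpy.MCPropagation(10000)  # 10 000 Monte Carlo samples
  u_radiance = prop.propagate_random(
      measurement_function, [counts, gain], [u_counts, u_gain]
  )
  print(u_radiance)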

punpy and obsarray have been designed to interface with each other: all of the uncertainty information in obsarray products can be automatically parsed and passed to punpy. A typical approach is to propagate the random uncertainties (potentially several components combined), the systematic uncertainties and the structured uncertainties separately, and to return an obsarray dataset that contains the measurand together with its uncertainties and covariance information. Jupyter notebooks with tutorials are available. In summary, by combining these tools, handling uncertainties and covariance information becomes as straightforward as possible, without losing flexibility.
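A hedged sketch of that combined workflow, again with illustrative names and values: the random and systematic components are propagated separately (inputs without a given component get zero uncertainty) and the results are assembled into an xarray dataset alongside the measurand. In practice obsarray can parse the components from the input dataset's attributes, so less of this needs to be written by hand.

  import numpy as np
  import xarray as xr
  import punpy

  def measurement_function(counts, gain):
      return gain * counts  # toy measurement function, as above

  counts = np.array([340.0, 350.0, 345.0])
  gain = np.full(3, 0.12)
  prop = punpy.MCPropagation(10000)

  # Random component: only the counts carry random uncertainty here.
  u_ran_y = prop.propagate_random(
      measurement_function, [counts, gain], [0.01 * counts, np.zeros_like(gain)]
  )
  # Systematic component: only the gain carries systematic uncertainty here.
  u_sys_y = prop.propagate_systematic(
      measurement_function, [counts, gain], [np.zeros_like(counts), 0.005 * gain]
  )

  # Measurand plus its separated uncertainty components, ready for obsarray-style
  # covariance attributes to be attached.
  out = xr.Dataset(
      {
          "radiance": ("time", measurement_function(counts, gain)),
          "u_ran_radiance": ("time", u_ran_y),
          "u_sys_radiance": ("time", u_sys_y),
      }
  )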