Day 4

Detailed paper information

Paper title Big Earth Observation Data Analysis using Satellite Image Time Series
  1. Gilberto Camara National Institute for Space Research (INPE) Speaker
  2. Rolf Simoes INPE (Brazilian National Institute for Space Research)
  3. Felipe Souza INPE (Brazilian National Institute for Space Research)
  4. Alber Sanchez INPE (Brazilian National Institute for Space Research)
  5. Karine Reis Ferreira Brazilian National Institute for Space Research - INPE -Brazil
Form of presentation Poster
  • C1. AI and Data Analytics
    • C1.04 AI4EO applications for Land and Water
Abstract text The emergence of cloud computing services capable of storing and processing big EO data sets allows researchers to develop innovative methods for extracting information. One of the relevant trends is to work with satellite image time series, which are calibrated and comparable measures of the same location on Earth at different times. When associated with frequent revisits, image time series can capture significant land use and land cover changes. For this reason, developing methods to analyse image time series has become a relevant research area in remote sensing.

Given this motivation, the authors have developed *sits*, an open-source R package for satellite image time series analysis using machine learning. The package incorporates new developments in image catalogues for cloud computing services. It also includes deep learning algorithms for image time series analysis published in recent papers. It has innovative methods for quality control of training data. Parallel processing methods specific to data cubes ensure efficient performance. The package provides functionalities beyond existing software for working with big EO data.

The design of the *sits* package considers the typical workflow for land classification using satellite image time series. Users define a data cube by selecting a subset of an analysis-ready data image collection. They obtain the training data from a set of points in the data cube whose labels are known. After performing quality control on the training samples, users build a machine learning model and use it to classify the entire data cube. The results go through a spatial smoothing phase that removes outliers. Thus, *sits* supports the entire cycle of land use and land cover classification.
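The workflow above can be illustrated with a toy sketch in Python. It does not use the *sits* API (the package itself is written in R); the function names, the tiny in-memory "data cube", and the nearest-centroid model are all assumptions chosen for brevity, and the quality control and smoothing steps are left out.

```python
# Toy illustration of the workflow: cube -> samples -> model -> classified map.
# This is NOT the sits API; every name below is hypothetical.
from statistics import fmean

# A tiny "data cube": cube[row][col] is the time series of one pixel (4 dates).
cube = [
    [[0.2, 0.3, 0.8, 0.7], [0.2, 0.2, 0.7, 0.8]],
    [[0.5, 0.5, 0.5, 0.4], [0.6, 0.5, 0.5, 0.5]],
]

# Training points: (row, col, label) for locations whose class is known.
points = [(0, 0, "crop"), (1, 1, "pasture")]


def extract_samples(cube, points):
    """Pair the time series at each labelled location with its label."""
    return [(cube[r][c], label) for r, c, label in points]


def train_centroids(samples):
    """Stand-in model: the mean time series of each class."""
    by_class = {}
    for series, label in samples:
        by_class.setdefault(label, []).append(series)
    return {
        label: [fmean(values) for values in zip(*group)]
        for label, group in by_class.items()
    }


def classify_pixel(series, centroids):
    """Assign the class whose centroid is closest in squared distance."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(series, centroids[label]))
    return min(centroids, key=dist)


model = train_centroids(extract_samples(cube, points))
label_map = [[classify_pixel(px, model) for px in row] for row in cube]
print(label_map)
```

In this toy case the top row is labelled "crop" and the bottom row "pasture". A real model would be trained on many samples per class.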

Using the STAC standard, *sits* supports the creation of data cubes from collections available in the following cloud services: (a) Sentinel-2 and Landsat-8 from Microsoft Planetary Computer; (b) Sentinel-2 images from Amazon Web Services; (c) Sentinel-2, Landsat-8, and CBERS-4 images from the Brazil Data Cube (BDC); (d) Landsat-8 and Sentinel-2 collections from Digital Earth Africa; (e) Landsat-5/7/8 collections from USGS.
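To give a feel for what selecting a subset of a collection through STAC involves, the sketch below builds the JSON body of a STAC API item search request in Python. It is not how *sits* is implemented. The field names (`collections`, `bbox`, `datetime`, `limit`) come from the STAC API item search specification, while the collection identifier and the bounding box are assumptions for illustration. No network request is made.

```python
import json


def build_stac_search(collection, bbox, start, end, limit=100):
    """Build the JSON body of a STAC API item search request."""
    if len(bbox) != 4:
        raise ValueError("bbox must be (west, south, east, north)")
    return {
        "collections": [collection],
        "bbox": list(bbox),
        # Closed interval written as two RFC 3339 timestamps separated by '/'.
        "datetime": f"{start}T00:00:00Z/{end}T23:59:59Z",
        "limit": limit,
    }


body = build_stac_search(
    collection="sentinel-2-l2a",          # assumed collection id
    bbox=(-48.0, -16.0, -47.0, -15.0),    # hypothetical area of interest
    start="2017-09-01",
    end="2018-08-31",
)
print(json.dumps(body, indent=2))
```

The items returned by such a search reference the image files for each date. Assembling them into a regular data cube is a separate step.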

The package provides support for the classification of time series, preserving the full temporal resolution of the input data. It supports two kinds of machine learning methods. The first group of methods does not explicitly consider spatial or temporal dimensions; these models treat each time series as a vector in a high-dimensional feature space. From this class of models, *sits* includes random forests, support vector machines, extreme gradient boosting [1], and multi-layer perceptrons.
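A small Python sketch may make the "vector in a feature space" view concrete. The pixel values are made up and the code is not taken from *sits*. It flattens the per-band time series of two pixels into feature vectors and shows that a distance-based comparison is unchanged when the features are reordered consistently, which is what it means for a model to ignore temporal order.

```python
import math
import random


def flatten(series_by_band):
    """Concatenate the per-band time series into one feature vector."""
    return [value for band in series_by_band for value in band]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


# Two pixels, each with 2 bands observed at 4 dates (made-up values).
pixel_a = [[0.2, 0.3, 0.8, 0.7], [0.1, 0.1, 0.4, 0.3]]
pixel_b = [[0.5, 0.5, 0.5, 0.4], [0.3, 0.3, 0.2, 0.3]]

vec_a, vec_b = flatten(pixel_a), flatten(pixel_b)  # 2 bands x 4 dates = 8 features

# Shuffle the feature positions identically for both vectors.
order = list(range(len(vec_a)))
random.Random(0).shuffle(order)
shuffled_a = [vec_a[i] for i in order]
shuffled_b = [vec_b[i] for i in order]

d_original = squared_distance(vec_a, vec_b)
d_shuffled = squared_distance(shuffled_a, shuffled_b)

# Equal up to floating-point rounding: the ordering carries no information here.
same = math.isclose(d_original, d_shuffled)
print(len(vec_a), same)
```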

The second group of models comprises deep learning methods designed to work with image time series. Temporal relations between observed values in a time series are taken into account. The *sits* package supports a set of 1D-CNN algorithms: TempCNN [2], ResNet [3], and InceptionTime [4]. Models based on 1D-CNN treat each band of an image time series separately. The order of the samples in the time series is relevant for the classifier. Each layer of the network applies a convolution filter to the output of the previous layer. This cascade of convolutions captures time series features in different time scales [2]. The authors have used these methods with success for classifying large areas [5, 6, 7].
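The cascade of convolutions can be sketched in a few lines of plain Python. This is a simplified illustration, not the TempCNN architecture or the *sits* implementation: the filter is hand-picked rather than learned, the band names and values are invented, and each band is filtered on its own, following the description above.

```python
def conv1d(series, kernel):
    """'Valid' 1D convolution in the cross-correlation form used by CNNs."""
    k = len(kernel)
    return [
        sum(series[i + j] * kernel[j] for j in range(k))
        for i in range(len(series) - k + 1)
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def receptive_field(num_layers, kernel_size):
    """Input time steps seen by one output value after stacked stride-1 layers."""
    return 1 + num_layers * (kernel_size - 1)


# One pixel: 2 bands, 8 dates each (made-up values).
bands = {
    "ndvi": [0.2, 0.2, 0.3, 0.6, 0.8, 0.7, 0.4, 0.2],
    "nir": [0.3, 0.3, 0.4, 0.5, 0.6, 0.6, 0.4, 0.3],
}
kernel = [-1.0, 0.0, 1.0]  # hand-picked difference filter; a CNN would learn it

features = {}
for name, series in bands.items():
    layer1 = relu(conv1d(series, kernel))  # 8 values -> 6 values
    layer2 = relu(conv1d(layer1, kernel))  # 6 values -> 4 values
    features[name] = layer2

print(receptive_field(1, 3), receptive_field(2, 3))
```

With a kernel of size 3, one layer sees 3 consecutive dates and two stacked layers see 5, so deeper layers respond to patterns over longer time spans. Because the filter slides along the time axis, reordering the observations changes the output, unlike the vector-based models.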

As an example of our claim that *sits* can be used for land use and land cover change mapping, the paper by Simoes et al. [7] describes an application of *sits* to produce a one-year land use and cover classification of the Cerrado biome in Brazil using Landsat-8 images. The Cerrado is the second largest biome in Brazil, covering 1.9 million km². It is a tropical savanna ecoregion with a rich ecosystem ranging from grasslands to woodlands. The Brazilian Cerrado is covered by 51 Landsat-8 tiles available in the Brazil Data Cube (BDC) [8]. The one-year classification period ranges from September 2017 to August 2018, following the agricultural calendar. The temporal interval is 16 days, resulting in 24 images per tile. The total input data size is about 8 TB. Training data consisted of 48,850 samples divided into 14 classes. The data set was used to train a TempCNN model [2]. After the classification, we applied Bayesian smoothing to the probability maps and then generated a labelled map by selecting the most likely class for each pixel. The classification was executed on an Ubuntu server with 24 cores and 128 GB of memory. Each Landsat-8 tile was classified in an average of 30 min, and the total classification took about 24 h. The overall accuracy of the classification was 0.86.
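The last two post-processing steps, smoothing the class probabilities and then taking the most likely class per pixel, can be illustrated with a toy Python sketch. The smoothing below is a plain 3x3 neighbourhood average, a deliberately simple stand-in and not the Bayesian smoothing used in the study. The class names and probability values are invented.

```python
def smooth_probabilities(probs):
    """Average each pixel's class probabilities over its 3x3 neighbourhood.

    Simple stand-in for the Bayesian smoothing mentioned in the text.
    """
    rows, cols = len(probs), len(probs[0])
    n_classes = len(probs[0][0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            neigh = [
                probs[i][j]
                for i in range(max(0, r - 1), min(rows, r + 2))
                for j in range(max(0, c - 1), min(cols, c + 2))
            ]
            out_row.append(
                [sum(p[k] for p in neigh) / len(neigh) for k in range(n_classes)]
            )
        out.append(out_row)
    return out


def label_map(probs, classes):
    """Pick the most likely class for every pixel."""
    return [
        [classes[max(range(len(classes)), key=px.__getitem__)] for px in row]
        for row in probs
    ]


classes = ["forest", "pasture"]
# 3x3 probability map: the centre pixel is an isolated "pasture" outlier.
forest, outlier = [0.8, 0.2], [0.4, 0.6]
probs = [
    [forest, forest, forest],
    [forest, outlier, forest],
    [forest, forest, forest],
]

raw = label_map(probs, classes)
smoothed = label_map(smooth_probabilities(probs), classes)
print(raw[1][1], smoothed[1][1])
```

Without smoothing the centre pixel is labelled "pasture". After averaging with its neighbours it is labelled "forest", which is the kind of isolated outlier the smoothing phase is meant to remove.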

The *sits* API provides a simple and powerful environment for land classification. Processing and handling large image collections does not require knowledge of parallel programming tools. The package provides support for deep learning models that have been tested and validated in the scientific literature and are not available in environments such as Google Earth Engine. The package is therefore an innovative contribution to big Earth observation data analysis.
The package is available on GitHub. The software is licensed under the GNU General Public License v2.0. Full documentation of the package is available online.

[1] T. Chen and C. Guestrin, “XGBoost: A Scalable Tree Boosting System,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, (New York, NY, USA), pp. 785–794, Association for Computing Machinery, 2016.
[2] C. Pelletier, G. I. Webb, and F. Petitjean, “Temporal Convolutional Neural Network for the Classification of Satellite Image Time Series,” Remote Sensing, vol. 11, no. 5, 2019.
[3] H. I. Fawaz, G. Forestier, J. Weber, L. Idoumghar, and P.-A. Muller, “Deep learning for time series classification: A review,” Data Mining and Knowledge Discovery, vol. 33, no. 4, pp. 917–963, 2019.
[4] H. I. Fawaz, B. Lucas, G. Forestier, C. Pelletier, D. F. Schmidt, J. Weber, G. I. Webb, L. Idoumghar, P.-A. Muller, and F. Petitjean, “InceptionTime: Finding AlexNet for time series classification,” Data Mining and Knowledge Discovery, vol. 34, no. 6, pp. 1936–1962, 2020.
[5] M. Picoli, G. Camara, I. Sanches, R. Simoes, A. Carvalho, A. Maciel, A. Coutinho, J. Esquerdo, J. Antunes, R. A. Begotti, D. Arvor, and C. Almeida, “Big earth observation time series analysis for monitoring Brazilian agriculture,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 145, pp. 328–339, 2018.
[6] M. C. A. Picoli, R. Simoes, M. Chaves, L. A. Santos, A. Sanchez, A. Soares, I. D. Sanches, K. R. Ferreira, and G. R. Queiroz, “CBERS data cube: A powerful technology for mapping and monitoring Brazilian biomes,” in ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. V-3-2020, pp. 533–539, Copernicus GmbH, 2020.
[7] R. Simoes, G. Camara, G. Queiroz, F. Souza, P. R. Andrade, L. Santos, A. Carvalho, and K. Ferreira, “Satellite Image Time Series Analysis for Big Earth Observation Data,” Remote Sensing, vol. 13, no. 13, p. 2428, 2021.
[8] K. Ferreira, G. Queiroz, G. Camara, R. Souza, L. Vinhas, R. Marujo, R. Simoes, C. Noronha, R. Costa, J. Arcanjo, V. Gomes, and M. Zaglia, “Using Remote Sensing Images and Cloud Services on AWS to Improve Land Use and Cover Monitoring,” in LAGIRS 2020: 2020 Latin American GRSS & ISPRS Remote Sensing Conference, (Santiago, Chile), 2020.