Generating a Training Dataset for Land Cover Classification to Advance Global Development

by Yoni Nachmany, et al.

Semantic segmentation of land cover classes is fundamental for agricultural and economic development work, from sustainable forestry to urban planning, yet existing training datasets have significant limitations. To generate an open and comprehensive training library of high resolution Earth imagery and high quality land cover classifications, public Sentinel-2 data at 10 m spatial resolution was matched with accurate GlobeLand30 labels from 2010, which were filtered by agreement with an intermediary Sentinel-2 classification at 20 m produced during atmospheric correction. Scene-level classifications were predicted by Random Forests trained on valid reflectance data and the filtered labels, and achieved over 80% accuracy. Further work is required to aggregate individual scene classifications for annual labels and to test the approach in more locations, before crowdsourcing human validation. The goal is to create a sustained community-wide effort to generate image labels not only for land cover, but also very specific images for major agricultural crops across the world and other thematic categories of interest to the global development community.



1 Introduction

Advances in sensor technology, cloud computing, and machine learning (ML) continue to converge to accelerate innovation in the field of Earth observation (EO). In recent years, the commercial sector has made significant advances in developing ML-based algorithms that extract intelligence from satellite imagery on agricultural productivity, oil storage, urban structures, and maritime monitoring. These successful efforts underscore the enormous potential of using ML to solve global development and humanitarian challenges. However, fundamental tools and technologies still need to be developed to drive further breakthroughs and to ensure that the Global Development Community (GDC) reaps the same benefits that the commercial marketplace is experiencing.

Radiant Earth Foundation, a non-profit organization with a mission to improve discovery, access, delivery, and application of open geospatial resources in support of the GDC, proposes to advance critical insights in support of global development and humanitarian response by integrating and exploiting the latest in satellite data analytics and information technology. Radiant Earth Foundation is developing open source datasets of labeled satellite images, which will be hosted on MLHub.Earth with a Creative Commons license. These datasets will lead to a living open image library for ML and EO. Our goal is to create a sustained, community-wide effort to generate image labels that will enable major innovations and drive new, more targeted and timely insights supporting progress in areas such as agriculture, food security, conservation, health, land rights, urban planning, water resources, and other areas relevant to global development and humanitarian response.

This paper focuses on developing an openly available dataset of global land cover (LC) labeled imagery from Sentinel-2 satellites at 10 m spatial resolution through Radiant Earth Foundation’s platform to enable fully automated and dynamic LC classification algorithms. The approach for labeling these images uses a combination of machine learning and crowdsourcing to generate a human-verified training dataset. Existing training datasets for LC classification have limitations that do not support development of a global EO-based LC classification algorithm at fine spatial resolutions with high accuracy. These datasets are either generated for specific regions of the world (and therefore lack geo-diversity) or are based on imagery that is not freely available at the global scale (and are therefore not open source) [8, 4, 6]. Moreover, in many cases, very few labeled images are available for a specific class within the dataset, which limits the ability of an ML algorithm to learn the particular features of that class.

2 Data

2.1 Satellite Imagery

Multispectral data from the constellation of Sentinel-2 satellites is publicly available through the European Space Agency (ESA). Atmospherically corrected reflectance images are regularly generated only across Europe, but can be generated manually elsewhere. Therefore, the initial experiments of this study focus on tiles across Europe representing a variety of LC conditions. The same approach will be attempted globally following the development of an atmospheric-correction pipeline. Geo-diversity [15, 3] is an important feature of the end dataset, and the approach will be adapted to ensure high performance around the world.

To produce annual LC labels, images were collected from July 2017 to July 2018 (Sentinel-2 has a revisit rate of 5 days). Images with more than 90% cloud cover were filtered out. Our experiments included using only the four spectral bands of Sentinel-2 images with 10 m resolution (RGB and NIR, or near-infrared) as predictors, as well as combining those with the other six bands at 20 m (in red edge and SWIR, or short-wave infrared), bilinearly resampled to 10 m. Results showed that inclusion of the extra six bands improves classification of pixels with vegetation and snow.
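The band preparation described above can be illustrated with a minimal sketch. It assumes the 10 m and 20 m grids are aligned and differ by an exact factor of two; the function names `upsample_bilinear` and `build_predictors` are hypothetical, and the arrays are synthetic stand-ins rather than Sentinel-2 data.

```python
import numpy as np

def upsample_bilinear(band, factor=2):
    """Bilinearly resample a 2-D band onto a grid `factor` times finer."""
    h, w = band.shape
    # Centres of the fine pixels, expressed in coarse-pixel coordinates.
    rows = (np.arange(h * factor) + 0.5) / factor - 0.5
    cols = (np.arange(w * factor) + 0.5) / factor - 0.5
    # Bilinear interpolation is separable: interpolate along x, then along y.
    # np.interp clamps at the borders, so edge pixels repeat the edge value.
    along_x = np.stack([np.interp(cols, np.arange(w), row) for row in band])
    return np.stack(
        [np.interp(rows, np.arange(h), col) for col in along_x.T], axis=1
    )

def build_predictors(bands_10m, bands_20m):
    """Stack 10 m bands with 20 m bands resampled to 10 m: (n_bands, H, W)."""
    resampled = [upsample_bilinear(band) for band in bands_20m]
    return np.stack(list(bands_10m) + resampled)

# Synthetic stand-ins: four 10 m bands and six 20 m bands over the same area.
rng = np.random.default_rng(0)
bands_10m = rng.random((4, 8, 8))
bands_20m = rng.random((6, 4, 4))
predictors = build_predictors(bands_10m, bands_20m)
print(predictors.shape)
```

In practice a geospatial library would handle the resampling together with reprojection; the separable interpolation here only makes the arithmetic explicit.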

2.2 Reference Data

Figure 1: Example of ground truth labels obtained by mapping GLC labels and filtering by S2 labels

A major challenge for predicting LC labels at the 10 m spatial resolution of Sentinel-2 was preparing labeled training data from quality existing global datasets that are coarser and older. GlobeLand30 is the first 30 m resolution global LC dataset, with 10 classes and over 80% accuracy for the year 2010 [5]. First, GlobeLand30 labels were mapped to the last level of a hierarchical LC taxonomy [13] developed by Radiant Earth Foundation’s Working Group on Machine Learning for Global Development [1] and re-gridded using nearest neighbor interpolation to match Sentinel-2 data. Then, to better reflect Sentinel-2 imagery for the year starting in July 2017, GlobeLand30 labels were filtered by agreement with classes from Sentinel-2’s 20 m scene classifications [7], which are produced in the process of Level-2A atmospheric correction and have been independently validated [9]. The filtered labels are used as ground truth labels for training [Figure 1].
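The two label preparation steps, nearest neighbor re-gridding and filtering by agreement, can be sketched as follows. This is an illustration under stated assumptions: the grids are taken to be aligned so that integer repetition is equivalent to nearest neighbor interpolation, and the class IDs and the `COMPATIBLE` table are hypothetical placeholders, since the text does not list which scene classes count as agreeing with which labels.

```python
import numpy as np

UNCLASSIFIED = 0  # hypothetical code for "no trusted label"

# Hypothetical agreement table: land cover label -> compatible scene classes.
# The actual class pairing used in the study is not given in the text.
COMPATIBLE = {
    1: {4},  # e.g. a vegetated label against a "vegetation" scene class
    2: {6},  # e.g. a water label against a "water" scene class
    3: {5},  # e.g. a bare ground label against a "not vegetated" scene class
}

def regrid_nearest(grid, factor):
    """Nearest neighbor upsampling of an aligned grid by an integer factor."""
    return np.repeat(np.repeat(grid, factor, axis=0), factor, axis=1)

def filter_labels(labels_30m, scene_classes_20m):
    """Keep a 10 m label only where the scene classification agrees with it."""
    labels = regrid_nearest(labels_30m, 3)        # 30 m -> 10 m
    scene = regrid_nearest(scene_classes_20m, 2)  # 20 m -> 10 m
    keep = np.zeros(labels.shape, dtype=bool)
    for label, scene_ids in COMPATIBLE.items():
        keep |= (labels == label) & np.isin(scene, list(scene_ids))
    return np.where(keep, labels, UNCLASSIFIED)

# A 60 m x 60 m toy area: 2 x 2 label pixels and 3 x 3 scene class pixels.
labels_30m = np.array([[1, 2],
                       [3, 1]])
scene_20m = np.array([[4, 4, 6],
                      [4, 6, 6],
                      [5, 5, 4]])
filtered = filter_labels(labels_30m, scene_20m)
print(filtered)
```

Pixels where the two sources disagree fall back to the unclassified code, which is how gaps like those visible in Figure 1 would arise in this sketch.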

Figure 2: RF prediction; most of the unclassified areas in the ground truth data are filled with “woody vegetation” and “artificial bare ground”.

Figure 3: Prediction probabilities, very high for “water” and “snow/ice” and relatively high for other classes.

3 Methodology

For land cover classification on sub-meter resolution satellite imagery, like the ISPRS 2D Semantic Labeling Contest [8], deep fully convolutional neural networks with encoder-decoder architectures have improved the state-of-the-art [2]. However, for land cover classification on Sentinel-2 imagery at 10 m spatial resolution, experts recommend ensemble methods, which are widely adopted in practice [12]. ESA’s Sentinel-2 Global Land Cover (S2GLC) project selected a pixel-based, supervised approach using Random Forests “based on classification accuracy, preservation of class details and processing efficiency” [10]. S2GLC performed tile-wise training, with a large number of samples per class, and then aggregated predictions from individual scenes over a year for annual labels. Accordingly, processing is performed at the scene-level, on pixels with valid reflectance values in areas with <90% average cloud cover (cloud confidence masks are provided during Level-2A processing). Random Forests are trained and tested on class-stratified samples of half the pixels in a scene, with one Sentinel-2 pixel at 10 m for each label pixel at 30 m. Predictions are made on all pixels marked with usable classes during Level-2A processing, including pixels labeled as ‘unclassified’ [Figure 2]. Random Forests also provide probabilities for predicted classes, which are written [Figure 3] and can be used for aggregation of a time series of predictions for a given tile [11].
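A minimal sketch of the scene-level workflow, using scikit-learn on a synthetic scene, may make the steps concrete. Several choices here are assumptions rather than details from the study: the "half the pixels" rule is read as a 50/50 stratified split, the centre pixel of each 3 x 3 block is taken as the one 10 m pixel per 30 m label, and the data, class IDs, and mask fractions are invented for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

UNCLASSIFIED = 0  # hypothetical code for pixels without a trusted label
rng = np.random.default_rng(0)
n_bands, height, width = 10, 60, 60

# Synthetic scene: three classes, each with its own mean reflectance per band.
true_classes = rng.integers(1, 4, size=(height, width))
class_means = rng.random((4, n_bands))
scene = class_means[true_classes].transpose(2, 0, 1)
scene = scene + 0.02 * rng.standard_normal(scene.shape)

# Filtered labels (some pixels unclassified) and a valid-reflectance mask.
labels = true_classes.copy()
labels[rng.random((height, width)) < 0.3] = UNCLASSIFIED
valid = rng.random((height, width)) > 0.05

# Assumed sampling rule: one 10 m pixel (the centre) per 30 m label pixel.
centre = np.zeros((height, width), dtype=bool)
centre[1::3, 1::3] = True

pixels = scene.reshape(n_bands, -1).T  # (n_pixels, n_bands)
flat_labels = labels.ravel()
flat_valid = valid.ravel()
pool = flat_valid & centre.ravel() & (flat_labels != UNCLASSIFIED)

# Class-stratified split: half of the sampled pixels train, half test.
X_train, X_test, y_train, y_test = train_test_split(
    pixels[pool], flat_labels[pool],
    train_size=0.5, stratify=flat_labels[pool], random_state=0,
)
model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

# Predict every valid pixel, including those unclassified in the labels,
# and keep the winning class probability as a per-pixel confidence.
predicted = np.full(height * width, UNCLASSIFIED)
confidence = np.zeros(height * width)
predicted[flat_valid] = model.predict(pixels[flat_valid])
confidence[flat_valid] = model.predict_proba(pixels[flat_valid]).max(axis=1)
predicted_map = predicted.reshape(height, width)
confidence_map = confidence.reshape(height, width)
print(f"test accuracy: {accuracy:.3f}")
```

The per-pixel confidence map is the kind of output that could later feed an aggregation over a time series of scenes; how that aggregation is done is outside this sketch.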

Figure 4: Normalized confusion matrix

4 Results

Our experiments included varying the number of trees for scikit-learn’s Random Forest Classifier, but the library’s default at the time of 10 estimators performed well, achieving 88.75% average model accuracy for the scenes from the four tiles studied. Some classes, like “water” and “snow/ice”, were predicted with high accuracy and high confidence across all scenes, which is expected given their distinct spectral signatures. Other classes, like “wetland” and “(semi) natural vegetation”, are subtler and were expected to be more difficult to classify. Within vegetated classes, “woody vegetation” and “cultivated vegetation” were predicted relatively accurately and were not confused with each other, a result of including the 20 m vegetation red edge bands, resampled to 10 m. “Artificial bare ground” tended to be predicted in regions that are unclassified in the ground truth data, taking over areas of “natural bare ground” and “cultivated vegetation”. This suggests that traces of human activity lead to pixels being classified as “artificial bare ground” in the off-vegetation season.
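For readers who want to reproduce this kind of per-class analysis, the sketch below computes a confusion matrix normalized by reference class, together with overall accuracy. Row normalization is an assumption, since the text does not state how Figure 4 was normalized, and the label vectors are toy values rather than results from the study.

```python
import numpy as np

def normalized_confusion_matrix(y_true, y_pred, n_classes):
    """Confusion matrix with each row (reference class) summing to 1."""
    counts = np.zeros((n_classes, n_classes), dtype=float)
    # Accumulate one count per (reference, predicted) pair.
    np.add.at(counts, (y_true, y_pred), 1)
    row_totals = counts.sum(axis=1, keepdims=True)
    # Classes with no reference pixels keep an all-zero row.
    return np.divide(
        counts, row_totals, out=np.zeros_like(counts), where=row_totals > 0
    )

# Toy reference and predicted class IDs for six pixels.
y_true = np.array([0, 0, 1, 1, 1, 2])
y_pred = np.array([0, 1, 1, 1, 0, 2])
matrix = normalized_confusion_matrix(y_true, y_pred, n_classes=3)
overall_accuracy = float(np.mean(y_true == y_pred))
print(matrix)
print(overall_accuracy)
```

Off-diagonal entries in a row show which classes a reference class is mistaken for, which is the kind of evidence behind statements such as two vegetated classes not being confused with each other.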


Acknowledgments

This study is supported by a grant from Schmidt Futures to Radiant Earth Foundation. The authors would like to thank Dr. Stanislaw Lewinski from the Space Research Centre of the Polish Academy of Sciences, PI of S2GLC, for his recommendations. Participants of the Radiant Earth Foundation working group on “Machine Learning for Global Development” have also contributed to this work through their extensive participation in a workshop organized on this topic in June 2018 [14].


References

  • [1] H. Alemohammad. Radiant.Earth Launches New Technical Working Group on Machine Learning for Global Development, 2018.
  • [2] N. Audebert, B. Le Saux, and S. Lefèvre. Semantic Segmentation of Earth Observation Data Using Multimodal and Multi-scale Deep Networks. In S. Lai, V. Lepetit, K. Nishino, and Y. Sato, editors, Computer Vision – ACCV 2016, Lecture Notes in Computer Science, pages 180–196. Springer, 2017.
  • [3] D. Bollinger. Geo-diversity for better, fairer machine learning, 2018.
  • [4] M. Campos-Taberner, A. Romero-Soriano, C. Gatta, G. Camps-Valls, A. Lagrange, B. Le Saux, A. Ere, A. Boulch, A. Chan-Hon-Tong, S. Herbin, H. Randrianarivo, M. Ferecatu, M. Shimoni, G. Moser, D. Tuia, A. Lagrange, B. Le Saux, A. Beaupére, A. Boulch, A. Chan-Hon-Tong, S. Herbin, and H. Randrianarivo. Processing of Extremely High-Resolution LiDAR and RGB Data: Outcome of the 2015 IEEE GRSS Data Fusion Contest–Part A: 2-D Contest. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 9(12), 2016.
  • [5] J. Chen, J. Chen, A. Liao, X. Cao, L. Chen, X. Chen, C. He, G. Han, S. Peng, M. Lu, W. Zhang, X. Tong, and J. Mills. Global land cover mapping at 30m resolution: A POK-based operational approach. ISPRS Journal of Photogrammetry and Remote Sensing, 103:7–27, May 2015.
  • [6] I. Demir, K. Koperski, D. Lindenbaum, G. Pang, J. Huang, S. Basu, F. Hughes, D. Tuia, and R. Raskar. DeepGlobe 2018: A Challenge to Parse the Earth through Satellite Images, May 2018.
  • [7] ESA. Sentinel-2 Level-2A Algorithm, 2018.
  • [8] ISPRS. ISPRS 2D semantic labeling dataset, 2018.
  • [9] M. Main-Knorn, B. Pflug, J. Louis, V. Debaecker, U. Müller-Wilm, and F. Gascon. Sen2Cor for Sentinel-2. In L. Bruzzone, F. Bovolo, and J. A. Benediktsson, editors, Image and Signal Processing for Remote Sensing XXIII, page 3. SPIE, Oct. 2017.
  • [10] R. Malinowski, A. Nowakowski, E. Kukawska, M. Rybicki, M. Krupiński, and S. Lewiński. ESA SEOM Sentinel-2 Global Land Cover (S2GLC) Abstract. Technical report, ESA, 2018.
  • [11] A. Nowakowski, M. Rybicki, E. Kukawska, R. Malinowski, M. Krupiński, and S. Lewiński. Aggregation of Sentinel-2 time series classifications as a solution for multitemporal analysis. In L. Bruzzone, F. Bovolo, and J. A. Benediktsson, editors, Image and Signal Processing for Remote Sensing XXIII, page 11. SPIE, Oct. 2017.
  • [12] Radiant Earth Foundation. Notes of the Working Group on Machine Learning Algorithms for Global Land Cover Classification, 2018.
  • [13] Radiant Earth Foundation. Radiant Earth Foundation Hierarchical LC Taxonomy, 2018.
  • [14] Radiant Earth Foundation. Radiant Earth Foundation Working Group on "Machine Learning for Global Development", 2018.
  • [15] S. Shankar, Y. Halpern, E. Breck, J. Atwood, J. Wilson, and D. Sculley. No Classification without Representation: Assessing Geodiversity Issues in Open Data Sets for the Developing World, Nov. 2017.