In recent years, a new literature has emerged to combine computer vision methods with satellite imagery and apply these to important challenges in developing country agriculture. One set of studies focuses on the potential to use computer vision as a less expensive and more accurate substitute for traditional forms of data collection such as crop cutting and surveys. Burke and Lobell (2017) show that satellite images are as accurate as survey data in identifying variation in agricultural yields for smallholder farmers. Rustowicz et al. (2019)
provide the first crop type semantic segmentation dataset for smallholder farms in Africa, with labeled data from South Sudan and Ghana, and train a neural network that outperforms the previous state of the art.Kamilaris and Prenafeta-Boldú (2018)
review the use of convolutional neural networks (CNNs) in agricultural applications, and note that while CNNs often outperform other methods, data quality is critical to success. Our study also builds on a growing literature applying computer vision methods to aerial data more generally.Maggiori et al. (2016) introduce a CNN-based framework for dense labeling of remote-sensing derived images, and Mnih and Hinton (2012)
find that using an appropriately chosen loss function can substantially improve results when using aerial data with noisy labels.
Weather risk is one of the most important challenges affecting developing country farmers, and it is especially serious for the pastoralists of Northern Kenya. Droughts frequently kill large numbers of livestock, which are the primary assets owned by pastoralist households and a key source of nutrition. Worse, climate change is expected to increase drought frequency and severity Seneviratne et al. (2017)
. Index-based insurance is designed to reduce the cost of droughts by using some measurement (the index) to predict average losses in a given region. When average losses reach a threshold, index insurance products provide farmers or pastoralists with cash to offset their losses. Early index insurance products used rainfall monitors and crop cuttings to estimate average losses; more recent products have relied on remote sensing methods and satellite data. In the region studied in this paper, the International Livestock Research Institute’s (ILRI) Index Based Livestock Insurance (IBLI) program has been in operation since 2011(Chantarat et al., 2013). It relies on an index derived from the Normalized Differenced Vegetation Index (NDVI), a remote sensing-based measure that has been shown to accurately capture changes in green vegetation.
As studied in detail by Jensen et al. (2019), index accuracy and cost are important determinants of the value of insurance for farmers. This study explores whether a computer vision-based approach can outperform the NDVI-based approach currently used for the IBLI program in Northern Kenya. Index accuracy plays a strong role in determining the value of index insurance to farmers, and poor accuracy is a major factor limiting farmer adoption of insurance (Carter et al., 2017). Our results demonstrate that computer vision methods can generate more accurate indices for insurance purposes than existing methods and provide proof-of-concept for a machine learning-based insurance product.
The dataset relies on two sources: 1) ground-based labels and photos from pastoralists assessing forage conditions on site and 2) satellite images from NASA’s LANDSAT mission.111In recent years, much higher resolution satellite images have become available from companies like Planet and Digital Globe. This paper uses 30m resolution data freely available from LANDSAT because higher resolution images were not available during the time period when the ground-level labels were generated. We expect that even better results will be possible using higher-resolution images in the future.
We select satellite images centered on the geolocations for which we have labeled ground-level data. This approach yields a supervised labeling of forage quality on satellite images, and it is highly generalizable. In any cases where ground-level data on an agricultural outcome is available (e.g. crop yields, disease status), this method can be used to develop a satellite-based index. Furthermore, it enables model ensembling or transfer learning across countries or indices (e.g. yields of related crops), which could improve model robustness and reduce the need for new supervised labels.
2.1 Ground-level Labels
A core challenge in applying computer vision methods to satellite data, particularly in developing countries, is the need for large numbers of accurate labels. Generating these labels is especially difficult because human experts cannot explicitly label aerial data with outcomes like poverty level or forage quality. A number of creative approaches have been used to circumvent this problem. Sheehan et al. (2019) combine georeferenced Wikipedia entries with satellite images to predict community-level asset wealth and education, and Jean et al. (2016) and Tingzon et al. (2019) use nighttime lights data to train models that estimate consumption expenditure and asset wealth.
We tackle the labeling problem by leveraging a set of labels generated on the ground by pastoralist nomads–experts in assessing forage quality in the region of Northern Kenya (data collected by ILRI in 2015). At each reporting point, pastoralists were asked to estimate the number of cows that could eat for a single day within 20 meters of their location on a scale of 0-3+. Labels are discrete, so pastoralists must pick either 0, 1, 2, or 3+ based on their best estimate.
2.2 Ground-level Images
At each labeled location, pastoralist data collectors also took a photo of the surroundings from their vantage point. We used a subset of these photos to validate the initial pastoralist estimates through Amazon Mechanical Turk. They are included with the dataset as a potential tool for checking data quality. This verified subset of images also forms the validation set for the benchmark.
2.3 Satellite Images
For each labeled location, we downloaded a 65x65 pixel satellite image from within one week of the date on which the label was generated. When images were not available that met this criterion, the observation was dropped. The resulting dataset contains over 100,000 labeled images.
LANDSAT images are relatively low spatial resolution: each pixel represents a 30 meter square. However, each pixel contains more data than in typical images: in addition to the standard 3 RGB bands, LANDSAT contains 8 other bands with information outside of the visible spectrum. The near-infrared band is of particular interest, because it is known to be helpful in measuring vegetation and is an important part of the index currently used in the IBLI program.
3 Collaborative Benchmark
The data described above are part of an ongoing collaborative benchmark hosted by Weights & Biases222See https://www.wandb.com/articles/droughtwatch to learn more about the benchmark. The platform differs from many others in the sense that all participants can share their code and training workflows, discuss their approaches, and potentially build on each others’ findings. This collaborative structure is designed to promote faster progress and knowledge sharing across participants (Biewald, 2020).
The highest accuracy reported at the time of writing is 77.8%. The model is based on the EfficientNet architecture proposed by Tan and Le (2019).333Details on this as well as other leading models can be examined at https://app.wandb.ai/wandb/droughtwatch/benchmark/leaderboard
This model slightly outperformed several ResNet-based models and a tuned version of the CNN baseline. The CNN baseline is an intentionally simple example for the benchmark: three convolutional blocks with max pooling, dropout on blocks 2 and 3, and two fully-connected layers before the softmax. The layer sizes and hyperparameters are easily configurable to encourage participants to explore model variants. All of the models in the leaderboad significantly outperform an NDVI-based model, which achieved a predictive accuracy of just 67%.
Since the difference between a basic CNN and a ResNet is only a few percent accuracy, there are clear opportunities for improvement: tuning different network architectures, loss functions, optimizers, and other hyperparameters. On the data side, promising next steps include a finer-grain analysis of the spectral bands (of the 11, a few seem to add more noise than signal, such that removing them improves accuracy), filtering out images with obscuring clouds, data augmentation (e.g. rotate, flip), and addressing the class imbalance (roughly 60% of the data is of class 0, classes 1 and 2 have 15% each, and the remaining 10% is class 3+). The benchmark collaboration will continue to support and cross-pollinate these model improvements.
In this section we explore several extensions to the modeling. First, we explore how much predictive accuracy is lost when the images are center-cropped, reducing context. There is little loss of accuracy even when cropping to 35x35 pixel rather than 65x65 pixel images. Second, we explore reframing the problem as an ordinal regression problem to ensure that predictions are close even if not perfect.
4.1 Image Size
Because each pixel is 30 meters across, the 65x65 pixel images in the dataset represent nearly 1 kilometer squares. While spatial context likely helps generate better predictions, it is intuitive that these images may be larger than necessary. Sensitivity tests indicate that networks trained on much smaller squares are able to perform similarly well: prediction accuracy did not decrease substantially in our tests until images were smaller than 35x35. Smaller images also reduce training and prediction time (since new image data must be downloaded from a server).
The benchmark and initial analysis relied on standard neural network architecture for categorical outputs: labels encoded as one-hot vectors and the final layer as a softmax activating at most one neuron in the output layer. A prediction is accurate only if it is exactly correct, which means that mistakenly categorizing a ‘3’ as a ‘0’ is scored identically to categorizing a ‘2’ as a ‘1’, even though the second is intuitively a smaller error.
Ordinality is especially important for this problem for two reasons. First, there is inherent subjectivity and uncertainty in the labeling process. Two expert pastoralists could look at the same 20 meter circle, and one might estimate it could support one cow while the other would estimate two. It’s much less likely that one would say ‘3+’ and the other ‘0.’ Second, for effective index insurance it is not essential that we have an exactly correct estimate of conditions at each point in space and time, but it is essential that our index only make small errors. While a small error might lead the insurance product to slightly underpay in the event of a drought, a large error might lead to a complete failure to pay, defeating the purpose of the product.
We follow the approach described by Cheng et al. (2008)
, replacing the one-hot vectors with a format that preserves information on ordinality within the sample. For example, a location that can support 0 cattle is encoded [0,0,0], 1 is encoded as [1,0,0], 2 as [1,1,0], and 3+ as [1,1,1]. Each entry in the output vector can be thought of as predicting whether the carrying capacity is greater than or equal to some value: a ‘1’ in the first position of the array means that the location can support one or more animals; a ‘1’ in the second position means that it can support two or more, etc. Rather than a softmax, the final layer uses a sigmoid activation function to activate more than one output.
Ordinal accuracy can be evaluated with a binary cross-entropy loss function and binary accuracy to calculate the share of matching outputs between the prediction and target. In other words, a ‘2’ categorized as a ‘1’ receives an accuracy of 66.66%, while a ‘3’ categorized as a ‘0’ receives an accuracy of 0% . Scored this way, our ordinal model achieves a validation accuracy of 87.88%. Because of the difference in scoring and loss functions, the accuracy from the ordinal regressions is not directly comparable to the categorical models. However, evaluating the best categorical model with ordinal accuracy would only improve the metric: some of the errors previously counted as 0% would add 33.33% or 66.66%, depending on the distance between the labels.
Computer vision methods have great promise as a means of shielding farmers from the negative effects of climate change due to their ability to flexibly map satellite data to information on yields, forage quality, or other measures collected from the ground. This paper provides an initial dataset for exploring the potential of computer vision for index insurance and proof of concept that computer vision methods can outperform existing remote sensing techniques. Once trained and deployed, computer vision models can assess conditions on demand, making it possible to monitor more frequently and compensate the insured more quickly. By increasing both the accuracy and the speed of insurance payments, computer vision has the potential to substantially increase the benefit of insurance for farmers.
We would like to thank Nathan Jensen at the International Livestock Research Institute for making the forage quality data we describe in this paper available for the benchmark competition we describe in this paper.
The forage quality data used in this paper was collected through a research collaboration between the International Livestock Research Institute, Cornell University, and UC San Diego. It was supported by the Atkinson Centre for a Sustainable Future’s Academic Venture Fund, Australian Aid through the AusAID Development Research Awards Scheme Agreement No. 66138, the National Science Foundation (0832782, 1059284, 1522054), and ARO grant W911-NF-14-1-0498.
- Experiment tracking with weights and biases. Note: Software available from wandb.com External Links: Cited by: §3.
- Satellite-based assessment of yield variation and its determinants in smallholder african systems. Proceedings of the National Academy of Sciences 114 (9), pp. 2189–2194. Cited by: §1.
- Index insurance for developing country agriculture: a reassessment. Annual Review of Resource Economics 9, pp. 421–438. Cited by: §1.
- Designing index-based livestock insurance for managing asset risk in northern kenya. Journal of Risk and Insurance 80 (1), pp. 205–237. Cited by: §1.
- A neural network approach to ordinal regression. In 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), pp. 1279–1284. Cited by: §4.2.
- Combining satellite imagery and machine learning to predict poverty. Science 353 (6301), pp. 790–794. Cited by: §2.1.
- Does the design matter? comparing satellite-based indices for insuring pastoralists against drought. Ecological economics 162, pp. 59–73. Cited by: §1.
- A review of the use of convolutional neural networks in agriculture. The Journal of Agricultural Science 156 (3), pp. 312–322. Cited by: §1.
- Convolutional neural networks for large-scale remote-sensing image classification. IEEE Transactions on Geoscience and Remote Sensing 55 (2), pp. 645–657. Cited by: §1.
- Learning to label aerial images from noisy data. In Proceedings of the 29th International conference on machine learning (ICML-12), pp. 567–574. Cited by: §1.
Semantic segmentation of crop type in africa: a novel dataset and analysis of deep learning methods. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 75–82. Cited by: §1.
- Changes in climate extremes and their impacts on the natural physical environment. Cited by: §1.
- Predicting economic development using geolocated wikipedia articles. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2698–2706. Cited by: §2.1.
- EfficientNet: rethinking model scaling for convolutional neural networks. In International Conference on Machine Learning, pp. 6105–6114. Cited by: §3.
- Mapping poverty in the philippines using machine learning, satellite imagery, and crowd-sourced geospatial information. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 42 (4/W19). Cited by: §2.1.