Floods are among the most destructive extreme weather events, affecting millions of people each year. Satellite imagery is one of the most important sources of information for disaster response. Optical (visible and infrared) and synthetic aperture radar (SAR) imagery are routinely used to determine flood extent and to help direct relief efforts.
Many countries do not have direct access to satellite imagery in the event of a disaster. To address this, organisations such as the International Charter “Space and Major Disasters” (https://disasterscharter.org), initiated by the European Space Agency (ESA), liaise with space agencies and associated commercial organisations to produce free high-resolution maps for users in the field. Despite best efforts, it can take many days to provide actionable data because of the time needed for satellite tasking and image analysis. Commercial organisations are able to provide the highest-frequency (daily) and highest-resolution (sub-metre) images, but these are only freely available for a limited period of time during disaster events. ESA’s Copernicus program provides open data globally at 10 m resolution, but the optical component, Sentinel-2 (S2), has a worst-case revisit time of around five days at the equator. This leads to wait periods much longer than two days in areas such as Africa, where other alternatives for first response are limited.
In this paper we investigate how a constellation of small, inexpensive satellites assembled from commercial off-the-shelf (COTS) hardware, also known as CubeSats, could be used for disaster response, using flooding as a case study. The main advantage of CubeSats is improved revisit time through larger constellations of satellites. Around 30 CubeSats similar to ESA’s upcoming FSSCat mission could be launched for the cost of a single S2 satellite, reducing the nominal revisit time from 5 days to around 8 hours for the same cost. However, CubeSats can have very limited downlink bandwidth, on the order of 1 Mbps. To reduce the amount of data being transferred, we suggest performing flood mapping (an image segmentation task) on board the satellite and transmitting only the final map.
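The revisit figures quoted above can be reproduced with a simple scaling argument. The following is a minimal sketch, not the mission analysis itself. It assumes that revisit time scales inversely with the number of evenly phased satellites, that each CubeSat covers the same swath as S2, and that a single S2 satellite revisits the equator every 10 days (so that the two-satellite S2 constellation gives 5 days). The function and constant names are illustrative.

```python
def constellation_revisit_hours(single_satellite_revisit_days: float,
                                n_satellites: int) -> float:
    """Nominal revisit time, in hours, for n evenly phased identical satellites."""
    if n_satellites < 1:
        raise ValueError("n_satellites must be at least 1")
    return single_satellite_revisit_days * 24.0 / n_satellites


# Assumption: one S2-like satellite on its own revisits the equator every 10 days.
SINGLE_SATELLITE_REVISIT_DAYS = 10.0

s2_pair_hours = constellation_revisit_hours(SINGLE_SATELLITE_REVISIT_DAYS, 2)
cubesat_hours = constellation_revisit_hours(SINGLE_SATELLITE_REVISIT_DAYS, 30)

print(f"two S2 satellites: {s2_pair_hours / 24:.0f} days")
print(f"30 CubeSats:       {cubesat_hours:.0f} hours")
```

Under these assumptions the sketch gives 5 days for two S2 satellites and 8 hours for 30 CubeSats. A real constellation would deviate from this because of orbit geometry, swath width and cloud cover.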
We optimised our application for ESA’s Sat-1, part of the FSSCat mission, which is scheduled to be launched in 2020 (FSSCat has been developed by ESA and other partners as a technology demonstrator). Among other sensors, FSSCat will carry a HyperScout 2 49-band hyperspectral camera ( ground sample distance) which features an on-board Intel® Movidius™ Myriad™ 2 chip [7, 8]. Using this capability, a 2-bit flood map (up to 4 classes) would reduce the amount of data being downlinked by a factor of 100 (assuming 49 16-bit channels).
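To make the downlink argument concrete, the per-pixel saving can be worked out directly from the numbers in the text. This is a back-of-the-envelope sketch: it assumes no compression, no protocol overhead, and a sustained 1 Mbps link, and it uses the 12 Mpx image size mentioned later in the paper. The function names are illustrative.

```python
def downlink_reduction_factor(n_bands: int, bits_per_band: int,
                              bits_per_label: int) -> float:
    """Ratio of raw bits per pixel to label-map bits per pixel."""
    return (n_bands * bits_per_band) / bits_per_label


def downlink_seconds(n_pixels: int, bits_per_pixel: int, link_bps: float) -> float:
    """Time to transmit an uncompressed image over a link of the given rate."""
    return n_pixels * bits_per_pixel / link_bps


factor = downlink_reduction_factor(n_bands=49, bits_per_band=16, bits_per_label=2)

N_PIXELS = 12_000_000   # approximate HyperScout-2 image size quoted in the text
LINK_BPS = 1_000_000    # assumed sustained 1 Mbps downlink

raw_seconds = downlink_seconds(N_PIXELS, 49 * 16, LINK_BPS)
map_seconds = downlink_seconds(N_PIXELS, 2, LINK_BPS)

print(f"raw/label bit ratio: {factor:.0f}")
print(f"raw image: {raw_seconds / 3600:.1f} h, flood map: {map_seconds:.0f} s")
```

Under these idealised assumptions the raw per-pixel ratio is 392, which is higher than the factor of 100 quoted above. The quoted figure may therefore be a conservative estimate or may include overheads that the text does not spell out.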
The contributions of this paper are as follows:
We introduce a new dataset, called WorldFloods, containing pairs of Sentinel-2 images and flood extent maps covering 159 global flood events.
We train convolutional neural networks for flood segmentation and compare their performance to standard baseline methods like NDWI.
We design our models to process large volumes of hyperspectral data, yet fit the constraints of hardware deployed on the satellite, and report test results on such hardware.
2 Flood mapping and related work
Water mapping, of which flood mapping is a special case, is a semantic segmentation task that has been studied for decades. A simple approach to water mapping is to compute indices like the Normalised Difference Water Index (NDWI), which exploits the strong absorption of light by water bodies in the green and infrared part of the electromagnetic spectrum. However, this method can perform poorly because the spectral profile of flood water varies widely due to the presence of debris, pollutants and suspended sediments.
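As an illustration of the index-based approach, the NDWI is commonly computed as (green - NIR) / (green + NIR), with pixels above a threshold labelled as water. The sketch below uses NumPy. The `eps` term and the function names are illustrative choices, and the reflectance values are made up. For Sentinel-2 the green and near-infrared inputs would usually be bands B3 and B8.

```python
import numpy as np


def ndwi(green, nir, eps=1e-6):
    """Normalised Difference Water Index, computed per pixel."""
    green = np.asarray(green, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    # eps avoids division by zero on pixels where both bands are zero
    return (green - nir) / (green + nir + eps)


def water_mask(green, nir, threshold=0.0):
    """Boolean mask that is True where NDWI exceeds the threshold."""
    return ndwi(green, nir) > threshold


# Made-up reflectances: a clear-water pixel and a vegetated pixel
green = np.array([0.10, 0.05])
nir = np.array([0.02, 0.30])

index = ndwi(green, nir)
mask = water_mask(green, nir)
print(index, mask)
```

Sediment-laden flood water tends to reflect more in the near-infrared than clear water, which pushes the index towards or below zero and is one reason a fixed threshold can miss flooded pixels.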
More sophisticated segmentation techniques include rule-based classifiers, which use fixed or tuned thresholds on indices; classical supervised machine learning; and, recently, deep learning [13, 18]. State-of-the-art results in image segmentation are now routinely achieved using fully convolutional neural networks (FCNNs). Most segmentation networks can be described as encoder-decoder architectures and include the popular U-Net among many others.
3 WorldFloods dataset
The development and evaluation of flooding response systems has so far been constrained by the use of datasets of limited geographical scope, with studies often considering only a single flood event. It is unclear whether such models would generalise accurately to the rest of the world, given variations in topography and land cover. To address this we collated a new global dataset called WorldFloods, which we believe is the largest of its kind.
WorldFloods contains 564 flood extent maps created either manually or semi-automatically, where a human validated machine-generated maps. The dataset covers 159 floods that occurred between November 2015 and March 2019. We sourced all maps from three organisations: the Copernicus Emergency Management Service (Copernicus EMS), the flood portal of UNOSAT, and the Global Flood Inundation Map Repository (GLOFIMR). The geographical distribution of flood maps is shown in Figure 1.
For each flood event we provide the raw 13-band S2 image closest in time after the event, and a rasterised segmentation ground truth (cloud, water and land) at 10 m resolution. We generated cloud masks using s2cloudless (https://github.com/sentinel-hub/sentinel2-cloud-detector). We manually validated the data to account for gross errors such as missing water bodies or invalid intensities. Remaining sources of error include label noise and temporal misalignment, e.g. the closest S2 image may have been acquired 5–6 days after the map was produced. While the labels in the training set may be noisy, we wanted to ensure that the test set provides a fairly clean measurement of the true performance of our system. To this end, we manually selected test images from flood extent maps that were derived from S2 images and had no misalignment. To avoid data leakage, we did not include test set countries in the training set. Additionally, our test set does not contain any flood maps from events in the training and validation sets. Table 1 shows the train/validation/test statistics.
| Dataset | Flood events | Flood maps | 256x256 patches | Water/flood pixels (%) | Land pixels (%) | Cloud pixels (%) | Invalid pixels (%) |
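The leakage rule described above (no test countries in training, and no shared flood events) can be expressed as a simple filter. The sketch below is only an illustration of that rule. The record layout, the keys `map_id`, `event` and `country`, and the example values are all hypothetical and do not reflect the actual WorldFloods metadata format.

```python
def split_by_country(flood_maps, test_countries):
    """Hold out every map from the test countries, plus any event they touch."""
    held_out = set(test_countries)
    test = [m for m in flood_maps if m["country"] in held_out]
    test_events = {m["event"] for m in test}
    train = [
        m for m in flood_maps
        if m["country"] not in held_out and m["event"] not in test_events
    ]
    return train, test


# Hypothetical records; identifiers and countries are placeholders
maps = [
    {"map_id": "a", "event": "E1", "country": "X"},
    {"map_id": "b", "event": "E1", "country": "Y"},  # same event as a test map
    {"map_id": "c", "event": "E2", "country": "Y"},
    {"map_id": "d", "event": "E3", "country": "Z"},
]

train, test = split_by_country(maps, test_countries=["X"])
print([m["map_id"] for m in train], [m["map_id"] for m in test])
```

In this toy example map `b` is dropped from training even though its country is not held out, because it belongs to an event that appears in the test set.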
In order to demonstrate that an FCNN-based flood detection model can segment floods accurately and could be deployed on Sat-1, we first train FCNN models on WorldFloods at its original resolution (10 m). We then train models on degraded imagery, mimicking the resolution of HyperScout-2 (80 m). We also verify that our trained (degraded) models can be run on an Intel® Movidius™ Myriad™ 2 and measure the processing speed.
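One simple way to mimic the coarser sensor is to average non-overlapping blocks of pixels. The sketch below assumes this block-mean approach with a factor of 8 (10 m to 80 m). The text does not state which resampling method was actually used, so this is an illustrative choice rather than the authors' pipeline.

```python
import numpy as np


def degrade(image, factor=8):
    """Downsample a (bands, height, width) array by averaging factor x factor blocks.

    Rows and columns that do not fill a complete block are cropped.
    """
    bands, height, width = image.shape
    h_out, w_out = height // factor, width // factor
    cropped = image[:, :h_out * factor, :w_out * factor]
    blocks = cropped.reshape(bands, h_out, factor, w_out, factor)
    return blocks.mean(axis=(2, 4))


# Synthetic 13-band image, 16 x 16 pixels
image = np.arange(13 * 16 * 16, dtype=np.float64).reshape(13, 16, 16)
coarse = degrade(image, factor=8)
print(coarse.shape)
```

The segmentation labels would need a matching rule for the coarse grid, for example a majority vote within each block. That choice is also not specified in the text.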
We focus on the segmentation accuracy of the water/flood class by measuring precision, recall and the Intersection over Union (IoU). Since missing flooded areas (false negatives) is more problematic than over-predicting floods (false positives), high recall is preferred to high precision. Provided the recall was over 95%, we found that the IoU was a good compromise.
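For reference, the three metrics can be computed from binary masks of the water/flood class as follows. This is a minimal NumPy sketch. The optional `valid` mask for excluding cloud or invalid pixels is an assumption about how such pixels might be handled, not a description of the evaluation code.

```python
import numpy as np


def water_metrics(pred, truth, valid=None):
    """Precision, recall and IoU for the water class from boolean masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if valid is None:
        valid = np.ones_like(truth, dtype=bool)
    tp = int(np.sum(pred & truth & valid))    # water predicted and present
    fp = int(np.sum(pred & ~truth & valid))   # water predicted but absent
    fn = int(np.sum(~pred & truth & valid))   # water present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "iou": iou}


# An over-predicting model: every water pixel is found, plus one false alarm
pred = [1, 1, 1, 0, 0]
truth = [1, 1, 0, 0, 0]
metrics = water_metrics(pred, truth)
print(metrics)
```

The example shows why IoU is a useful companion to recall: over-prediction leaves recall at 1.0 but is penalised by IoU through the false-positive term.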
As baselines, we use NDWI (S2 bands 3 and 8, i.e. green and near-infrared) and a linear model (all S2 bands) trained on WorldFloods. A range of NDWI thresholds have been suggested [14, 15, 16]. In order to set a stronger baseline, we also compute results where the threshold is tuned for each image, representing the absolute best-case performance for the NDWI. We compare our baselines to two FCNNs: a simple CNN (SCNN) comprising four convolutional layers (0.26M parameters) and a U-Net (7.8M parameters).
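The per-image tuned baseline can be sketched as a search over candidate thresholds that keeps the one with the best score against the ground truth. The sketch below assumes IoU as the tuning criterion and a fixed grid of 201 candidates. The text does not state either choice, so both are illustrative.

```python
import numpy as np


def iou(pred, truth):
    """Intersection over Union of two boolean masks."""
    union = np.sum(pred | truth)
    return float(np.sum(pred & truth) / union) if union else 1.0


def best_ndwi_threshold(ndwi_image, truth, candidates=None):
    """Return the (threshold, IoU) pair that maximises IoU for one image."""
    ndwi_image = np.asarray(ndwi_image, dtype=np.float64)
    truth = np.asarray(truth, dtype=bool)
    if candidates is None:
        candidates = np.linspace(-1.0, 1.0, 201)
    scores = [iou(ndwi_image > t, truth) for t in candidates]
    best = int(np.argmax(scores))
    return float(candidates[best]), scores[best]


# Made-up NDWI values: two flooded pixels have a negative index
ndwi_image = np.array([-0.6, -0.3, -0.2, 0.4])
truth = np.array([False, True, True, True])

fixed_score = iou(ndwi_image > 0.0, truth)
tuned_threshold, tuned_score = best_ndwi_threshold(ndwi_image, truth)
print(fixed_score, tuned_threshold, tuned_score)
```

Because the threshold is chosen using the ground truth of the very image being evaluated, this baseline is an upper bound on what a single NDWI threshold could achieve and is not deployable in practice.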
Models were trained from scratch for 40 epochs using all 13 S2 bands with input patches of size 256x256 for 10 m data or 64x64 for 80 m data (2.5 km x 2.5 km). In order to achieve models with high recall we used a cross-entropy loss function that weights each class by the inverse of the observed frequency in Table 1, combined with a Dice loss. Augmentation was applied during training, including flips and rotations, per-channel jitter, Poisson noise and brightness/contrast adjustments.
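The loss described above can be sketched in NumPy as follows. Several details are assumptions made for illustration: the class fractions are made up rather than taken from Table 1, the weights are normalised to sum to one, the Dice term is a plain soft Dice averaged over classes rather than the generalised variant, and the two terms are summed with equal weight.

```python
import numpy as np


def inverse_frequency_weights(class_fractions):
    """Class weights proportional to 1 / frequency, normalised to sum to 1."""
    weights = 1.0 / np.asarray(class_fractions, dtype=np.float64)
    return weights / weights.sum()


def weighted_cross_entropy(probs, labels, weights, eps=1e-7):
    """probs: (n_pixels, n_classes) softmax outputs; labels: (n_pixels,) ints."""
    p_true = probs[np.arange(labels.size), labels]
    w = weights[labels]
    return float(np.sum(-w * np.log(p_true + eps)) / np.sum(w))


def dice_loss(probs, labels, eps=1e-7):
    """One minus the soft Dice coefficient, averaged over classes."""
    one_hot = np.eye(probs.shape[1])[labels]
    intersection = np.sum(probs * one_hot, axis=0)
    denominator = np.sum(probs, axis=0) + np.sum(one_hot, axis=0)
    return float(1.0 - np.mean((2.0 * intersection + eps) / (denominator + eps)))


def combined_loss(probs, labels, weights):
    """Equal-weight sum of the two terms (the weighting is an assumption)."""
    return weighted_cross_entropy(probs, labels, weights) + dice_loss(probs, labels)


# Made-up class fractions for (land, water, cloud)
weights = inverse_frequency_weights([0.60, 0.05, 0.35])
labels = np.array([0, 0, 1, 2])
perfect = np.eye(3)[labels]             # confident and correct predictions
uniform = np.full((4, 3), 1.0 / 3.0)    # uninformative predictions

print(weights, combined_loss(perfect, labels, weights),
      combined_loss(uniform, labels, weights))
```

With these weights, an error on a rare water pixel costs more than an error on a land pixel, which pushes the model towards higher recall on the water class.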
| NDWI (thresh 0) | NDWI (tuned) | Linear | SCNN | U-Net |
Table 2 shows the IoU and recall for the different models and baselines. Our three models (Linear, SCNN and U-Net) all have a recall above 95%, whereas the tuned NDWI has a recall of 93.5%; NDWI without tuning generalises poorly, we suspect due to muddy water. FCNN models performed best, although there was only a small increase in performance between SCNN and U-Net, despite U-Net having 30x more parameters. The drop in performance from 10 m to 80 m is around 2 points for FCNN models, which is acceptable taking into account that the spatial resolution is 8 times worse. Figure 3 shows the precision and recall for different thresholds on the water/flood class; again, our trained models beat NDWI and larger models tend to perform better.
The SCNN model was selected to be tested on the Myriad 2 chip due to its lower computational footprint compared to U-Net (1FLOPS vs 2.68FLOPS for a 64x64x13 input). Figure 2 shows the images segmented using the Myriad 2. In general, the model over-predicts water content. False positives are mostly clustered in the surroundings of water bodies and in cloud shadows. This model segments a 12 Mpx image, approximately the size acquired by HyperScout-2, in less than one minute.
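The one-minute figure can be turned into a rough per-patch throughput requirement. The sketch below assumes the image is tiled into non-overlapping 64x64 patches and uses a hypothetical 3000 x 4000 pixel layout for the 12 Mpx image. The actual image dimensions and tiling strategy are not given in the text.

```python
import math


def patch_count(height, width, patch=64):
    """Number of non-overlapping patches needed to cover the image."""
    return math.ceil(height / patch) * math.ceil(width / patch)


def required_patches_per_second(height, width, budget_seconds=60.0, patch=64):
    """Minimum sustained throughput needed to finish within the time budget."""
    return patch_count(height, width, patch) / budget_seconds


HEIGHT, WIDTH = 3000, 4000  # hypothetical layout of a 12 Mpx image

n_patches = patch_count(HEIGHT, WIDTH)
rate = required_patches_per_second(HEIGHT, WIDTH)
print(n_patches, round(rate, 1))
```

Under these assumptions the chip would need to sustain roughly 50 patches per second. Overlapping tiles, which are often used to reduce edge artefacts, would raise that figure.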
We have demonstrated that accurate flood segmentation from low-resolution imagery is feasible in orbit using available hardware. Our models outperform standard baselines and compare favourably with human annotation, while being efficient enough to run on board with current hardware. We are currently performing a more rigorous hyper-parameter search over a larger number of models, and we hope to release both the pipeline code and the WorldFloods dataset shortly as a tool to foster research in disaster response.
This research was conducted at the Frontier Development Lab (FDL), Europe. The authors gratefully acknowledge support from the European Space Agency, Google Inc., Kellogg College, University of Oxford and other organisations and mentors who supported FDL Europe 2019. We would like to specially thank Guy Schumann and Yarin Gal for their support and feedback during the development of this work. Gonzalo Mateo-Garcia has been partially supported by the Spanish Ministry of Science, Innovation and Universities (MINECO, TEC2016-77741-R, ERDF) and the European Social Fund.
[1] (2012) ESA’s Sentinel missions in support of Earth system science. Remote Sensing of Environment 120, pp. 84–90.
[2] (2018) FSSCAT, the 2017 Copernicus Masters’ “ESA Sentinel Small Satellite Challenge” Winner: A Federated Polar and Soil Moisture Tandem Mission Based on 6U Cubesats. In IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium, pp. 8285–8287.
[3] (2015) The human cost of weather-related disasters 1995-2015. United Nations Office for Disaster Risk Reduction.
[4] (2018) Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 801–818.
[5] (2019) Copernicus Emergency Management System. https://emergency.copernicus.eu/. Accessed: 2019-09-15.
[6] (2012) Sentinel-2: ESA’s optical high-resolution mission for GMES operational services. Remote Sensing of Environment 120, pp. 25–36.
[7] In-orbit demonstration of artificial intelligence applied to hyperspectral and thermal sensing from space. In CubeSats and SmallSats for Remote Sensing III, Vol. 11131, pp. 111310C.
[8] (2019) HyperScout 2 highly integration of hyperspectral and thermal infrared technologies for a miniaturized EO imager. In Living Planet Symposium.
[9] (2017) A review on deep learning techniques applied to semantic segmentation. arXiv preprint arXiv:1704.06857.
[10] (2019) Global Flood Inundation Map Repository. https://sdml.ua.edu/glofimr/. Accessed: 2019-09-15.
[11] (2017) E2mC: improving emergency management service practice through social media and crowdsourcing analysis in near real time. Sensors 17 (12), pp. 2766.
[12] (2000) CubeSat: A new generation of picosatellite for education and industry low-cost space experimentation. In 14th Annual/USU Conference on Small Satellites.
[13] (2017) Surface water mapping by deep learning. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 10 (11), pp. 4909–4918.
[14] (1996) The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features. International Journal of Remote Sensing 17 (7), pp. 1425–1432.
[15] (2013) Using the Normalized Difference Water Index (NDWI) within a Geographic Information System to Detect Swimming Pools for Mosquito Abatement: A Practical Approach. Remote Sensing 5 (7), pp. 3544–3561.
[16] (2015) Flood monitoring and damage assessment using water indices: a case study of Pakistan flood-2012. The Egyptian Journal of Remote Sensing and Space Science 18 (1), pp. 99–106.
[17] (2015) U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241.
[18] (2019-07) Multi3Net: segmenting flooded buildings via fusion of multiresolution, multisensor, and multitemporal satellite imagery. Proceedings of the AAAI Conference on Artificial Intelligence 33, pp. 702–709.
[19] (2019) The need for scientific rigour and accountability in flood mapping to better support disaster response. Hydrological Processes 1 (5).
[20] (2012-10) Information extraction from remote sensing images for flood monitoring and damage evaluation. Proceedings of the IEEE 100 (10), pp. 2946–2970.
[21] (2017) Generalised dice overlap as a deep learning loss function for highly unbalanced segmentations. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, pp. 240–248.
[22] (2019) UNOSAT. http://floods.unosat.org/geoportal/catalog/main/home.page. Accessed: 2019-09-15.