1 Introduction
Floods are among the most common and most deadly natural disasters in the world, affecting hundreds of millions of people and causing thousands of fatalities annually (Doocy et al., 2013). Flood forecasting systems are an effective mitigation tool, reducing fatalities and economic costs by about a third (Malilay, 1997).
The spatial accuracy of alerts is vital both in allowing governments to plan effective mitigation and relief efforts, and in providing actionable information to individuals. Unfortunately, the vast majority of those exposed to flooding, and an even larger portion of fatalities, are located in regions that do not have access to spatially accurate forecasting systems (Hirpa et al., 2016).
This paper describes an operational system providing high-resolution, high-accuracy flood forecasts for riverine floods. The system covers an area of 11,600 km² along the Ganges, Brahmaputra, and Ghaghara rivers in India. Throughout the 2019 monsoon season (July–October), the system generated over 170 inundation forecast maps for multiple, continuously evolving severe floods. These maps were delivered in real time, both to government agencies (through dedicated communication channels) and to individuals (via Android notifications and other Google products).
At a high level, our system receives a measurement or forecast of river water level, and predicts the extent of the resulting flood. The system consists of the following components. First, we generate high-resolution elevation maps. These are critical for the accuracy of any inundation model, but producing these globally and cost-effectively is challenging. Second, we obtain real-time measurements and forecasts of river water levels, measured either using a stream gauge or a hydrological model. We currently rely on government agencies to obtain these data. Third, we perform hydraulic modeling of the river and floodplain, combining physics-based and ML-based solutions to obtain sufficient prediction accuracy. Finally, we disseminate the forecasts to the affected population. To simplify interpretation of the forecast, we generate a map containing three regions, representing three levels of risk: “Some risk,” “Higher risk,” and “Highest risk.”
The contents of this paper describe our current operational system, as well as concrete efforts we are actively working on. In the process of scaling our system, we will need to tackle new climates, ground conditions, and other obstacles that will require additional innovations. Nevertheless, we believe the concepts presented in this paper are first steps towards a truly planetscale system, which is required to address the thousands of unnecessary deaths floods cause every year.
2 Digital elevation model generation
Lack of a publicly accessible, global, high-resolution digital elevation model (DEM) is a critical limiting factor in flood modeling. (In this paper, digital surface models (DSMs) include ground cover such as trees and buildings; digital terrain models (DTMs) approximate “bare earth,” with ground cover removed; and DEM is an umbrella term that encompasses both.) The most widely used DEMs are NASA’s SRTM (Jet Propulsion Laboratory) and JAXA’s AW3D30 (Japan Aerospace Exploration Agency, 2019), both of which are freely available at 30 meter horizontal resolution. These DEMs have significant vertical error (Yamazaki et al., 2017) and are aging: the data for SRTM and AW3D30 was gathered in 2000 and during 2006–2011, respectively.
Despite their limitations, these DEMs are used by hydrologists because the commercial alternatives are prohibitively expensive. We have been unable to build accurate flood models using publicly accessible DEMs. Instead, we repurpose existing satellite images to construct high-resolution DEMs.
2.1 DEM generation challenges
Reconstructing DEMs from general-purpose satellite images is a particularly challenging variation of the photogrammetric bundle adjustment problem (Triggs et al., 2000), where the corpus is:

Multi-sensor: We fuse data from many satellites with different image resolutions, frequency response curves, types of optical distortion, and spatial errors.

Multi-temporal: Our images were captured over months or years. This makes registration challenging, as the images may differ substantially, e.g., due to seasonal variations. Furthermore, the topography itself changes over time (e.g., due to riverine erosion), and consequently older images must sometimes be discarded in favor of newer ones.

Ad hoc: To create DEMs, operators carefully plan the rotational motion of the satellite to obtain multiple “stereo” views of the target from different angles. In our case, the viewing directions are effectively random.
2.2 Camera model correction and DSM generation
Our system first gathers all satellite images that may intersect a given region. Then, in a coarse-to-fine manner, it simultaneously solves for surface elevations and adjustments to the images’ camera models. Due to the diversity of satellite inputs, we have found it more robust to use dense digital surface model (DSM) estimation combined with patch-based cross-correlation alignment than to use the more typical sparse 3D estimation from feature matching. Each correlated pair of patches provides a local translation. Combining many local translations in an iterative solver leads to general spatially-varying alignment fields. The cross-correlation process generates confidence values that allow for spatially-adaptive regularization of the alignment fields.
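As a minimal sketch of the local-translation step, two image patches can be registered by locating the peak of their FFT-based cross-correlation. The function name, patch sizes, and confidence measure below are illustrative assumptions; the production system further combines many such translations into spatially-varying alignment fields.

```python
import numpy as np

def patch_shift(ref: np.ndarray, tgt: np.ndarray):
    """Estimate the integer (dy, dx) translation such that rolling `tgt`
    by (dy, dx) best matches `ref`, via FFT-based cross-correlation."""
    # Zero-mean the patches so correlation responds to structure, not brightness.
    ref = ref - ref.mean()
    tgt = tgt - tgt.mean()
    # Circular cross-correlation via the convolution theorem.
    corr = np.fft.ifft2(np.fft.fft2(ref) * np.conj(np.fft.fft2(tgt))).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peak indices to signed shifts (circular correlation wraps around).
    shifts = tuple(p if p <= s // 2 else p - s for p, s in zip(peak, corr.shape))
    # Peak height relative to total correlation energy serves as a confidence value.
    conf = corr.max() / (np.abs(corr).sum() + 1e-12)
    return shifts, conf
```

In the full pipeline, such per-patch translations and confidences would feed an iterative solver that produces smooth, spatially-adaptive alignment fields.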
We then hold the camera models fixed and use standard multi-view stereo methods to estimate a depth map for each image. To create the DSM, we merge the depth maps by determining, at each horizontal location, the elevation yielding the greatest agreement. Of course, some depth maps may disagree due to changes over time, clouds, etc. We use a graph-cut solver to find the best consensus elevation. Both the stereo and depth map merging steps are performed in a coarse-to-fine manner for computational efficiency.
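The consensus idea can be sketched per cell: each candidate elevation “votes” for the candidates it agrees with, and the best-supported one wins. This is a purely local stand-in for the graph-cut solver (which additionally enforces spatial smoothness); the function name and tolerance are assumptions.

```python
import numpy as np

def merge_depth_maps(depths: np.ndarray, tol: float = 0.5) -> np.ndarray:
    """Per-cell consensus elevation from a stack of candidate depth maps.

    depths: (k, h, w) stack of per-image elevation estimates (NaN = no data).
    For each cell, picks the candidate that agrees (within `tol`) with the
    largest number of candidates, including itself.
    """
    # Pairwise agreement: candidate i supports candidate j if |d_i - d_j| <= tol.
    diffs = np.abs(depths[:, None] - depths[None, :])   # (k, k, h, w)
    support = np.nansum(diffs <= tol, axis=1)           # (k, h, w) votes per candidate
    support[np.isnan(depths)] = 0                       # missing data gets no votes
    best = np.argmax(support, axis=0)                   # (h, w) winning candidate index
    return np.take_along_axis(depths, best[None], axis=0)[0]
```

A graph-cut formulation replaces the per-cell argmax with a global optimization that also penalizes elevation jumps between neighboring cells.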
2.3 DTM generation
Since water flows around trees, digital surface models (DSMs) can distort hydraulic simulations. For example, a row of trees planted as a windbreak can, in a simulation, erroneously behave like a dam or levee. One way to create a bare earth digital terrain model (DTM) is to start with a DSM (plus projected images) and train a convolutional neural network or other ML model to identify where the DSM and DTM differ. Trees and buildings may then be removed and filled with a smoothly interpolated estimate of the ground. Of course, in the case of large, continuous forests, additional information, such as canopy-penetrating radar or tree-height surveys, would be needed to accurately pick the right ground elevation. The ideal solution for flood modeling is a hybrid between a DSM and DTM that excludes trees but retains the man-made structures that can affect water flow.
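The fill step (not the ML identification of where DSM and DTM differ) can be sketched as a simple Laplace interpolation: masked cells relax toward the mean of their neighbors while known ground stays fixed. The function name and iteration count are illustrative assumptions.

```python
import numpy as np

def fill_masked(dsm: np.ndarray, mask: np.ndarray, iters: int = 500) -> np.ndarray:
    """Replace masked DSM cells (e.g., trees, buildings) with a smooth
    estimate of the ground via iterative Laplace relaxation."""
    # Initialize masked cells with the mean of the unmasked terrain.
    ground_mean = np.nanmean(np.where(mask, np.nan, dsm))
    z = np.where(mask, ground_mean, dsm).astype(float)
    for _ in range(iters):
        # Average of the 4 neighbors (np.roll wraps at borders; fine away from edges).
        nb = (np.roll(z, 1, 0) + np.roll(z, -1, 0) +
              np.roll(z, 1, 1) + np.roll(z, -1, 1)) / 4.0
        z[mask] = nb[mask]   # only masked cells move; known ground is fixed
    return z
```

Production systems would use a proper boundary-aware solver, but the principle, interpolating the ground surface underneath removed cover, is the same.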
2.4 River segmentation
Because images cannot be correlated and aligned over water, we segment river pixels and replace the river DEM with a constant-slope riverbed. Manually creating this riverbed mask for large areas is time-consuming and expensive, and we are working to automate this using ML-based segmentation. To create training data for this task, we have trained human annotators to mark out regions in a satellite image corresponding to riverbed and clouds. The images are labeled at a resolution of per pixel. We currently have labeled pixels, which are used for training and validation. The DeepLabv3+ model (Chen et al., 2018) is trained to segment the riverbed and clouds using this data. An Xception network (Chollet, 2017) pre-trained on the COCO dataset (Lin et al., 2014) is used as the encoder backbone of the model. Our early qualitative results look promising (see Figure 1), and we are working towards quantifying these results before using this system in production.
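The constant-slope riverbed substitution mentioned above can be sketched as follows. The function name, the row-aligned flow direction, and the parameters are illustrative assumptions; a real river requires a bed aligned with the channel centerline.

```python
import numpy as np

def replace_riverbed(dem: np.ndarray, mask: np.ndarray,
                     top_elev: float, slope: float,
                     cell_size: float = 10.0) -> np.ndarray:
    """Replace masked river pixels with a constant-slope riverbed whose
    elevation decreases linearly with the downstream (row) coordinate."""
    rows = np.arange(dem.shape[0])[:, None]       # downstream distance in cells
    bed = top_elev - slope * rows * cell_size     # planar bed, one value per row
    # Keep the original DEM everywhere outside the river mask.
    return np.where(mask, bed, dem)
```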
3 Inundation forecasting
3.1 Hydraulic modeling
Hydraulic models simulate the flow of water in a given topography. When properly calibrated, hydraulic models are capable of accurately predicting inundation extent and water depth. The classical approach is to time-integrate the Saint-Venant shallow-water partial differential equations (PDEs). Ignoring the negligible advection term, the PDEs are given by
$$\frac{\partial h}{\partial t} + \nabla \cdot \mathbf{q} = 0, \qquad \frac{\partial \mathbf{q}}{\partial t} + g\,h\,\nabla(h+z) + \frac{g\,n^2\,\lVert\mathbf{q}\rVert\,\mathbf{q}}{h^{7/3}} = 0, \quad (1)$$

(Vreugdenhil, 1994; de Almeida et al., 2012) where $\mathbf{q}$ is the flux, $h$ is the water height, $z$ is the surface elevation (topography), $g$ is the gravitational acceleration constant, and $n$ is the Manning friction coefficient. The first equation in (1) represents conservation of mass: when water flows into a cell, the water height rises. The second equation represents Newton’s second law: water accelerates (its flux increases) due to gravity when there are differences in water height, and decelerates due to friction.
Our implementation is based on the numerical scheme of de Almeida et al. (2012) for solving the inertial form of the Saint-Venant equations. The region of interest is represented as a 2D grid with cell sizes around 10 m.
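The ingredients of such a scheme can be sketched in 1D: a gravity-driven flux update with semi-implicit friction, followed by a mass-conservation update of the water heights. This is a simplified illustration, not the full de Almeida et al. (2012) discretization (which adds a weighted flux average and other stabilization terms); names and the closed-boundary treatment are assumptions.

```python
import numpy as np

def shallow_water_step(h, z, q, dx, dt, n=0.03, g=9.81):
    """One explicit 1D step of the local inertial equations (1).
    `h`, `z` live on k cells; `q` lives on the k-1 interfaces between them."""
    eta = h + z                               # water surface elevation
    h_face = np.maximum(h[1:], h[:-1])        # flow depth at interfaces
    grad = (eta[1:] - eta[:-1]) / dx          # surface slope
    # Momentum: gravity term explicit, friction term semi-implicit for stability.
    num = q - g * h_face * grad * dt
    den = 1.0 + g * n**2 * np.abs(q) * dt / np.maximum(h_face, 1e-6) ** (7 / 3)
    q_new = np.where(h_face > 1e-6, num / den, 0.0)
    # Mass conservation: net interface flux changes cell water height.
    dq = np.diff(np.concatenate([[0.0], q_new, [0.0]]))  # closed boundaries
    h_new = h - dt * dq / dx
    return h_new, q_new
```

Starting from a step in water level, each step moves water downhill while conserving total volume.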
To integrate (1), one must provide boundary conditions, including the incoming discharge. Unfortunately, discharge is difficult to measure directly, and such measurements are not available in real time for most regions in India. Instead, we rely only on river water level measurements and forecasts, which are more commonly available.
To utilize river water level forecasts, we assume that the river level changes slowly, so that the flood extent at any given moment is well approximated by a steady-state simulation with constant influx. This is a good approximation for large rivers, which are the current focus of our flood warning system: in such rivers, the water level changes relatively slowly due to the large amounts of water involved. We select numerous different boundary conditions and simulate each until it approaches a steady state. The inundation extent at steady state is recorded and indexed by the simulated water level at the stream gauge location. Upon receiving a river gauge forecast $w$, one may retrieve the simulation whose water level at the gauge location most closely matches $w$, thus obtaining a flood extent forecast while circumventing the need for a discharge measurement. (In effect, this procedure incorporates the estimation of a rating curve, the function relating water level to discharge, into the simulation.)
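The retrieval step amounts to a nearest-neighbor lookup in the precomputed table of steady-state extents; a sketch, with class and method names as illustrative assumptions:

```python
import bisect

class FloodLookup:
    """Index precomputed steady-state flood extents by their simulated
    water level at the gauge, and retrieve the closest match for a forecast."""

    def __init__(self, entries):
        # entries: list of (gauge_level_m, extent) pairs, one per simulation.
        self.entries = sorted(entries)
        self.levels = [lvl for lvl, _ in self.entries]

    def forecast_extent(self, w):
        """Return the extent whose simulated gauge level is closest to `w`."""
        i = bisect.bisect_left(self.levels, w)
        # The closest level is one of the two neighbors of the insertion point.
        cands = [j for j in (i - 1, i) if 0 <= j < len(self.levels)]
        best = min(cands, key=lambda j: abs(self.levels[j] - w))
        return self.entries[best][1]
```

This is what replaces the discharge boundary condition at forecast time: the table, not the forecast, carries the discharge information.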
To account for measurement uncertainty and forecasting errors, we add normally distributed random noise to the forecast water level. This induces a probability distribution over simulated flood extents, which can be interpreted as an inundation probability value for each grid cell. These probabilities are discretized to three levels and rendered to users as three regions, respectively marked Some Risk, Higher Risk, and Highest Risk.
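A Monte Carlo sketch of this procedure: perturb the forecast level with Gaussian noise, look up each perturbed extent, and average into a per-cell wet probability. The probability thresholds for the three levels are illustrative assumptions, not the production values.

```python
import numpy as np

def risk_map(extent_for_level, w, sigma=0.3, n_samples=200, seed=0):
    """Per-cell inundation probability from noisy forecasts, discretized
    into three risk levels. `extent_for_level` maps a water level to a
    boolean wet/dry grid (e.g., a steady-state simulation lookup)."""
    rng = np.random.default_rng(seed)
    samples = [extent_for_level(w + rng.normal(0.0, sigma))
               for _ in range(n_samples)]
    p = np.mean(samples, axis=0)   # per-cell wet probability
    # Discretize: >=95% highest risk, >=50% higher, >=5% some (illustrative cutoffs).
    return np.select([p >= 0.95, p >= 0.5, p >= 0.05],
                     ["highest", "higher", "some"], default="dry")
```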
3.2 Computational complexity
A major challenge in hydraulic modeling is computational complexity. We are implementing several techniques to address this challenge. One approach is parallelization of the PDE solver through domain decomposition (Smith, 1997). But even on multiple CPU cores, processing time quickly becomes prohibitive as the resolution and coverage area increase.
In recent years, a new class of AI accelerators has emerged, designed primarily for large-scale deep learning. Google’s Cloud Tensor Processing Unit (TPU) (Jouppi et al., 2017) is one such example. The Cloud TPU v3 offers 420 teraFLOPS and 128 GB of high-bandwidth memory (HBM), and multiple devices can be connected through a high-speed toroidal mesh network into a “pod” that achieves 100+ petaFLOPS and 32 TB of HBM. Despite having originally been designed specifically for deep learning, TPUs have successfully been applied to non-ML tasks that conventionally require specialized high-performance computing resources (Yang et al., 2019). TPU pods constitute a class of interconnected accelerators particularly well-suited for scaling to large-scale, high-resolution grids.
To utilize these capabilities, we designed a TensorFlow PDE solver which facilitates programming distributed algorithms on TPUs in easy-to-express Python code. The finite-difference computation is distributed across multiple cores following the SIMD paradigm to perform domain decomposition (Smith, 1997). The finite-difference computation is efficient because the finite-difference kernel is expressed as a 1D convolution, which is computed using TPU matrix multiplier unit (MXU) passes. The current implementation achieves an 85x speedup on a single TPU core relative to our reference single-CPU-core implementation. We are also exploring the potential of blending ML algorithms into traditional numerical models. For example, recent work showed that modeling features with neural networks can allow for solving wave-like PDEs on much coarser grids than is possible with standard numerical methods (Bar-Sinai et al., 2019).
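The stencil-as-convolution trick can be illustrated compactly; the sketch below uses NumPy rather than the TensorFlow/TPU implementation, and the central-difference kernel is just one example stencil.

```python
import numpy as np

def gradient_via_conv(f: np.ndarray, dx: float) -> np.ndarray:
    """Central-difference df/dx expressed as a 1D convolution, the same
    formulation that maps finite-difference stencils onto matrix-multiply
    hardware (interior points only)."""
    # Note: np.convolve flips the kernel, so [1, 0, -1] yields (f[i+1]-f[i-1])/(2*dx).
    kernel = np.array([1.0, 0.0, -1.0]) / (2.0 * dx)
    return np.convolve(f, kernel, mode="valid")
```

Expressing the stencil this way lets the same code path run on CPUs, GPUs, or TPU MXU passes without hand-written loops.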
3.3 Predictive inundation model
ML can complement the hydraulic simulation approach described above, by training a model on past flood events to directly infer the inundation extent from water level measurements. To this end, we used satellite imagery, specifically the synthetic aperture radar ground range detected (SAR GRD) data from the Sentinel-1 satellite, from which we determine the inundation extent at known time points (Schumann and Moller, 2015). For any given region, a SAR image is available once every several days. We matched this data with historical water level measurements, and restricted attention to measurements exceeding a predefined flood warning level. This resulted in a total of 10–40 SAR measurements per location, based on data available since 2014, the year Sentinel-1 became operational.
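One simple way to exploit such paired (water level, wet/dry) observations, in the spirit of the per-pixel model described next, is a threshold rule fitted to meet a recall target. The function name and the scan-based fitting procedure are illustrative assumptions.

```python
def fit_threshold(levels, wet_labels, target_recall):
    """For one pixel, pick the water-level threshold theta from historical
    (level, wet) pairs: predict wet iff level >= theta. Returns the highest
    theta whose recall still meets `target_recall` (recall-oriented;
    a precision-oriented variant would flip the criterion)."""
    actual_wet = sum(wet_labels)
    for theta in sorted(set(levels), reverse=True):
        preds = [lvl >= theta for lvl in levels]
        tp = sum(p and w for p, w in zip(preds, wet_labels))
        recall = tp / actual_wet if actual_wet else 1.0
        if recall >= target_recall:
            return theta   # highest threshold meeting the recall target
    return min(levels)
```

Running this once with a high recall target and once with a high precision target yields the two thresholds behind the Some Risk and Highest Risk regions.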
Using this data, we trained a model to predict, for each pixel, whether it would be wet or dry given the value of the nearest water level measurement. For simplicity, we assumed that any given pixel will be wet for measurement values above some threshold $\theta$, and dry otherwise. The model is thus a binary classifier per pixel, and a threshold may be chosen from the historical training data to balance precision and recall. We chose two separate thresholds: a recall-oriented threshold generating a Some Risk region, and a precision-oriented threshold yielding a Highest Risk region. These predictions are combined with the hydraulic simulation, as described in Section 3.4.

3.4 Deployment in an operational setting
To deploy the warning system in a new region, several sources of data are required. First, satellite imagery is used to generate a high-resolution DEM (Section 2). Next, we obtain real-time hydrological forecasts for a stream gauge in the given site: our current system receives water level forecasts from the Indian Central Water Commission. Finally, we collect observations of past flood extents from processed satellite SAR imagery (Schumann and Moller, 2015).
This data allows the construction of the hydraulic model (Section 3.1) and the predictive inundation model (Section 3.3). As these models tend to complement each other, they were combined as follows: the final Some Risk region was defined as the union of the Some Risk regions of the two models, and the Highest Risk region was defined as the intersection of the Highest Risk regions of the two models. This tends to improve the performance relative to each independent model, without significantly changing the areas of the various risk regions.
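The combination rule is a pair of boolean operations on aligned risk rasters; a sketch, with names as assumptions:

```python
import numpy as np

def combine(sim_some, sim_highest, ml_some, ml_highest):
    """Combine the two models' risk regions (boolean grids): union of the
    recall-oriented Some Risk regions, intersection of the
    precision-oriented Highest Risk regions."""
    return sim_some | ml_some, sim_highest & ml_highest
```

The union can only raise recall of the Some Risk region, and the intersection can only raise precision of the Highest Risk region, which matches each region's objective.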
In real time, upon receiving a river level measurement, we construct the corresponding risk map, and provide it to individuals and government agencies. To maximize penetration, individuals can view such warning maps through a number of Google products, including Android notifications to people in affected areas, Google Search results, and Google Maps.
4 Metrics and results
To measure the end-to-end accuracy of our system, we compare our forecast maps to the ground truth inundation maps for historical flood events, which are based on SAR imagery (see Section 3.3). The forecast “Some risk” region is geared toward recall, while the “Highest risk” region is geared toward precision. To measure our accuracy in a manner that optimizes for this asymmetry, we use the following metrics:
Some risk recall (SRR): This is defined as the number of pixels correctly detected as wet in the “Some risk” region, divided by the total number of wet pixels in the ground truth.
Highest risk precision (HRP): This is defined as the number of pixels correctly detected as wet in the “Highest risk” region, divided by the total number of pixels predicted as wet in the “Highest risk” region.
Risk area ratio (RAR): This is defined as the ratio between the area of the “Some risk” region and that of the “Highest risk” region. This metric indicates the degree of uncertainty expressed by the difference between the two regions.
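These three metrics reduce to simple counts over boolean grids; a sketch, assuming aligned rasters of ground truth and forecast regions (names are illustrative):

```python
import numpy as np

def evaluate(truth_wet, some_risk, highest_risk):
    """SRR, HRP, RAR from boolean grids of ground-truth wet pixels and the
    forecast Some/Highest risk regions."""
    srr = (truth_wet & some_risk).sum() / truth_wet.sum()        # recall of Some risk
    hrp = (truth_wet & highest_risk).sum() / highest_risk.sum()  # precision of Highest risk
    rar = some_risk.sum() / highest_risk.sum()                   # area ratio
    return srr, hrp, rar
```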
Over the course of the 2019 monsoon season, our system achieved an average of 98.6% SRR, 79.9% HRP, and 2.81 RAR. Two typical forecast maps, both comparing the “Highest risk” regions with ground truth, are shown in Figure 2.
Acknowledgments
This paper describes a large system which is the result of joint work of many people. We would like to particularly thank the following people for their contributions: Jamie Adams, Brett Allen, John Anderson, Yi-Fan Chen, Malo Denielou, Mark Duchaineau, Ran El-Yaniv, Anton Geraschenko, Pete Giencke, Yotam Gigi, Aleksey Golovinskiy, Avinatan Hassidim, Zhuoliang Kang, Adi Mano, Yossi Matias, Paul Merrill, Damien Pierce, Slava Salasin, Guy Shalev, Rhett Stucki, Ajai Tirumali, and Oleg Zlydenko.
References

Bar-Sinai et al. (2019). Learning data-driven discretizations for partial differential equations. Proc. Nat. Acad. Sci. 116 (31), pp. 15344–15349.

Chen et al. (2018). Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proc. European Conf. Comp. Vision (ECCV), pp. 801–818.

Chollet (2017). Xception: deep learning with depthwise separable convolutions. In Proc. IEEE Conf. Comp. Vision and Pattern Recog. (CVPR), pp. 1251–1258.

de Almeida et al. (2012). Improving the stability of a simple formulation of the shallow water equations for 2D flood modeling. Water Resources Research 48 (5).

Doocy et al. (2013). The human impact of earthquakes: a historical review of events 1980–2009 and systematic literature review. PLoS Currents 5.

Hirpa et al. (2016). Saving lives: ensemble-based early warnings in developing nations. In Handbook of Hydrometeorological Ensemble Forecasting, pp. 1–22.

Japan Aerospace Exploration Agency (2019). ALOS global digital surface model (DSM), ALOS World 3D-30m (AW3D30) Version 2.2 product description. https://www.eorc.jaxa.jp/ALOS/en/aw3d30/aw3d30v22_product_e.pdf [Online; accessed 11 September 2019].

Jouppi et al. (2017). In-datacenter performance analysis of a tensor processing unit. In ACM/IEEE 44th Ann. Int. Symp. on Comp. Architecture (ISCA), pp. 1–12.

Jet Propulsion Laboratory. Shuttle Radar Topography Mission. https://www2.jpl.nasa.gov/srtm/ [Online; accessed 11 September 2019].

Lin et al. (2014). Microsoft COCO: common objects in context. In European Conf. Comp. Vision (ECCV), D. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars (Eds.), pp. 740–755.

Malilay (1997). Floods. In Public Health Consequences of Disasters.

Schumann and Moller (2015). Microwave remote sensing of flood inundation. Physics and Chemistry of the Earth 83, pp. 84–95.

Smith (1997). Domain decomposition methods for partial differential equations. In Parallel Numerical Algorithms, pp. 225–243.

Triggs et al. (2000). Bundle adjustment: a modern synthesis. In Vision Algorithms: Theory and Practice, B. Triggs, A. Zisserman, and R. Szeliski (Eds.), pp. 298–372.

Vreugdenhil (1994). Numerical methods for shallow-water flow. Springer.

Yamazaki et al. (2017). A high-accuracy map of global terrain elevations. Geophysical Research Letters 44 (11), pp. 5844–5853.

Yang et al. (2019). High performance Monte Carlo simulation of Ising model on TPU clusters. In Proc. 2019 ACM/IEEE Int. Conf. for High Performance Computing, Networking, Storage, and Analysis (SC’19).