Evaluating aleatoric and epistemic uncertainties of time series deep learning models for soil moisture predictions

by Kuai Fang et al.

Soil moisture is an important variable that influences floods, vegetation health, agricultural productivity, and land surface feedbacks to the atmosphere. Accurately modeling soil moisture has important implications for both weather and climate models. Recently available satellite-based observations give us a unique opportunity to build data-driven models to predict soil moisture instead of using land surface models, but previously there was no uncertainty estimate. We tested Monte Carlo dropout (MCD) with an aleatoric term for our long short-term memory models for this problem, and asked whether the uncertainty terms behave as they were argued to. We show that the method successfully captures the predictive error after tuning a hyperparameter on a representative training dataset. We also show that the MCD uncertainty estimate, as previously argued, does detect dissimilarity.




1 Background

Soil moisture (θ), quantified as the volumetric fraction of soil occupied by water, critically controls various environmental and ecosystem processes such as photosynthesis, evapotranspiration, runoff, soil respiration, flooding, and land-atmosphere interactions (Koster, 2004). For climate and weather modeling, soil moisture is typically provided by land surface models (LSMs). However, LSMs often introduce bias. For example, LSM-simulated soil moisture tends to be biased high in the arid western CONUS and biased low in the wetter eastern US (Yuan & Quiring, 2017), or biased low in wet seasons and biased high in dry seasons (Xia et al., 2015). Prevalent bias in LSM-simulated soil moisture can introduce large errors into downstream applications, including weather and climate modeling (Massey et al., 2016).

To reduce such bias, recent work introduced time series deep learning (DL) models that learn soil moisture dynamics directly from satellite-based observations. Satellites like the Soil Moisture Active Passive (SMAP) mission (Entekhabi et al., 2010) now provide near-real-time monitoring of surface soil moisture, and such data present an opportunity for data-driven models: it has become possible to replace certain parts of LSM functionality with machine learning predictions. Our previous work employed long short-term memory (LSTM) networks to predict SMAP soil moisture given meteorological forcings (precipitation, temperature, radiation, etc.) (Fang & Shen, 2017). That work showed that LSTM can extend SMAP to spatiotemporally seamless coverage of the continental US (CONUS) with high fidelity to SMAP. In addition, inter-annual trends of root-zone soil moisture were surprisingly well captured by LSTM even when the model was trained using only three years of data (Fang et al., 2018).

Despite such progress, few of these studies, or, to our knowledge, any studies utilizing LSTM in the field of hydrology, have addressed model uncertainty. Uncertainty is critical for understanding the limitations and data needs of a model (Pappenberger & Beven, 2006). In the context of data-driven modeling, uncertainty is often regarded as a combination of two elements: aleatoric and epistemic. Aleatoric uncertainty is due to inherent stochasticity in the system. Epistemic uncertainty stems from our lack of knowledge (insufficient or biased training data) of things that could, in principle, be known.

In this paper we test a recently-proposed uncertainty quantification (UQ) framework for DL models: Monte Carlo dropout (MCD) with an aleatoric uncertainty term (Kendall & Gal, 2017). It was argued that applying Monte Carlo dropout to a neural network is equivalent to performing variational Bayesian inference in a deep Gaussian process (Gal & Ghahramani, 2016). However, few studies have offered definitive evidence that MCD detects similarity in the input space. Because of the approximate derivations in those works, we must ask (i) whether this scheme is truly successful at predicting errors; and (ii) whether the terms behave as proposed, i.e., whether the epistemic term measures similarity to the training data as a Gaussian process does.

2 Method

MCD with an aleatoric uncertainty term simultaneously estimates aleatoric and epistemic uncertainty. The MCD part measures disagreement among the members of an ensemble of models generated by applying dropout (Srivastava et al., 2014). Gal & Ghahramani (2016) proposed that dropout training of deep networks approximates training a Gaussian process. Hence, we kept dropout on during prediction to create a random ensemble and used the variability of these predictions to quantify the epistemic uncertainty (σ_mc). The second part is an input-dependent heteroscedastic model of observational noise (Kendall & Gal, 2017), in which a second output unit (σ_x) is added to the deep network. With a specially-designed loss function, σ_x can represent an estimate of the variance of the network's prediction. Finally, we combined the two parts as σ_comb² = σ_mc² + σ_x². The MCD uncertainty estimates require calibration by adjusting a hyperparameter. Our model training period was the first year of data. To avoid over-tuning, we searched for a uniform dropout rate such that the uncertainty in the second year matched the magnitude of the actual error during that period. We report metrics from the test period (the third year).
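The two uncertainty terms and their combination can be sketched as follows (a minimal NumPy illustration with a toy feed-forward network standing in for the trained LSTM; the network, weights, and shapes here are assumptions for illustration, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy network with two output units: a mean prediction and a log-variance
# head (the aleatoric term). Random weights stand in for a trained model.
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 2)) * 0.1

def predict_with_dropout(x, p_drop=0.6):
    """One stochastic forward pass with dropout kept ON at prediction time."""
    h = np.tanh(x @ W1)
    mask = rng.random(h.shape) > p_drop        # Bernoulli dropout mask
    h = h * mask / (1.0 - p_drop)              # inverted-dropout rescaling
    mu, log_var = (h @ W2).T                   # mean and log aleatoric variance
    return mu, log_var

def heteroscedastic_loss(y, mu, log_var):
    """Kendall & Gal (2017)-style loss; minimizing it drives log_var
    toward the log of the squared prediction error."""
    return np.mean(0.5 * np.exp(-log_var) * (y - mu) ** 2 + 0.5 * log_var)

def mc_dropout_uncertainty(x, n_samples=100):
    """Epistemic sigma_mc from ensemble spread, aleatoric sigma_x from the
    variance head, combined as sigma_comb^2 = sigma_mc^2 + sigma_x^2."""
    mus, log_vars = zip(*(predict_with_dropout(x) for _ in range(n_samples)))
    mus = np.stack(mus)                        # (n_samples, n_points)
    sigma_mc = mus.std(axis=0)
    sigma_x = np.sqrt(np.exp(np.stack(log_vars)).mean(axis=0))
    sigma_comb = np.sqrt(sigma_mc**2 + sigma_x**2)
    return mus.mean(axis=0), sigma_mc, sigma_x, sigma_comb

x = rng.normal(size=(5, 8))                    # 5 input points, 8 features
mean, s_mc, s_x, s_comb = mc_dropout_uncertainty(x)
```

In practice the dropout rate (the calibrated hyperparameter) is adjusted so that the combined uncertainty matches the magnitude of the error on the validation year.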

3 Results and Discussion

Figure 1: Model error and uncertainty estimates for the temporal generalization test (train in one year, tune the dropout rate in a second, and test in a third) over CONUS. (a) unbiased root-mean-squared error (ubRMSE); (b) the combined uncertainty estimate σ_comb; (c) pixel-based comparison of ubRMSE vs. σ_comb; and (d) the calibration curve of the uncertainty estimates. A perfect uncertainty estimate is a straight one-to-one line.

To evaluate the overall quality of the uncertainty estimation, we trained an LSTM model to learn SMAP soil moisture dynamics for the entire CONUS using one year of data and ran a temporal test on another year. After tuning with the validation period, we chose a dropout rate of 0.6. As Figure 1a-b shows, the spatial pattern of the combined uncertainty estimate σ_comb agreed more or less with the predictive error (quantified by the unbiased root-mean-squared error, ubRMSE); both were larger in the East than in the West. The SMAP signal is adversely impacted by large vegetation water content (VWC) (O’Neill et al., 2016), and freezing conditions further reduce the amount of training data for surface soil moisture (Fang et al., 2018). As a result, the Northeastern and Northwestern forests (along the Rocky Mountains) had the highest ubRMSE. The lowest errors were found on the Great Plains and in the Southeast due to aridity and reduced forest cover. The predicted σ_comb automatically captured these spatial patterns. This good performance is also evident from the high correlation between ubRMSE and σ_comb (Figure 1c) and the nearly straight calibration curve (the green line in Figure 1d). These results suggest that, for temporal prolongation, it is possible to anticipate model predictive errors using σ_comb.
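A calibration curve of this kind can be produced by comparing expected and observed coverage of the predictive intervals under a Gaussian error assumption (a generic sketch on synthetic data; the thresholds and synthetic magnitudes are illustrative assumptions, not the paper's exact procedure):

```python
import math
import numpy as np

rng = np.random.default_rng(1)

def calibration_curve(y_true, y_pred, sigma, thresholds=np.linspace(0.2, 3.0, 15)):
    """Expected vs. observed coverage of Gaussian predictive intervals.

    For each z-threshold t, expected coverage is P(|N(0,1)| < t) = erf(t/sqrt(2));
    observed coverage is the fraction of normalized errors below t.
    A perfect uncertainty estimate yields a one-to-one line.
    """
    z = np.abs(y_true - y_pred) / sigma
    expected = np.array([math.erf(t / math.sqrt(2)) for t in thresholds])
    observed = np.array([np.mean(z < t) for t in thresholds])
    return expected, observed

# Synthetic check: simulate errors that are actually consistent with the
# predicted per-pixel sigma, so the curve should hug the one-to-one line.
sigma = rng.uniform(0.02, 0.08, size=5000)     # hypothetical per-pixel sigma
y_true = rng.normal(0.25, 0.05, size=5000)     # hypothetical soil moisture
y_pred = y_true + rng.normal(0.0, sigma)       # errors drawn with that sigma
expected, observed = calibration_curve(y_true, y_pred, sigma)
```

Deviations below the one-to-one line would indicate over-confident uncertainty estimates; deviations above it, under-confident ones.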

Do the two uncertainty terms behave as asserted, i.e., does the aleatoric term respond to stochasticity in the data and does the epistemic term respond to dissimilar cases? To answer this question, we examined the behavior of the MCD uncertainty estimates when models are trained on a small basin and tested on other regions.

Figure 2: Maps of the epistemic uncertainty estimate σ_mc when the LSTM model is trained in one of the HUC2 basins. The training region is highlighted by the red polygon.

We trained models on each of the 18 level-2 hydrologic cataloging unit (HUC2) basins dividing CONUS and show the epistemic uncertainty σ_mc when each model is tested in other regions. σ_mc is smallest inside the training region, somewhat larger in neighboring regions, and much larger farther away (Figure 2). This result provides clear and novel evidence that MCD does detect dissimilarity in the input space, which manifests as geographic proximity in this case. Note that our inputs do not include any attribute that directly represents location; the sense of proximity or similarity was therefore discovered by the network itself via land surface characteristics such as soil texture and climatology, which are spatially auto-correlated.

In summary, we evaluated the suitability of the MCD method with an aleatoric term for hydrologic datasets, using a SMAP-based soil moisture product as a test case. Our evaluation shows that the proposed scheme can be effective at predicting model error after tuning the dropout rate. Our results provide unique and strong evidence that variational sampling via Monte Carlo dropout acts as a dissimilarity detector. The aleatoric term was also found to be effective.


  • Entekhabi et al. (2010) Entekhabi, D., Njoku, E. G., O’Neill, P. E., Kellogg, K. H., Crow, W. T., Edelstein, W. N., Entin, J. K., Goodman, S. D., Jackson, T. J., Johnson, J., Kimball, J., Piepmeier, J. R., Koster, R. D., Martin, N., McDonald, K. C., Moghaddam, M., Moran, S., Reichle, R., Shi, J. C., Spencer, M. W., Thurman, S. W., Tsang, L., and Van Zyl, J. The soil moisture active passive (SMAP) mission. Proceedings of the IEEE, 98(5):704–716, 2010. ISSN 00189219. doi: 10.1109/JPROC.2010.2043918.
  • Fang & Shen (2017) Fang, K. and Shen, C. Full-flow-regime storage-streamflow correlation patterns provide insights into hydrologic functioning over the continental US. Water Resources Research, 2017. doi: 10.1002/2016WR020283.
  • Fang et al. (2018) Fang, K., Pan, M., and Shen, C. The Value of SMAP for Long-Term Soil Moisture Estimation With the Help of Deep Learning. IEEE Transactions on Geoscience and Remote Sensing, PP(Dl):1–13, 2018. ISSN 0196-2892. doi: 10.1109/TGRS.2018.2872131. URL https://ieeexplore.ieee.org/document/8497052/.
  • Gal & Ghahramani (2016) Gal, Y. and Ghahramani, Z. Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In Proceedings of The 33rd International Conference on Machine Learning, 2016.
  • Kendall & Gal (2017) Kendall, A. and Gal, Y. What uncertainties do we need in Bayesian deep learning for computer vision? In Advances in Neural Information Processing Systems (NIPS), 2017.
  • Koster (2004) Koster, R. D. Regions of Strong Coupling Between Soil Moisture and Precipitation. Science, 305(5687):1138–1140, aug 2004. ISSN 0036-8075. doi: 10.1126/science.1100217. URL http://www.sciencemag.org/cgi/doi/10.1126/science.1100217.
  • Massey et al. (2016) Massey, J. D., Steenburgh, W. J., Knievel, J. C., Cheng, W. Y. Y., Massey, J. D., Steenburgh, W. J., Knievel, J. C., and Cheng, W. Y. Y. Regional Soil Moisture Biases and Their Influence on WRF Model Temperature Forecasts over the Intermountain West. Weather and Forecasting, 31(1):197–216, feb 2016. ISSN 0882-8156. doi: 10.1175/WAF-D-15-0073.1. URL http://journals.ametsoc.org/doi/10.1175/WAF-D-15-0073.1.
  • O’Neill et al. (2016) O’Neill, P., Chan, S., Njoku, E., Jackson, T., and Bindlish, R. SMAP L3 Radiometer Global Daily 36 km EASE-Grid Soil Moisture, Version 4, 2016. URL https://nsidc.org/data/SPL3SMP/versions/4.
  • Pappenberger & Beven (2006) Pappenberger, F. and Beven, K. J. Ignorance is bliss: Or seven reasons not to use uncertainty analysis. Water Resources Research, 42(5):1–8, may 2006. ISSN 00431397. doi: 10.1029/2005WR004820. URL http://doi.wiley.com/10.1029/2005WR004820.
  • Srivastava et al. (2014) Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15:1929–1958, 2014.
  • Xia et al. (2015) Xia, Y., Ek, M. B., Wu, Y., Ford, T., Quiring, S. M., Xia, Y., Ek, M. B., Wu, Y., Ford, T., and Quiring, S. M. Comparison of NLDAS-2 Simulated and NASMD Observed Daily Soil Moisture. Part I: Comparison and Analysis. Journal of Hydrometeorology, 16(5):1962–1980, oct 2015. ISSN 1525-755X. doi: 10.1175/JHM-D-14-0096.1. URL http://journals.ametsoc.org/doi/10.1175/JHM-D-14-0096.1.
  • Yuan & Quiring (2017) Yuan, S. and Quiring, S. M. Evaluation of soil moisture in CMIP5 simulations over the contiguous United States using in situ and satellite observations. Hydrology and Earth System Sciences, 21(4):2203–2218, apr 2017. ISSN 1607-7938. doi: 10.5194/hess-21-2203-2017. URL https://www.hydrol-earth-syst-sci.net/21/2203/2017/.