1 Introduction
Soil moisture is a key variable that controls various hydrologic processes, including infiltration, evapotranspiration and subsurface flow. It is of central importance to drought monitoring (Narasimhan and Srinivasan, 2005), flood prediction (Norbiato et al., 2008), weather forecasting (Koster, 2004), irrigation planning and many other scientifically and socially important applications. Launched in 2015, NASA’s Soil Moisture Active Passive (SMAP) satellite mission (Entekhabi et al., 2010) is designed to measure top 5 cm soil moisture globally with a standard deviation of $0.04\ \mathrm{cm^3/cm^3}$ volumetric ratio when vegetation water content (VWC) $\leq 5\ \mathrm{kg/m^2}$ (O’Neill et al., 2012). It achieved this goal in most core evaluation sites (Colliander et al., 2017; Jackson et al., 2016). Notwithstanding its great value, SMAP passive radiometer-based observations only have a short time span (since April 2015) with an irregular revisit time of 2-3 days, which makes it difficult to observe soil moisture responses immediately after storms or snowmelt.
Compared to SMAP’s limited resolution and time span, land surface models (LSMs), e.g., VIC (Nijssen et al., 2001), Noah (Ek et al., 2003), CLSM (Koster et al., 2000) and MOS (Koster and Suarez, 1994), simulate soil moisture seamlessly over much longer time spans. Despite their frequent use, these models may differ significantly from observations (Leeper et al., 2017; Yuan and Quiring, 2017; Dirmeyer et al., 2016; Xia et al., 2015). Biases (mean difference from the observed) are notable in all models evaluated in Xia et al. (2015). Their errors generally vary by region, model, season, and soil depth, yet there are systematic patterns in them. For example, moisture tends to be over-estimated in the arid western CONUS and under-estimated in the wetter eastern US (Yuan and Quiring, 2017); the Noah model tends to under-estimate moisture in wet seasons and over-estimate it in dry seasons (Xia et al., 2015). These systematic error patterns could be exploited to improve predictions.
To correct systematic model errors, we turn to deep learning (DL), a rebranding of artificial neural networks. DL has made revolutionary strides in recent years and helped to solve problems that have resisted artificial intelligence for decades (LeCun et al., 2015). With earlier-generation machine learning methods, human experts extract features from data that are strongly correlated with dependent variables. DL, on the other hand, automatically extracts abstract features through its hidden layers. Two highly successful network structures are convolutional neural networks (CNN) for image-domain tasks (Krizhevsky et al., 2012), and Long Short-Term Memory (LSTM) (Hochreiter et al., 1997; Greff et al., 2015) for time-domain tasks, although the separation is not absolute. No study, to the best of the authors’ knowledge, has employed time series deep learning in hydrology. Given DL’s success in other scientific disciplines (Voosen, 2017), it is plausible that DL can capture model error patterns that humans have yet to formulate explicitly.
The parameter space of deep networks is very large in order to provide the flexibility to map diverse, complex functions. Thus one might be concerned about overfitting, in which coefficients are fitted to noise rather than meaningful information. However, there are recent breakthroughs in regularization techniques, e.g., Dropout (Srivastava et al., 2014), which penalize overfitting and reduce mutually dependent coefficients. Nevertheless, since LSTM has not been applied to hydrology, it is important to examine its robustness compared to conventional statistical methods.
The central hypothesis of this work is that with two years of SMAP data, LSTM can learn patterns in soil moisture dynamics and LSM errors and, by utilizing them, can prolong SMAP data over long time spans. Our objectives are: (1) to produce a seamless top-surface soil moisture dataset for the continental United States (CONUS) with high fidelity to SMAP data; (2) to provide an initial investigation of LSTM’s capability in correcting process-based model errors; (3) to compare LSTM’s generalization capability to conventional methods in spatial and temporal extrapolation tests. Here, by “high fidelity”, we mean a high consistency with the target data resulting in its faithful reproduction. SMAP’s retrieval algorithm for the passive product derives soil moisture from brightness temperature readings using radiative transfer and soil dielectric models (O’Neill et al., 2012), thus it also incurs biases (Colliander et al., 2017). Nevertheless, a high-fidelity hindcast product has a wide range of applications, e.g., data mining of past fire hazards, calibration of hydrologic models, or benchmarking satellite products with historical in-situ data.
2 Methods and Datasets
As an overview, we trained an LSTM network to predict the SMAP L3 product using, as inputs, atmospheric forcing time series, LSM-simulated surface soil moisture and static physiographic attributes. We compared LSTM to regularized multiple linear regression, auto-regressive models, and a simple one-layer feedforward neural network. Their performance was tested by (i) a temporal generalization test: training over one year and testing over another; (ii) a regular spatial generalization test: training over a uniformly down-selected subset of SMAP pixels and testing over other cells; and (iii) a regional holdout test: training on some sub-regions of CONUS and testing on the rest. All data sources are aggregated to a daily time scale and interpolated to the SMAP L3 grid. Each SMAP pixel is an instance.
2.1 Data sources and inputs
For the learning target, we focus on the L3 passive radiometer product (L3_SM_P), which combines the swaths available in each day. The spatial resolution of L3_SM_P is 36 km. For inputs, we obtained atmospheric forcing data including precipitation, temperature, radiation, humidity and wind speed from the North-American Land Data Assimilation System phase II (NLDAS-2) (Xia et al., 2015). NLDAS-2 also provides, from 1979 to present, simulations of land surface states and fluxes by several LSMs. We chose Noah’s outputs (Ek et al., 2003), and also compared with MOS’s, because Noah ranks in the middle among models (Xia et al., 2015) and is not as extensively calibrated as some other models, e.g., SAC. Our work does not require the best LSM, as we can observe how LSTM and other methods correct LSM errors. Noah has 4 soil layers with depths of 0-10, 10-40, 40-100 and 100-200 cm, respectively. To match the 0-5 cm sensed by SMAP, we tested: (i) directly using 0-10 cm data; (ii) polynomial interpolation; and (iii) integral interpolation, where we find polynomials whose integrals agree with Noah-simulated values.
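Option (iii) can be illustrated with a small sketch. It assumes a cubic polynomial in depth, fitted to all four Noah layers so that its layer means match the simulated values, and then averages that polynomial over 0-5 cm. The polynomial degree, the use of all four layers, the function names and the sample profile are assumptions made for illustration; they are not necessarily the configuration used in this work.

```python
def layer_mean_of_power(a, b, k):
    """Mean of z**k over the depth interval [a, b]."""
    return (b ** (k + 1) - a ** (k + 1)) / ((k + 1) * (b - a))


def solve_linear(A, y):
    """Solve A c = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(M[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (M[i][n] - known) / M[i][i]
    return coeffs


# Noah layer bounds in metres (0-10, 10-40, 40-100, 100-200 cm).
LAYERS = [(0.0, 0.1), (0.1, 0.4), (0.4, 1.0), (1.0, 2.0)]


def integral_interpolate(layer_values, top=0.0, bottom=0.05):
    """Mean moisture over [top, bottom] from a polynomial matching all layer means."""
    n = len(LAYERS)
    # Row j holds the layer-j mean of each monomial 1, z, z^2, z^3.
    A = [[layer_mean_of_power(a, b, k) for k in range(n)] for a, b in LAYERS]
    coeffs = solve_linear(A, layer_values)
    return sum(c * layer_mean_of_power(top, bottom, k) for k, c in enumerate(coeffs))


noah_profile = [0.20, 0.25, 0.28, 0.30]  # made-up layer means, not real Noah output
theta_0_5 = integral_interpolate(noah_profile)
```

Because the polynomial is constrained by layer means rather than point values, a vertically uniform profile is returned unchanged, which is a convenient sanity check for this kind of scheme.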
Static physiographic attributes (Table S1 in SI) include sand, silt and clay percentages, bulk density and soil capacity from ISRIC-WISE (Batjes, 1995). County-level annual-average irrigation data for 2010 (USGS, 2016) were overlaid with land use data to assign irrigation in each county to agricultural land uses. The values were then aggregated to the SMAP grid. Also among the inputs are SMAP product flags that indicate mountainous terrain, land cover classes, VWC, urban area, water body fraction and data quality (time-averaged). SMAP product flags indicate lower data quality in densely vegetated or forested areas. However, instead of removing all regions labeled as low-quality, we hypothesize that including the flags as inputs allows LSTM to implicitly assign less focus to high-uncertainty regions.
2.2 LSTM setup
As a type of Recurrent Neural Network (RNN), LSTM makes use of sequential information by updating states based on both inputs of the current time step ($x^{t}$) and network states of previous time steps, as illustrated in Figure S1 in Supporting Information (SI). Following the notation in Lipton et al. (2015), we can write an LSTM as $\mathrm{LSTM}: (x^{t}, h^{t-1}, s^{t-1}) \rightarrow (h^{t}, s^{t})$. The update formulas are:
$$
\begin{aligned}
\text{(input node)} \quad & g^{t} = \tanh\left(W_{gx} x^{t} + W_{gh} h^{t-1} + b_{g}\right) & (1) \\
\text{(input gate)} \quad & i^{t} = \sigma\left(W_{ix} x^{t} + W_{ih} h^{t-1} + b_{i}\right) & (2) \\
\text{(forget gate)} \quad & f^{t} = \sigma\left(W_{fx} x^{t} + W_{fh} h^{t-1} + b_{f}\right) & (3) \\
\text{(output gate)} \quad & o^{t} = \sigma\left(W_{ox} x^{t} + W_{oh} h^{t-1} + b_{o}\right) & (4) \\
\text{(cell state)} \quad & s^{t} = g^{t} \odot i^{t} + s^{t-1} \odot f^{t} & (5) \\
\text{(hidden gate)} \quad & h^{t} = \tanh\left(s^{t}\right) \odot o^{t} & (6) \\
\text{(output layer)} \quad & y^{t} = W_{hy} h^{t} + b_{y} & (7)
\end{aligned}
$$
where $\sigma$ is the sigmoidal function, $\odot$ is element-wise multiplication, $x^{t}$ is the input vector (forcings and static attributes) for the time step $t$, $W$’s are the network weights, $b$’s are bias parameters, $y^{t}$ is the output to be compared to observations, $h$ is the hidden states, and $s$ is called the cell states of memory cells, which is unique to LSTM. Readers are referred to the literature for the detailed functionality of these units. Summarized briefly, $i$, $f$ and $o$ control, respectively, when the input is significant enough to use, how long the past states should be remembered for, and how much the value in memory is used to compute the output. During training, $W$’s and $b$’s are adjusted using back-propagation through time (BPTT). In BPTT, the network is first unrolled over a prescribed length before the difference between the output and target propagates into the network. We used the LSTM implemented in the Recurrent Neural Network library for Torch7 (Léonard et al., 2015), which is a scientific computing framework for the programming language Lua. We employed Dropout regularization, which randomly sets a fraction (the dropout rate) of its operand to zero. Dropout prevents the co-adaptation of neurons and thus reduces overfitting. We applied dropout to non-recurrent links as in Zaremba et al. (2015) and a constant dropout mask to recurrent connections as in Gal and Ghahramani (2015). We also implemented dropout for the memory cell as described in Semeniuta et al. (2016).
At each time step, the network outputs one scalar value ($y^{t}$), which is compared to the SMAP L3 passive product. The loss function to be minimized is the mean-squared error calculated for the time series:
$$
L = \frac{1}{T} \sum_{t=1}^{T} m^{t} \left(y^{t} - y_{*}^{t}\right)^{2} \qquad (8)
$$
where $m^{t}$ is 1 when time step $t$ has a SMAP observation and is 0 otherwise, $T$ is the length of the time series, and $y_{*}^{t}$ is the SMAP observation. For computational efficiency and stability reasons, the training is done through “mini-batches”: for each batch, a number of instances, or SMAP pixels, are randomly collected from the training set. The loss function is then averaged over all the instances in a batch.
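To make the update rules and the masked loss concrete, the following is a minimal pure-Python sketch of one LSTM step and a loss that skips days without SMAP data. It is not the Torch7 implementation used in this work: the tanh activations follow the standard LSTM form, dropout and training are omitted, the normalization of the loss by the series length is an assumption, and all function names, sizes and sample values are hypothetical.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W_x, x, W_h, h, b):
    """Return W_x x + W_h h + b, with matrices stored as lists of rows."""
    return [
        sum(w * xi for w, xi in zip(row_x, x))
        + sum(w * hi for w, hi in zip(row_h, h))
        + bias
        for row_x, row_h, bias in zip(W_x, W_h, b)
    ]


def lstm_step(x, h_prev, s_prev, p):
    """One LSTM update: input node, three gates, cell state, hidden state, output."""
    g = [math.tanh(v) for v in affine(p["W_gx"], x, p["W_gh"], h_prev, p["b_g"])]
    i = [sigmoid(v) for v in affine(p["W_ix"], x, p["W_ih"], h_prev, p["b_i"])]
    f = [sigmoid(v) for v in affine(p["W_fx"], x, p["W_fh"], h_prev, p["b_f"])]
    o = [sigmoid(v) for v in affine(p["W_ox"], x, p["W_oh"], h_prev, p["b_o"])]
    s = [gk * ik + sk * fk for gk, ik, sk, fk in zip(g, i, s_prev, f)]
    h = [math.tanh(sk) * ok for sk, ok in zip(s, o)]
    y = sum(w * hk for w, hk in zip(p["W_hy"], h)) + p["b_y"]
    return h, s, y


def init_params(n_in, n_hid, rng):
    """Small random weights and zero biases (hypothetical initialization)."""
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    p = {"W_hy": [rng.uniform(-0.1, 0.1) for _ in range(n_hid)], "b_y": 0.0}
    for gate in "gifo":
        p["W_" + gate + "x"] = mat(n_hid, n_in)
        p["W_" + gate + "h"] = mat(n_hid, n_hid)
        p["b_" + gate] = [0.0] * n_hid
    return p


def masked_mse(y_pred, y_obs):
    """Squared error summed over observed days (None = no SMAP value),
    divided by the series length (assumed normalization)."""
    total = sum((yp - yo) ** 2 for yp, yo in zip(y_pred, y_obs) if yo is not None)
    return total / len(y_pred)


def run_series(inputs, p, n_hid):
    """Unroll the cell over a daily input series and collect the scalar outputs."""
    h, s = [0.0] * n_hid, [0.0] * n_hid
    outputs = []
    for x in inputs:
        h, s, y = lstm_step(x, h, s, p)
        outputs.append(y)
    return outputs


rng = random.Random(0)
params = init_params(n_in=3, n_hid=4, rng=rng)
forcing = [[rng.random() for _ in range(3)] for _ in range(6)]  # made-up inputs
smap = [0.21, None, None, 0.19, None, 0.20]  # made-up observations with gaps
loss = masked_mse(run_series(forcing, params, n_hid=4), smap)
```

The mask is what lets a daily model be trained against a product with a 2-3 day revisit time: the network still steps through every day, but only days with a retrieval contribute to the error.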
2.3 Tests, Conventional Algorithms, and Evaluation Metrics
In our temporal generalization test, the training set is SMAP data from April 2015 to March 2016. For computational efficiency, we picked 1 pixel from every 4 x 4 patch, resulting in a 1/16 coverage of CONUS. The test set is SMAP data for the same pixels, but for the period from April 2016 to March 2017. In the regular spatial generalization test, the training set is the same as described above, but the test set is the neighboring cells for the same period. In the regional holdout test, we trained models over 4 of the 18 2-digit Hydrologic Cataloging Units (HUC2s) and tested on the others. This test challenges the ability of different methods to generalize across characteristically different climates and physiographic conditions. There are a large number of such 4-HUC2 combinations. As an initial investigation, we chose 4 such combinations (C1-C4). Two combinations have a broad coverage of the range of Noah’s bias over CONUS, while the other two cover only part of that range. These tests inform us of the effect of biased training sets on generalization.
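The temporal and regular spatial splits described above can be sketched as follows. The grid size, the choice of the top-left cell of each patch, the calendar bounds of the two periods, and the treatment of every non-training cell as a spatial test cell are assumptions for illustration; all names are hypothetical.

```python
from datetime import date

# Assumed calendar bounds for the two one-year periods named in the text.
TRAIN_PERIOD = (date(2015, 4, 1), date(2016, 3, 31))
TEST_PERIOD = (date(2016, 4, 1), date(2017, 3, 31))


def subsample_grid(n_rows, n_cols, stride=4):
    """Split grid cells into one training cell per stride x stride patch and the rest."""
    train, held_out = [], []
    for r in range(n_rows):
        for c in range(n_cols):
            if r % stride == 0 and c % stride == 0:
                train.append((r, c))
            else:
                held_out.append((r, c))
    return train, held_out


def in_period(day, period):
    """True when a date falls inside an inclusive (start, end) period."""
    start, end = period
    return start <= day <= end


# Toy 8 x 12 grid: 6 training cells out of 96, i.e. 1/16 coverage.
train_cells, spatial_test_cells = subsample_grid(8, 12)
```

In this sketch the temporal test reuses `train_cells` with `TEST_PERIOD`, while the spatial test uses `spatial_test_cells` with `TRAIN_PERIOD`.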
LSTM predictions are compared to three conventional methods: the least absolute shrinkage and selection operator (lasso), auto-regressive model (AR), and a single-layer feedforward Neural Network (NN), given the same inputs. Lasso, abbreviated as LR here, is multiple linear regression with a regularization term that penalizes large regression coefficients (Tibshirani, 1994). NN can construct nonlinear transformations of inputs, but does not have memory, and therefore cannot retain time dependencies. LR and NN are operated in (1) the CONUS-scale mode, where a single model is trained for the entire training set; and (2) point-by-point mode, denoted with subscript “$p$”, where a separate model is trained for each pixel. AR models are trained only point-by-point (AR$_p$). More details are provided in SI Text S1. Three statistical metrics, bias (the time-averaged difference), root-mean-squared error (RMSE) and Pearson’s correlation ($R$), are calculated between predicted and SMAP-observed soil moisture on training and test sets separately. $R$ measures the agreement between simulated and observed climatology.
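The three metrics can be computed per pixel as in the sketch below. It assumes that bias is taken as predicted minus observed and that days without a SMAP value are skipped; the function name and the sample series are hypothetical.

```python
import math


def evaluate(pred, obs):
    """Return (bias, RMSE, Pearson correlation) over days with an observation.

    Bias is the time-averaged difference, here assumed to be predicted minus observed.
    Days where obs is None (no SMAP retrieval) are ignored.
    """
    pairs = [(p, o) for p, o in zip(pred, obs) if o is not None]
    n = len(pairs)
    mean_p = sum(p for p, _ in pairs) / n
    mean_o = sum(o for _, o in pairs) / n
    bias = mean_p - mean_o
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in pairs) / n)
    cov = sum((p - mean_p) * (o - mean_o) for p, o in pairs)
    var_p = sum((p - mean_p) ** 2 for p, _ in pairs)
    var_o = sum((o - mean_o) ** 2 for _, o in pairs)
    corr = cov / math.sqrt(var_p * var_o)
    return bias, rmse, corr


# Made-up series: the prediction is offset by -0.1 but tracks the observations exactly.
bias, rmse, corr = evaluate([0.1, 0.2, 0.3, 0.4], [0.2, 0.3, None, 0.5])
```

The toy series shows why the metrics are reported together: a constant offset produces a non-zero bias and RMSE while leaving the correlation at 1.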
While short-term forecasts employ observations to continuously update the solution, a long-term hindcast has no observations to use. As a “proof-of-concept” test of LSTM’s appropriateness for long-term hindcast, we trained LSTM and AR using 2 years of Noah-simulated soil moisture as the target, and hindcast 10 years back.
3 Results
3.1 Overall test performance
For the temporal generalization test, we note substantial improvement with respect to both bias and $R$ compared to Noah (Figure 1). We report results from directly using the 0-10 cm Noah layer, although other choices are similar, as will be discussed later. The Noah solutions have a significant, spatially-varying bias in many parts of CONUS, as shown in Figure S3a in SI, especially in the southeastern coastal plains (annotated in Figure S2 in SI). The LSTM correction reduces the bias by an order of magnitude, and mostly removes the spatial pattern of bias (Figure 1a). We note there is a CONUS-scale spatial trend of larger reduction of absolute bias in the Eastern CONUS, except the southeast coast (Figure 1b). The gradient appears to be related to the CONUS annual precipitation map, as LSTM corrects Noah’s tendency to over-estimate in the arid west and under-estimate in the humid east (Figure S3a, also noted by Xia et al. (2015)).
LSTM not only reduces bias but, more noticeably, improves the climatology according to $R$ (Figure 1c,d), which only concerns the comparison in temporal fluctuations. $R$(LSTM) is over 0.9 for 50% of pixels, significantly above Noah. The improvement in $R$ is evident over most of CONUS and is most pronounced in the eastern half of CONUS, especially the agricultural regions in the central lowland and on the Appalachian Plateau (Figure 1d). We note this map is no longer similar to the bias map of Noah, suggesting that the mechanisms that correct seasonality are different from those correcting bias. We hypothesize that LSTM significantly improves soil moisture dynamics in agricultural regions, e.g., irrigation, and the influence of shallow soils on the Appalachian highlands (Fang and Shen, 2017). On the other hand, over the majority of CONUS, the RMSE of LSTM is lower than 0.035 (Figure 1e). A continental-scale west-to-east increasing trend in RMSE(LSTM) is apparent. The higher errors in the East may result from higher annual precipitation, which results in (i) higher annual-mean soil moisture, and (ii) high VWC, which reduces SMAP data quality. However, the Southeast regions facing the Atlantic have a rather low RMSE(LSTM). Figure 1f suggests that the improvement of LSTM over the one-layer NN is obvious, especially in the central lowland region and coastal plains.
3.2 Comparison of generalization capability with other methods
In the temporal generalization test, time series prolonged by LSTM compare favorably against AR across a wide range of LSTM performance levels (Figure 2). From the 10-th to even the 75-th percentile pixel, LSTM is able to closely follow SMAP, except that peaks are under-estimated in the 50-th percentile pixel in 2016. The frequent rain events in April-May 2016 in Figure 2a and their recessions are well captured. For the 75-th percentile pixel, all peaks are captured, but we notice some over-estimation near the troughs in August 2016. In contrast, while AR also behaves reasonably, it often noticeably under-estimates the troughs, over-estimates the seasonal rising limbs and overshoots some peaks. In the 10-th percentile cell, AR performs poorly between October 2016 and early 2017.
Summarized over CONUS, LSTM shows the lowest test RMSE and bias, and the highest $R$ (Figure 3), followed by NN$_p$, LR$_p$, AR$_p$, LR, NN and lastly Noah. Neither the vertical interpolation procedure nor the choice of LSM (MOS or Noah) has much impact on LSTM’s prolongation performance (see Figures S4 and S5 in SI). The test RMSEs of LSTM are 0.022, 0.027, 0.036 and 0.057 at four successively higher percentile levels of the pixel distribution (Figure 3a). With lasso regularization, LR has similar training and test RMSEs, but its 25-th percentile test RMSE is comparable to a higher percentile of LSTM’s. Therefore, the more complex relationships permitted by LSTM are beneficial. LR$_p$ improves on LR as it specializes in each pixel, and the lasso regularization appears to have prevented overfitting, but its error is still larger than that of the CONUS-scale LSTM. NN$_p$ and AR$_p$ appear more overfitted than LR$_p$. LSTM’s test bias is only moderately smaller than that of NN$_p$, AR$_p$ and LR$_p$, but its $R$ is much higher. 75% and 50% of $R$(LSTM) are greater than 0.80 and 0.87, respectively.
Note that AR has sub-par performance in both training and test periods. The test RMSE box for AR is wider, suggesting its formulation works well for some pixels but not so well for others. Furthermore, the extended proof-of-concept long-term hindcast experiment shows a similar contrast. LSTM has robust prolongation performance at a 10-year hindcast scale while AR generates larger errors. Errors for both methods are independent of hindcast length, i.e., the 10-year-prior hindcast error is not much different from the 2-year hindcast error (Text S2 and Figure S6 in SI). Meanwhile, in the regular spatial generalization test, LSTM again exhibits the smallest RMSE and bias (Figure 3c-d). The contrast in bias is smaller than in the temporal test, but the comparisons are similar.
In 4-HUC2 combinations 1 and 2 (C1 and C2), the Noah bias covers a wide range from -0.25 to 0.15 $\mathrm{cm^3/cm^3}$, which appears to be the whole range of the Noah biases we see over CONUS. In both cases, LSTM is able to greatly reduce the bias and improve soil moisture climatology (much higher $R$) compared to both NN and Noah (Figure 4a-b). The boxes corresponding to LSTM bias are very narrow, and their centers are near 0. For the case C3, we note that it has few points with bias $< -0.2$, so for this HUC2 combination, the training set under-samples the Noah errors that lead to strong negative biases. As a result, LSTM’s bias is no longer near 0, although it is still much better than NN’s and Noah’s. On the other hand, for C4, the training set is strongly biased. It lacks any basin with a Noah bias of $< -0.1$. We note the narrow box corresponding to Noah’s bias in the training set for C4. Unsurprisingly, LSTM’s performance deteriorates: LSTM is no longer able to correct the bias, and its range of bias is large. NN, similarly, also fails to correct the bias. We obtained LSTM’s self-assessment of Noah bias by subtracting Noah’s solution from LSTM’s prediction. The self-assessed bias (Figure 4a) has a range which has little overlap with the Noah bias in the training set. This may be a signal we can utilize in the future to identify biased training samples and large predictive uncertainty.
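The overlap check suggested above could be prototyped along the following lines. The sketch follows the text literally in taking the self-assessed bias as LSTM’s prediction minus Noah’s solution, averaged in time for each pixel. Using the full [min, max] range as the "range of bias", the overlap measure itself, and all names and numbers are assumptions for illustration; this is not a diagnostic used in this work.

```python
def self_assessed_bias(lstm_series, noah_series):
    """Time-averaged (LSTM prediction minus Noah solution) for one pixel."""
    diffs = [l - n for l, n in zip(lstm_series, noah_series)]
    return sum(diffs) / len(diffs)


def overlap_fraction(values_a, values_b):
    """Fraction of the [min, max] range of values_a that lies inside the range of values_b."""
    lo_a, hi_a = min(values_a), max(values_a)
    lo_b, hi_b = min(values_b), max(values_b)
    overlap = max(0.0, min(hi_a, hi_b) - max(lo_a, lo_b))
    width = hi_a - lo_a
    if width > 0:
        return overlap / width
    # Degenerate case: values_a is a single point.
    return 1.0 if lo_b <= lo_a <= hi_b else 0.0


# Made-up per-pixel biases: the test region's self-assessed biases sit
# entirely outside the range seen in the training region.
training_noah_bias = [-0.10, -0.02, 0.05, 0.10]
test_self_assessed = [-0.25, -0.20, -0.15]
signal = overlap_fraction(test_self_assessed, training_noah_bias)  # 0.0 here
```

A value near 0 would flag a test region whose errors were not represented in training, which is the situation described for C4.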
4 Discussion
In many parts of CONUS, LSTM’s RMSE is smaller than SMAP’s design measurement accuracy. It appears that even 1 year of data over CONUS, when grouped together, has enough information to train an LSTM to hindcast SMAP data. A factor that contributes to such performance is the short memory length of soil moisture, which was found to range from 5 to 40 days (Orth and Seneviratne, 2012) in previous work. As a result, two years of data, when grouped together, contain many complete wet-dry cycles. The hindcast quality should improve as the SMAP record grows. On a side note, because true random noise cannot be predicted in the test set, it follows that SMAP L3’s RMSE could be below 0.027 in 50% of CONUS. Also, the official SMAP data quality flag labels the forested Southeast Coastal Plains and South Atlantic regions as “not recommended” quality (Figure S3). Our LSTM has an RMSE of 0.02-0.035 there, which suggests SMAP may be functional in these regions, but the impacts of the retrieval algorithm should be carefully examined.
It seems a non-recurrent NN can already remove a large part of the bias by capturing how environmental factors lead to certain types of biases. However, NN cannot maintain time dependencies, which may explain its performance difference from LSTM. Therefore, we argue that an advantage of LSTM originates from its recurrent nature. Meanwhile, alternative recurrent models, e.g., AR and moisture loss functions (Koster et al., 2017), are profoundly useful due to their interpretability, parsimony and great value in “nowcasting” or short-term forecasting (see Koster et al. (2017) for a solid application), but they require continuous updates by observations to avoid drift from true dynamics. At a one-year scale, injected data already have little effect on hindcast solutions. For longer-term hindcast, pattern-based methods like LSTM appear to be more suitable.
Previous soil moisture comparisons mainly focused on anomalies, but the prevalent bias in Noah’s surface moisture simulations can cause large errors for downstream users such as weather modeling (Massey et al., 2016). The continental-scale bias pattern suggests some systematic errors in Noah’s model structure or parameters. Some hypotheses include (i) Noah’s soil pedo-transfer functions are fundamentally inadequate in resolving regionally heterogeneous soil responses to rainfall, which could explain the need for calibration in most large-scale flood forecasting systems; or (ii) groundwater flow, which is important in thick-soiled, high-infiltration-capacity regions like the southeast (Fang et al., 2016), is not properly simulated in LSMs (Clark et al., 2015). However, LSTM appears to be able to integrate information from raw data and compensate for the inadequate representation uniformly over CONUS.
Conventional statistical wisdom suggests that simpler models are more robust and models with high degrees of freedom may be easily overfitted. However, the present work shows the CONUS-scale deep learning networks have smaller test errors than three alternative methods trained point-by-point. In fact, we hypothesize an important strength of LSTM originates from its flexibility to simultaneously learn from a large and heterogeneous collection of data and identify commonalities and differences. Its generalization capability stems from building internal models (in the attribute space) to capture biases and temporal fluctuations. For our regional holdout test, creating these internal models does not seem to require having all combinations of climates and physical attributes in the training set, as the HUC2s have distinct climates, topography, landcover and soils.
5 Conclusion
We have trained a CONUS-scale LSTM network to predict SMAP data. This network is capable of correcting spatially-heterogeneous model bias as well as climatological errors between Noah-simulated and SMAP-observed top-surface soil moisture, creating a CONUS-scale seamless moisture product that has high fidelity to SMAP. Despite having high degrees of freedom, when properly regularized, LSTM exhibits better generalization capability, both in space and time, than linear regression, auto-regressive models, and a one-layer neural network. Its test error approaches the instrument accuracy limit with SMAP. LSTM will be helpful in long-range soil moisture hindcasting or forecasting, weather modeling, and data assimilation. Its generalization capability arises from building internal models from physical attributes and synthesis of climate forcing. It does not necessarily require similar examples in the training set. Unless the training set is strongly biased, LSTM has a good chance of success.
6 Limitations and Future Work
As a first paper using LSTM in hydrology, this work is by no means a thorough investigation. Optimization is certainly possible. Our work does not address the question of the accuracy of SMAP data, which is addressed by other studies. The hindcast performance with respect to capturing soil moisture during drought should be further examined with in-situ data. We should further assess LSTM’s performance in comparison with regionally-trained simpler models. The implications of low LSTM RMSEs in forested regions warrant further investigation.
7 Acknowledgment
Data for SMAP can be downloaded from SMAP’s data repository. This work is supported by seed grants from Penn State College of Engineering and Institute for CyberScience. Shen is partially supported by the Office of Biological and Environmental Research of the US Department of Energy under contract DE-SC0010620. We thank NVIDIA Corporation for donating a Tesla K40 GPU for this research.
SMAP: Soil Moisture Active Passive
DL: Deep learning
LSTM: Long Short-Term Memory
LR: Linear Regression
ANN: Artificial Neural Network
References
- Batjes (1995) Batjes, N. H. (1995), A Homogenized Soil Data File for Global Environmental Research: A Subset of FAO, ISRIC, and NRCS profiles (Version 1.0), Tech. Rep. July, ISRIC.
- Clark et al. (2015) Clark, M. P., Y. Fan, D. M. Lawrence, J. C. Adam, D. Bolster, D. J. Gochis, R. P. Hooper, M. Kumar, L. R. Leung, D. S. Mackay, R. M. Maxwell, C. Shen, S. C. Swenson, and X. Zeng (2015), Improving the representation of hydrologic processes in Earth System Models, Water Resources Research, 51(8), 5929–5956, doi:10.1002/2015WR017096.
- Colliander et al. (2017) Colliander, A., T. Jackson, R. Bindlish, S. Chan, N. Das, S. Kim, M. Cosh, R. Dunbar, L. Dang, L. Pashaian, J. Asanuma, K. Aida, A. Berg, T. Rowlandson, D. Bosch, T. Caldwell, K. Caylor, D. Goodrich, H. al Jassar, E. Lopez-Baeza, J. Martínez-Fernández, A. González-Zamora, S. Livingston, H. McNairn, A. Pacheco, M. Moghaddam, C. Montzka, C. Notarnicola, G. Niedrist, T. Pellarin, J. Prueger, J. Pulliainen, K. Rautiainen, J. Ramos, M. Seyfried, P. Starks, Z. Su, Y. Zeng, R. van der Velde, M. Thibeault, W. Dorigo, M. Vreugdenhil, J. Walker, X. Wu, A. Monerris, P. O’Neill, D. Entekhabi, E. Njoku, and S. Yueh (2017), Validation of SMAP surface soil moisture products with core validation sites, Remote Sensing of Environment, 191, 215–231, doi:10.1016/j.rse.2017.01.021.
- Dirmeyer et al. (2016) Dirmeyer, P. A., J. Wu, H. E. Norton, W. A. Dorigo, S. M. Quiring, T. W. Ford, J. A. Santanello Jr., M. G. Bosilovich, M. B. Ek, R. D. Koster, G. Balsamo, and D. M. Lawrence (2016), Confronting Weather and Climate Models with Observational Data from Soil Moisture Networks over the United States, Journal of Hydrometeorology, 17(4), 1049–1067, doi:10.1175/JHM-D-15-0196.1.
- Ek et al. (2003) Ek, M. B., K. E. Mitchell, Y. Lin, E. Rogers, P. Grunmann, V. Koren, G. Gayno, and J. D. Tarpley (2003), Implementation of Noah land surface model advances in the National Centers for Environmental Prediction operational mesoscale Eta model, Journal of Geophysical Research: Atmospheres, 108(D22), doi:10.1029/2002JD003296.
- Entekhabi et al. (2010) Entekhabi, D., E. G. Njoku, P. E. O’Neill, K. H. Kellogg, W. T. Crow, W. N. Edelstein, J. K. Entin, S. D. Goodman, T. J. Jackson, J. Johnson, J. Kimball, J. R. Piepmeier, R. D. Koster, N. Martin, K. C. McDonald, M. Moghaddam, S. Moran, R. Reichle, J. C. Shi, M. W. Spencer, S. W. Thurman, L. Tsang, and J. Van Zyl (2010), The soil moisture active passive (SMAP) mission, Proceedings of the IEEE, 98(5), 704–716, doi:10.1109/JPROC.2010.2043918.
- Fang and Shen (2017) Fang, K., and C. Shen (2017), Full-flow-regime storage-streamflow correlation patterns provide insights into hydrologic functioning over the continental US, Water Resources Research, doi:10.1002/2016WR020283.
- Fang et al. (2016) Fang, K., C. Shen, J. B. Fisher, and J. Niu (2016), Improving Budyko curve-based estimates of long-term water partitioning using hydrologic signatures from GRACE, Water Resources Research, pp. 1–20, doi:10.1002/2014WR015716.
- Gal and Ghahramani (2015) Gal, Y., and Z. Ghahramani (2015), A Theoretically Grounded Application of Dropout in Recurrent Neural Networks, http://arxiv.org/abs/1512.05287.
- Greff et al. (2015) Greff, K., R. K. Srivastava, J. Koutník, B. R. Steunebrink, and J. Schmidhuber (2015), LSTM: A Search Space Odyssey, http://arxiv.org/abs/1503.04069.
- Hochreiter et al. (1997) Hochreiter, S., and J. Schmidhuber (1997), Long short-term memory, Neural Computation, 9(8), 1735–80, doi:10.1162/neco.1997.9.8.1735.
- Jackson et al. (2016) Jackson, T., P. O’Neill, E. Njoku, S. Chan, R. Bindlish, A. Colliander, F. Chen, M. Burgin, S. Dunbar, J. Piepmeier, M. Cosh, T. Caldwell, J. Walker, X. Wu, A. Berg, T. Rowlandson, A. Pacheco, H. McNairn, M. Thibeault, J. Martínez-Fernández, Á. González-Zamora, M. Seyfried, D. Bosch, P. Starks, D. Goodrich, J. Prueger, Z. Su, R. van der Velde, J. Asanuma, M. Palecki, E. Small, M. Zreda, J. Calvet, W. Crow, Y. Kerr, S. Yueh, and D. Entekhabi (2016), Soil Moisture Active Passive (SMAP) Project Calibration and Validation for the L2/3_SM_P Version 3 Data Products.
- Koster (2004) Koster, R. D. (2004), Regions of Strong Coupling Between Soil Moisture and Precipitation, Science, 305(5687), 1138–1140, doi:10.1126/science.1100217.
- Koster and Suarez (1994) Koster, R. D., and M. J. Suarez (1994), The components of a ‘SVAT’ scheme and their effects on a GCM’s hydrological cycle, Advances in Water Resources, 17(1-2), 61–78, doi:10.1016/0309-1708(94)90024-8.
- Koster et al. (2000) Koster, R. D., M. J. Suarez, A. Ducharne, M. Stieglitz, and P. Kumar (2000), A catchment-based approach to modeling land surface processes in a general circulation model 1. Model structure, Journal of Geophysical Research D: Atmospheres, 105(D20), 24,809–24,822.
- Koster et al. (2017) Koster, R. D., R. H. Reichle, and S. P. P. Mahanama (2017), A Data-Driven Approach for Daily Real-Time Estimates and Forecasts of Near-Surface Soil Moisture, Journal of Hydrometeorology, doi:10.1175/JHM-D-16-0285.1.
- Krizhevsky et al. (2012) Krizhevsky, A., I. Sutskever, and G. E. Hinton (2012), ImageNet Classification with Deep Convolutional Neural Networks, Advances In Neural Information Processing Systems, pp. 1–9, doi:10.1016/j.protcy.2014.09.007.
- LeCun et al. (2015) LeCun, Y., Y. Bengio, and G. Hinton (2015), Deep learning, Nature, 521(7553), 436–444, doi:10.1038/nature14539.
- Leeper et al. (2017) Leeper, R. D., J. E. Bell, C. Vines, and M. Palecki (2017), An Evaluation of the North American Regional Reanalysis Simulated Soil Moisture Conditions during the 2011–13 Drought Period, Journal of Hydrometeorology, 18(2), 515–527, doi:10.1175/JHM-D-16-0132.1.
- Léonard et al. (2015) Léonard, N., S. Waghmare, Y. Wang, and J.-H. Kim (2015), rnn: Recurrent Library for Torch, https://github.com/Element-Research/rnn.
- Lipton et al. (2015) Lipton, Z. C., J. Berkowitz, and C. Elkan (2015), A Critical Review of Recurrent Neural Networks for Sequence Learning, http://arxiv.org/abs/1506.00019v2, pp. 1–35, doi:10.1145/2647868.2654889.
- Massey et al. (2016) Massey, J. D., W. J. Steenburgh, J. C. Knievel, and W. Y. Y. Cheng (2016), Regional Soil Moisture Biases and Their Influence on WRF Model Temperature Forecasts over the Intermountain West, Weather and Forecasting, 31(1), 197–216, doi:10.1175/WAF-D-15-0073.1.
- Narasimhan and Srinivasan (2005) Narasimhan, B., and R. Srinivasan (2005), Development and evaluation of Soil Moisture Deficit Index (SMDI) and Evapotranspiration Deficit Index (ETDI) for agricultural drought monitoring, Agricultural and Forest Meteorology, 133(1-4), 69–88, doi:10.1016/j.agrformet.2005.07.012.
- Nijssen et al. (2001) Nijssen, B., R. Schnur, and D. P. Lettenmaier (2001), Global Retrospective Estimation of Soil Moisture Using the Variable Infiltration Capacity Land Surface Model, 1980–93, Journal of Climate, 14(8), 1790–1808, doi:10.1175/1520-0442(2001)014<1790:GREOSM>2.0.CO;2.
- Norbiato et al. (2008) Norbiato, D., M. Borga, S. Degli Esposti, E. Gaume, and S. Anquetin (2008), Flash flood warning based on rainfall thresholds and soil moisture conditions: An assessment for gauged and ungauged basins, Journal of Hydrology, pp. 274–290, doi:10.1016/j.jhydrol.2008.08.023.
- O’Neill et al. (2012) O’Neill, P., S. Chan, E. Njoku, T. Jackson, and R. Bindlish (2012), Algorithm Theoretical Basis Document (ATBD) SMAP Level 2 & 3 Soil Moisture (Passive) (L2_SM_P, L3_SM_P), https://nsidc.org/sites/nsidc.org/files/technical-references/L3_FT_P_ATBD_v7.pdf.
- Orth and Seneviratne (2012) Orth, R., and S. I. Seneviratne (2012), Analysis of soil moisture memory from observations in Europe, Journal of Geophysical Research, 117(D15), D15,115, doi:10.1029/2011JD017366.
- Semeniuta et al. (2016) Semeniuta, S., A. Severyn, and E. Barth (2016), Recurrent Dropout without Memory Loss, http://arxiv.org/abs/1603.05118.
- Srivastava et al. (2014) Srivastava, N., G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov (2014), Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Journal of Machine Learning Research, 15, 1929–1958.
- Tibshirani (1994) Tibshirani, R. (1994), Regression Shrinkage and Selection Via the Lasso, Journal of the Royal Statistical Society, Series B, 58, 267–288.
- USGS (2016) USGS (2016), Estimated Use of Water in the United States County-Level Data for 2010.
- Voosen (2017) Voosen, P. (2017), The AI detectives, Science, 357(6346).
- Xia et al. (2015) Xia, Y., M. B. Ek, Y. Wu, T. Ford, and S. M. Quiring (2015), Comparison of NLDAS-2 Simulated and NASMD Observed Daily Soil Moisture. Part I: Comparison and Analysis, Journal of Hydrometeorology, 16(5), 1962–1980, doi:10.1175/JHM-D-14-0096.1.
- Yuan and Quiring (2017) Yuan, S., and S. M. Quiring (2017), Evaluation of soil moisture in CMIP5 simulations over the contiguous United States using in situ and satellite observations, Hydrology and Earth System Sciences, 21(4), 2203–2218, doi:10.5194/hess-21-2203-2017.
- Zaremba et al. (2015) Zaremba, W., I. Sutskever, and O. Vinyals (2015), Recurrent Neural Network Regularization, https://arxiv.org/abs/1409.2329.
Contents
- Text S1. Technical details about conventional statistical methods.
- Figure S1. Comparison between the structures of Long Short-Term Memory (LSTM) and simple recurrent neural network (RNN).
- Figure S2. Map of SMAP’s data quality flag, with annotations for geographic regions on the continental US.
- Figure S3. Noah bias and RMSE evaluated against SMAP over CONUS.
- Figure S4. Performance of training using Noah soil moisture interpolated to 5 cm depth.
- Figure S5. Comparing LSTM models created with Noah or MOS models as inputs.
- Text S2. Proof-of-concept test for the potential of LSTM for long-term hindcast.
- Figure S6. Performance of LSTM and AR for the synthetic long-term hindcast experiment.
- Table S1. Predictors employed by LSTM, lasso-regularized linear regression and one-layer feedforward neural network.
Text S1. Technical Details about Conventional Methods
We compared the Long Short-Term Memory (LSTM) network to the least absolute shrinkage and selection operator (lasso), an auto-regressive model with exogenous inputs (AR), and a single-layer feedforward neural network (NN), given the same inputs. Because lasso is essentially a regularized linear regression, it is abbreviated as LR in our paper. The equation for estimating the parameters for LR is:
(1) $\hat{\beta} = \arg\min_{\beta} \left\{ \sum_{t} \left( \theta_t - \sum_{k} \beta_k x_{k,t} \right)^2 + \lambda \sum_{k} |\beta_k| \right\}$
where $\theta$ is the SMAP soil moisture product, $\beta$ are the coefficients of the LR model, $\lambda$ is a regularization parameter that determines how much penalty is applied to large coefficients, and $x$ contains the exogenous inputs, including temperature, precipitation, wind, downward shortwave and longwave radiation, specific humidity, and Noah-simulated potential evapotranspiration, evaporation, and runoff. In alternative models that we examined, we also tested removing the Noah outputs from this list. The regularization parameter ($\lambda$) is determined experimentally to minimize the test error, and a value of 0.002 is found to be appropriate for both LR and point-by-point LR (LR$_p$).
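As an illustration of how an L1-penalized regression of this kind can be estimated, the sketch below minimizes $\sum_t (\theta_t - \sum_k \beta_k x_{k,t})^2 + \lambda \sum_k |\beta_k|$ by cyclic coordinate descent with soft-thresholding. This is a minimal sketch, not the software used in the paper: the function names `soft_threshold` and `fit_lasso`, the absence of an intercept, and the unscaled form of the objective are assumptions (many packages divide the squared error by the sample size, which rescales $\lambda$).

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exact zeros give lasso its sparsity."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(x_rows, theta, lam, n_sweeps=200):
    """Minimize sum_t (theta_t - beta . x_t)^2 + lam * sum_k |beta_k|.

    x_rows[t] is the list of inputs at time step t and theta[t] is the target.
    Hypothetical helper: cyclic coordinate descent, fixed number of sweeps.
    """
    n_features = len(x_rows[0])
    beta = [0.0] * n_features
    residual = list(theta)  # theta - X beta, with beta initially zero
    for _ in range(n_sweeps):
        for j in range(n_features):
            col = [row[j] for row in x_rows]
            z = sum(v * v for v in col)
            if z == 0.0:
                continue  # an all-zero input cannot be given a coefficient
            # Correlation of input j with the residual that excludes beta_j.
            rho = sum(v * (r + beta[j] * v) for v, r in zip(col, residual))
            new_bj = soft_threshold(rho, lam / 2.0) / z
            delta = new_bj - beta[j]
            residual = [r - delta * v for r, v in zip(residual, col)]
            beta[j] = new_bj
    return beta


# Toy data (made up for illustration): two inputs, four time steps.
toy_x = [[1.0, 1.0], [1.0, -1.0], [1.0, 1.0], [1.0, -1.0]]
toy_theta = [0.31, 0.29, 0.31, 0.29]
print(fit_lasso(toy_x, toy_theta, lam=0.2))
```

With a sufficiently large penalty the coefficient of the weakly informative second input is set exactly to zero, which is the variable-selection behavior that distinguishes lasso from ridge-type penalties.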
We also included a point-by-point auto-regressive model with exogenous inputs (AR) in the comparisons, meaning that a separate model is trained for each SMAP pixel. We did not consider moving-average models because our focus is on the potential of the method for long-term forecasting, while moving-average models require observations to calculate residuals. The equation for the AR is: $\theta_t = c + \epsilon_t + \sum_{i=1}^{p} \alpha_i \theta_{t-i} + \sum_{k=1}^{r} \gamma_k x_{k,t}$, where $c$ is a constant, $t$ is the time step, $\epsilon_t$ is the error term, $\theta$’s are soil moisture observations, $p$ is the order of the auto-regression, $\alpha_i$ and $\gamma_k$ are coefficients that are estimated for each SMAP pixel, and $x_{k,t}$ are the $r$ forcing inputs indicated above. For our long-term hindcast test, we could include static attributes in this equation, but since they are static in time they would be absorbed by the constant $c$, and because we train point-by-point there is no reason to consider them. During the parameter estimation (training) stage, observations are used to update the past states ($\theta_{t-i}$). In the long-term hindcast (testing) stage, because there are no observations, $\theta_{t-i}$ are the AR-predicted values, so the model has to apply the forecast equation recursively to proceed in time. We varied $p$ from 0 to 5 and identified the value that gave the smallest testing error for each site.
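The recursive use of the forecast equation in the hindcast stage can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `ar_hindcast` and its argument layout are hypothetical, the coefficients are taken as already estimated, and the error term is set to its expected value of zero.

```python
def ar_hindcast(c, alpha, gamma, theta_history, forcing):
    """Apply theta_t = c + sum_i alpha_i * theta_{t-i} + sum_k gamma_k * x_{k,t}
    recursively, feeding predictions back in as the past states.

    c             : constant term
    alpha         : list of p auto-regressive coefficients (alpha[0] is lag 1)
    gamma         : list of r coefficients for the forcing inputs
    theta_history : at least p known values, oldest first, to start the recursion
    forcing       : list of forcing vectors x_t (each of length r), one per step
    """
    p = len(alpha)
    if len(theta_history) < p:
        raise ValueError("need at least p past values to start the recursion")
    states = list(theta_history)
    predictions = []
    for x_t in forcing:
        # states[-1] is lag 1, states[-2] is lag 2, and so on.
        ar_part = sum(alpha[i] * states[-1 - i] for i in range(p))
        exo_part = sum(g * x for g, x in zip(gamma, x_t))
        theta_t = c + ar_part + exo_part
        predictions.append(theta_t)
        states.append(theta_t)  # no observation available: reuse the prediction
    return predictions


# Made-up coefficients and forcing, for illustration only.
print(ar_hindcast(c=0.1, alpha=[0.5], gamma=[0.2],
                  theta_history=[0.2], forcing=[[1.0], [0.0]]))
```

Setting `alpha` to an empty list gives the $p = 0$ case, in which the prediction depends on the forcing alone.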
The one-layer feedforward neural network (NN) is simply a linear combination of inputs followed by a transformation: $\theta^{NN}(t) = f(W_{NN} x + b)$, where $W_{NN}$ is the weights of the neural network, $b$ is a constant coefficient, and $f$ is a nonlinear transformation, in this case the tan-sigmoid (hyperbolic tangent) function. We regularized NN using early stopping and L2-norm regularization. A regularization parameter of 0.002 was found to give the smallest test root-mean-squared error (RMSE). NN and its point-by-point version, NN$_p$, have a linear hidden layer of size 100 and 30, respectively, as a larger hidden size results in more over-fitting.
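For concreteness, the forward pass $f(Wx + b)$ and an L2-norm penalty on the weights can be sketched as below. This is only a sketch: the names `nn_forward` and `l2_penalty` are hypothetical, the tan-sigmoid is taken to be the hyperbolic tangent, and training, early stopping and the exact form of the penalty used in the paper are not reproduced.

```python
import math


def nn_forward(weights, bias, x):
    """Return f(W x + b) with f = tanh applied elementwise.

    weights : list of rows, one row of input weights per output unit
    bias    : list with one constant per output unit
    x       : list of input values
    """
    return [
        math.tanh(sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def l2_penalty(weights, lam):
    """L2-norm regularization term: lam times the sum of squared weights."""
    return lam * sum(w * w for row in weights for w in row)


# Made-up weights and inputs, for illustration only.
W = [[0.5, -0.25], [0.1, 0.3]]
b = [0.0, 0.1]
print(nn_forward(W, b, [1.0, 2.0]))
print(l2_penalty(W, lam=0.002))
```

During training, a penalty of this kind would be added to the data misfit so that large weights are discouraged.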






Text S2. Proof-of-concept test for the potential of LSTM for long-term hindcast
Since SMAP has a limited time span, we conducted a proof-of-concept experiment that examines the potential of LSTM for multi-year soil moisture hindcasting and compares it to point-by-point auto-regressive models (AR). These synthetic experiments are not thoroughly optimized for performance, as true hindcasting will involve auxiliary satellite-based observations and in-situ data. Both LSTM and AR are trained on 2015-2016 with Noah-simulated soil moisture as the target and climate forcing as the inputs. Based on the temporal generalization test described in the main text, we removed all Noah-simulated fields from the inputs and, since AR does not require any static attributes such as topography and soil texture, we also removed such attributes from LSTM’s inputs. We added two types of synthetic noise to the Noah solutions: a Gaussian white noise (with standard deviation = 0.04) and a relative error. Neither type of noise is auto-correlated. The formula for the relative error is:
$\theta_s = \theta_{Noah} (1 + \epsilon)$, where $\theta_s$ is the synthetic observation to be treated as the learning target, $\theta_{Noah}$ is the top 10 cm soil moisture simulated by Noah, and $\epsilon$ is a Gaussian relative error term.
The results show that with two years of training data, LSTM can learn the soil moisture dynamics of Noah well (Figure S6a-b). The median error for the white-noise case increases only slightly, from 0.04 in the training period (which is almost equal to the added noise) to 0.043 in 2005-2006. Importantly, the hindcast error does not increase as a function of hindcast length, i.e., distance from the first synthetic observation. The AR also works decently, with a median error around 0.049 in 2005-2006. Its error is also not influenced by hindcast length, perhaps because the soil moisture dynamics simulated by Noah have only a limited memory length. However, LSTM is still noticeably stronger, as the 85th percentile of LSTM’s error in 2005-2006 is less than the 25th percentile of AR’s error, and the LSTM boxes are much narrower than those of AR. Also, note that we only created one LSTM model for the continental U.S. (CONUS), whereas a separate AR model is trained for each pixel. In addition, LSTM can make use of static attributes to differentiate between locations with different soil textures and land covers, but AR cannot. Therefore, the performance of LSTM may further improve when these attributes are included. LSTM may compensate for the lack of attributes by summarizing information from climate forcings, as climate features co-vary with physical attributes. Figure S6c compares the hindcast time series at a pixel. We note that AR tends to over-predict major peaks but under-predict the rising limbs. LSTM captures the troughs well, whereas AR may over-predict them.
As Noah has simpler dynamics and fewer unknown variables than real systems, it is easier to learn, so it is not surprising that the errors are close to the added Gaussian noise. The larger error of AR during the training period suggests that its formulation is not flexible enough to completely reproduce the dynamics of Noah. The results shown here mainly illustrate that LSTM has great potential for long-term hindcasting. It appears from our results that, since soil moisture has short memory, hindcasting to one year is not very different from hindcasting to 10 years. However, the training data should adequately sample plausible soil moisture dynamics.
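The two kinds of synthetic noise used in this experiment, additive Gaussian white noise and the relative error $\theta_s = \theta_{Noah}(1 + \epsilon)$, can be generated as in the sketch below. The function names are hypothetical, and the standard deviation of the relative error (`sigma_rel`) is not given in the text, so it is left as a required argument rather than assigned a default.

```python
import random


def add_white_noise(theta_noah, sigma=0.04, rng=None):
    """Additive Gaussian white noise: theta_s = theta_Noah + e, e ~ N(0, sigma^2).

    Each draw is independent, so the noise is not auto-correlated.
    """
    rng = rng or random.Random()
    return [v + rng.gauss(0.0, sigma) for v in theta_noah]


def add_relative_error(theta_noah, sigma_rel, rng=None):
    """Relative error: theta_s = theta_Noah * (1 + e), e ~ N(0, sigma_rel^2).

    sigma_rel is an assumption of this sketch; its value is not stated in the text.
    """
    rng = rng or random.Random()
    return [v * (1.0 + rng.gauss(0.0, sigma_rel)) for v in theta_noah]


# Made-up soil moisture series, for illustration only.
rng = random.Random(42)  # fixed seed so the example is repeatable
theta_noah = [0.20, 0.22, 0.30, 0.27, 0.24]
print(add_white_noise(theta_noah, sigma=0.04, rng=rng))
print(add_relative_error(theta_noah, sigma_rel=0.1, rng=rng))
```

The additive noise has the same magnitude everywhere, whereas the relative error scales with the simulated moisture, so wet conditions receive larger perturbations than dry ones.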

Table S1. Predictors employed by LSTM, lasso-regularized linear regression and one-layer feedforward neural network
Variable | Description | Variable | Description |
NLDAS Model Outputs | | | |
ALBDO | Albedo | RCSOL | Soil moisture parameter in canopy conductance |
ARAIN | Liquid precipitation | RCT | Temperature parameter in canopy conductance |
ASNOW | Frozen precipitation | RSMACR | Relative soil moisture availability control factor |
AVSFT | Average surface skin temperature | RSMIN | Minimal stomatal resistance |
BGRUN | Subsurface runoff (baseflow) | SBSNO | Sublimation (evaporation from snow) |
CCOND | Canopy conductance | SHTFL | Sensible heat flux |
CNWAT | Plant canopy surface water | SNOD | Snow depth |
GFLUX | Ground heat flux | SNOHF | Snow phase-change heat flux |
LAI | Leaf area index | SNOM | Snow melt |
LHTFL | Latent heat flux | SNOWC | Snow cover |
LSOIL | Liquid soil moisture content (non-frozen) | SOILM | Soil moisture content |
MSTAV | Moisture availability | SSRUN | Surface runoff (non-infiltrating) |
NLWRS | Net longwave radiation flux (surface) | TRANS | Transpiration |
NSWRS | Net shortwave radiation flux (surface) | TSOIL | Soil temperature |
PEVPR | Potential latent heat flux | VEG | Vegetation |
RCQ | Humidity parameter in canopy conductance | WEASD | Water equivalent of accumulated snow depth |
RCS | Solar parameter in canopy conductance | | |
NLDAS Forcing | | | |
ACOND | Aerodynamic conductance | EVP | Evaporation |
ACPCP | Convective precipitation hourly total | HGT | Geopotential height |
APCP | Precipitation hourly total | PEVAP | Potential evaporation hourly total |
CAPE | Above-ground convective available potential energy | PRES | Surface pressure |
CONVfrac | Fraction of total precipitation that is convective | SPFH | Specific humidity |
DLWRF | Longwave radiation flux downwards | TMP | Temperature |
DSWRF | Shortwave radiation flux downwards | UGRD | Zonal wind speed |
SMAP Flags | | | |
albedo | Albedo | vegewater | Vegetation water content |
coast | Coastal proximity | roughness | Roughness |
waterbody | Radar water body fraction | staWater | Static water |
landcover | Landcover classes | urban | Urban area |
mount | Mountainous terrain | vegetation | Dense vegetation |
Geographic attributes | | | |
Bulk | Bulk density | Irri | Irrigation |
Capa | Soil capacity | Sand | Sand fraction |
Clay | Clay fraction | Silt | Silt fraction |
LULC | NLCD 2001 land cover and use type | | |