A knowledge based spatial model for utilizing point and nested areal observations: A case study of annual runoff predictions in the Voss area

04/04/2019 ∙ by Thea Roksvåg, et al. ∙ NTNU 0

In this study, annual runoff is estimated by using a Bayesian geostatistical model for interpolating hydrological data of different spatial support. That is, streamflow observations from catchments (areal data), and precipitation and evaporation data (point data). The model contains one climatic spatial effect that is common for all years under study, and one year specific spatial effect. The climatic effect provides a framework for exploiting sparse datasets that include short records of runoff, and we obtain a model that can be used for spatial interpolation and for gaining knowledge about future annual runoff. The model's ability to predict annual runoff is investigated using 10 years of data from the Voss area in Western Norway and through a simulation study. On average we benefit from combining point and areal data compared to using only one of the data types, and the interaction between nested areal data and point data gives a geostatistical model that takes us beyond smoothing. Another finding is that climatic effects dominates over annual effects in Voss. This implies that systematic under- and overestimation of runoff often occur, but also that short-records of runoff from an otherwise ungauged catchment can lead to large improvements in the predictability of runoff.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 15

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Data related to meteorology, geology and hydrology are often connected to geographical locations in space. The data are typically connected to point locations, but there are also data observed over an areal unit, e.g to a crop field, a forest, a grid from a satellite observation or an administrative unit like a country. While point referenced data give information about the process of interest at one specific location, the areal referenced data impose a constraint on the process and/or contain information about aggregated or mean values in a larger area.

For some processes, there exist both point data and areal data that give information about the same underlying process, and studies show that both observation types should be taken into account when making statistical inference and predictions (Moraga et al., 2017; Wang et al., 2018). There are several challenges connected to simultaneously utilize data of different spatial support: It requires expert knowledge about how the different data types are connected to the process of interest and about the measurement uncertainties involved. In addition, information about how the point and areal data are related to each other is important, such that the observation types can be combined in a mathematically consistent way that preserves basic physical laws (i.e. the conservation of mass and energy).

In this article we consider runoff, which is an example of a process that can be observed through both point and areal data. Runoff is defined as the part of the precipitation which flows towards a river on the ground surface (surface runoff) or within the soil (subsurface runoff or interflow) (WMO, 1992). Every point in the landscape contributes to runoff generation, and on an annual scale runoff can be approximated by the estimated point precipitation minus the actual point evaporation at a location of interest (Sauquet et al., 2000). With this interpretation, runoff is a continuous point referenced process in space. However, runoff accumulated over an area is typically observed by measuring the amount of water that flows through the outlet of a stream. The observed value doesn’t primarily include information about the runoff generating process at the location of the stream gauge, but about the runoff generating process in the whole drainage area which is called a catchment. Such observations of runoff are areal referenced.

Since most catchments in the world are ungauged (i.e. without runoff observations), a common task for hydrologists is to predict runoff in these catchments. In this article we consider predictions of runoff which is a key hydrological signature. The annual runoff, both long term average and the year-to-year variation, provides information about the total amount of water available in an area of interest and is fundamental for water resources management, i.e. planning of domestic, agricultural, and industrial water supply, and allocation of water between stakeholders. The variability in annual runoff is also a key quantity for understanding its sensitivity to driving climatic factors in today’s climate and provides robust assessments of runoff variability in a future climate. The annual runoff is commonly used as a key variable when predicting other runoff properties in ungauged catchments, i.e. low flows and floods (McMahon et al., ).

There are several approaches to predict runoff in ungauged catchments, e.g process-based methods (Beldring et al., 2003) and geostatistical methods (Gottschalk, 1993; Sauquet et al., 2000; Skøien et al., 2006). In this article, we choose a geostatistical approach. Within the geostatistical framework, runoff predictions in ungauged catchments has typically been done by interpolation of areal referenced runoff data from nearby catchment by using Kriging methods (see e.g Skøien et al. (2006) or Sauquet et al. (2000)). This has shown promising results. In these methods, precipitation data have often been avoided as an information source because these data are known to be uncertain and/or biased (see e.g Neff (1977), Groisman and Legates (1994) or Wolff et al. (2015)). Evaporation data are even more uncertain: It is seldom observed directly, but are derived from meteorological observations and process-based models like in e.g Mu et al. or Zhang et al. (2009). In spite of large uncertainties linked to precipitation and evaporation measurements, precipitation and evaporation are the main drivers behind runoff and it is reasonable to believe that these data sources can contribute to an increased understanding of the runoff generating process if used cleverly. Particularly in areas with few streamflow observations.

Motivated by this, we present a Bayesian geostatistical model for annual runoff where we in addition to runoff data, utilize precipitation and evaporation data for performing spatial interpolation. The suggested model is a Bayesian hierarchical model where the observation likelihood consists of areal referenced runoff observations from catchments and/or point observations of runoff, where the point observations are annual evaporation subtracted from annual precipitation. Informative priors based on expert knowledge are used on the measurement uncertainties to express our doubt on the precipitation and evaporation data, and to put more weight on the runoff observations that are considered more reliable.

The catchments we consider in this article are located around Voss in Western Norway. Voss is a mountainous area, and the areas west for Voss are among the wettest in Europe with annual precipitation around 3 m/year. This makes Voss flood exposed, and accurate runoff models are of high importance. Voss is also a challenging area when it comes to runoff estimation because the spatial variability in the area is large and the density of stream gauges is low. However, there are several precipitation gauges in the area that can be exploited to increase the hydrological understanding. This makes the Voss area a good candidate for performing spatial interpolation of runoff by also including precipitation and evaporation data.

The large annual precipitation in Western Norway is mainly caused by the orographic enhancement of frontal precipitation formed around extratropical cyclones. The orographic enhancement is explained by the steep mountains that create a topographic barrier for the western wind belt, which transports moist air across the North Atlantic (Stohl et al., 2008). The topography and the elevation differences results in prominent patterns in precipitation and runoff.

Motivated by the strong orographic effect, we include a spatial effect in the model that is constant in the time period in which we have runoff observations. This represents the runoff generation caused by the climate in the study area. Furthermore, it is reasonable to assume that not all of the spatial variability can be explained by the climate, and we include an additional spatial effect to describe the annual discrepancy from the climate.

The climatic part of the model is important because it enables us to gain knowledge about which part of the spatial pattern that can be explained by the climate in the study area. Consequently, it also provides valuable information about the mean annual runoff that can be expected in the future with corresponding uncertainty estimates. The climatic component also provides a framework for exploiting short records of data by learning how the short records vary relatively to longer data series from nearby catchments. This is valuable because sparse datasets are common in hydrology, and there are several studies on how short records of runoff can be utilized to estimate different hydrological signatures (Fiering, 1963; Laaha and Blöschl, 2005).

In geostatistical interpolation methods like Kriging, distances between observations and locations of interest are utilized to make inference about the covariance structure of the process under study (Cressie, 1993). In hydrological applications it is common to measure the distance between two catchments by the Euclidean distance between the centroids of the involved catchments or the stream gauges (see e.g Merz and Blöschl (2005) or Skøien et al. (2003a)). However, this approach might lead to a violation of basic conservation laws: A significant property of catchments is that they are organized into subcatchments, and for annual runoff the water-balance must be conserved for all subcatchments: The total amount of annual runoff in a subcatchment cannot be larger than the total annual runoff in the main catchment. Thus, the measure of distance between catchments should depend on whether the involved catchments overlap or not. In the Top-Kriging approach developed by Skøien et al. (2006) the latter is taken into account by weighting information from a subcatchment more than information from a nearby non-overlapping catchment. The Top-Kriging approach outperformed other methods in predicting several hydrological signatures in Austria (Viglione et al., 2013).

Our approach to interpolation of annual runoff is similar to the Top-Kriging approach by that the nested structure of the catchments is taken into account. Furthermore, the water-balance is preserved for any point in the landscape by defining annual runoff in a catchment as an integral of the point runoff over a catchment area, and we get a model for which areal observations from nested catchments and point observations can be consistently combined.

Making inference and predictions with geostatistical models often lead to computational challenges as the computational speed scales cubically with the number of observations. Our solution to the computation challenges is to utilize the SPDE-approach to spatial modeling from Lindgren et al. (2011). Lindgren et al. (2011)

utilizes that a Gaussian random field (GRF) with Matérn covariance function can be expressed as the solution of a Stochastic partial differential equation (SPDE). By approximating the solution of the SPDE by using finite elements, a GRF can be expressed as a Gaussian Markov Random Field (GMRF). The GMRF approximation enables fast simulation and inference

(Rue and Held, 2005), and Integrated nested Laplace approximations (INLA) can be applied (Rue et al., 2009).

The main contribution of this paper is to present a computational feasible geostatistical model for annual runoff that consistently combines data of different spatial support. The suggested model deviates from traditional models in hydrology by utilizing precipitation and evaporation data for runoff interpolation. Furthermore, it is a flexible model that can be applied for both spatial interpolation and to gain knowledge about the mean annual runoff we can expect in the future with uncertainty estimates. It also contains a framework for exploiting short records of data.

In the section that follows we present of the study area and available data. Next, we introduce the theoretical background needed to develop the suggested runoff model that is presented in Section 4. In Section 5 the suggested model is fitted to the Voss data. Based on some observation schemes described in Section 5.1, the predictability of annual runoff in Voss is evaluated and discussed. To further explore the properties of the suggested model, a simulation study is carried out. This is presented in Section 6, before our key findings are discussed in Section 7.

2 Study area and data

(a) The study area is located around Voss in Western Norway.
(b) Mean annual observations.
Figure 1: Mean annual runoff from 5 catchments and mean annual precipitation minus evaporation (m/year) at 15 precipitation gauges for 1988-1997. Catchment 3 is a subcatchment of Catchment 4 and 5, while Catchment 4 is a subcatchment of Catchment 5. Catchment 1 and Catchment 2 don’t overlap with any of the other catchments.

When modeling hydrological processes on an annual scale, it is common to use the hydrological definition of a year. The basic water balance equation is given as , where is precipitation, is runoff, is evapotranspiration and S is change in stored water (i.e. snow, or groundwater). A hydrological year is defined such that the storage component in the water balance equation can be neglected, i.e. is much smaller than and . In Norway a hydrological year starts September 1st and ends August 31st, e.g. 1988 begins September 1st 1987 and ends August 31st 1988.

In this analysis, we have runoff data from the hydrological years 1988-2014. The dataset was provided by the Norwegian Water Resources and Energy directorate (NVE) and consists of annual runoff observations from five catchments where three of them are nested (see Figure 1). The unit of the data is m/year and and gives the spatial average of the runoff within a catchment. The observations from 1988-1997 are used to make statistical inference, while the observations from 1998-2014 are used as a test set for assessing the model’s ability to predict runoff for future years.

The annual runoff is calculated from daily runoff measurements. The stream gauges located in the catchments don’t measure runoff directly, but the river’s stage. The runoff observations are obtained by using a rating curve which gives the relationship between the stage of the water and the discharge or runoff at a specific point in the stream for a specific river. The stage-discharge relationship is developed empirically by measuring the discharge across a cross-section of the river for a range of stream stages.

Errors in observed runoff is composed of errors in river stage measurement and the rating curve model. On an annual time scale, the measurement errors tends to average out, and the main contribution to originates from uncertainties in the rating curve model. The dataset provided by NVE includes an estimate of the standard deviation of the observation uncertainty for each (annual) runoff observation, and the standard deviations are relatively small ranging from 0.65 % to 3.2 % of the corresponding observed value. This information is used to make informative priors for the measurement uncertainty in Section

4.3. We refer to Reitan and Petersen-Øverleir (2009) for details on how the observation (rating curve model) uncertainty is obtained Reitan and Petersen-Øverleir (2009).

In addition to runoff data, we have precipitation data from 15 precipitation gauges. Daily precipitation data were downloaded from www.eKlima.no which is a web portal maintained by the Norwegian Meteorological Institute. The observations were aggregated to annual values for the hydrological years 1988-1997. The observed precipitation ranges from 0.55 m/year to 4.6 m/year.

The evaporation data used, originate from the satellite remote sensing-based evapotranspiration algorithm presented in Zhang et al. (2010). The dataset we use is an spatially aggregated version of the original Zhang et al. (2010) dataset and consists of global monthly land surface evapotranspiration with spatial resolution of 1 degree (longitude, latitude). Evaporation data for the locations of the precipitation gauges around Voss were extracted, and the monthly values were aggregated to hydrological years (1988-1997). As the spatial resolution of the dataset is 1 degree and the study area is rather small, the observed annual evaporation within a specific year is the same for almost all of the precipitation gauges. The observed evaporation ranges from 0.23-0.32 m/year with mean 0.25 m/year and standard deviation 0.02 m/year. This means that the approximately of the annual precipitation evaporates, which is a relatively small amount on world basis. The observations of evaporation must be considered as approximative estimates of the actual evaporation in the area of interest, and there are large uncertainties connected to these values.

Figure 1 shows the 5 catchments in which we have measurements of runoff and the locations of the 15 precipitation gauges. Mean annual values for areal referenced runoff and point referenced runoff (precipitation-evaporation) for 1988-1997 are included. The spatial pattern caused by the orographic precipitation is visible as high values of annual runoff in the western part of the study area and low values in the eastern part. This spatial pattern is prominent for all years where we have data, and indicates that the climate in the area is dominating over annual effects.

3 Background

We propose a Latent Gaussian model (LGM) for annual runoff that is computational feasible due to a stochastic partial differential equation (SPDE) formulation of Gaussian random fields (GRFs). In this section we give a brief introduction of these concepts and other relevant background theory and notation for developing and evaluating the model for annual runoff that is presented in Section 4.

3.1 Latent Gaussian Models

In this article we suggest a Latent Gaussian model (LGM) for combining point and areal observations of annual runoff. An LGM can be represented in a hierarchical structure consisting of three levels (see e.g Gelman et al. (2004)). The first level is the observation likelihood, in this case consisting of two data types and . The data are observed with conditional independent likelihood given two linear predictors and , and some parameters (,

) which we refer to as hyperparameters. The two linear predictors depend on the same set of latent variables

, but connects the data to the latent field differently, through different projection matrices, e.g and . Here, and are a matrices that links elements in the latent field to the observations, and and denote row number and of the two matrices. The second level of the LGM is formed by the prior of the latent field which is on the form i.e it is Gaussian conditioned on some hyperparameters . The third level is given by which is the prior distribution of the hyperparameters .

3.2 Gaussian random fields

We utilize Gaussian random fields (GRFs) to model the spatial variability of annual runoff. A continuous field defined on a spatial domain is a GRF if for any collection of locations

the vector

follows a multivariate normal distribution

(Cressie, 1993), i.e . The covariance matrix defines the dependency structures in the spatial domain, and is typically constructed from a covariance function

. Furthermore, the dependency structure for a spatial process is often characterized by two parameters: The marginal variance

and the range . The marginal variance provides information about the spatial variability of the process of interest, while the range provides information about how the correlation between the process at two locations decays with distance. The range is defined as the distance for which the correlation between two locations in space has dropped to almost 0. If the range and marginal variance are constant over the spatial domain, we have a stationary GRF.

One popular choice of covariance function is the Matérn covariance function which is given by

(1)

where is the Euclidean distance between two locations , is the modified Bessel function of the second kind and order , and is the marginal variance (Guttorp and Gneiting, 2006). The parameter is the scale parameter, and it can be shown empirically that the spatial range can be expressed as , where is defined as the distance where the correlation between two locations has dropped to 0.1 (Lindgren et al., 2011). Using a Matérn GRF is convenient because it enables us to apply the SPDE approach to spatial modeling which is outlined in the next subsection.

3.3 The SPDE approach to spatial modeling

Making statistical inference and predictions on models including GRFs involve matrix operations on the covariance matrix . This can lead to computational challenges if the covariance matrix is dense. A solution to this problem is to utilize that the exact solution of the SPDE

(2)

is a Gaussian random field with Matérn covariance function. Here,

is spatial Gaussian white noise,

is the Laplacian, is a smoothness parameter, is the scale parameter in (1), is the dimension of the spatial domain and is a parameter controlling the variance. The parameters of the Matérn covariance function in (1) is linked to the SPDE through

where we will use that and set , such that is fixed to .

The above result, developed by Whittle (1954, 1963), is used by Lindgren et al. (2011) to show that a GRF can be approximated by a Gaussian Markov random field (GMRF). A GMRF is simply a multivariate Gaussian vector which is parametrized by the precision matrix , that is the inverse of the covariance matrix. The term GMRF is most used for Gaussian processes with sparse precision matrices, i.e matrices that contain many zero elements. The zero elements corresponds to Markov properties, in this case conditional independence between locations in the spatial domain. It is convenient to work with GMRFs due to computationally efficient algorithms for sparse matrix operations (Rue and Held, 2005).

The GMRF approximation described in Lindgren et al. (2011) is enabled by solving the SPDE in (2) by the finite element method (FEM). This consists in expressing the solution of the differential equation through a finite basis function representation

(3)

where is a set of weights and is a set of deterministic basis functions (see e.g Brenner and Scott (2008)). In this case, the

’s are Gaussian distributed with zero mean and precision matrix

. The basis functions are chosen such that the representation in (3) approximates the distribution of the solution to the SPDE, and such that the basis functions provides a sparse precision matrix . To achieve this, the spatial domain is discretized into non-intersecting triangles with nodes. Next, a set of basis functions is constructed on the triangulation; They are given local support and are piecewise linear in each triangle, i.e is 1 at node and 0 at all other nodes. The precision matrix of the weights can then be expressed as

(4)

when . Here, is a diagonal matrix with element () given by and is a sparse positive semi-definite matrix with element () given by . The resulting precision matrix is sparse with conditional independence structure, such that the weights now form a GMRF with computational benefits.

PC-priors

As we use a Bayesian approach, the hyperparameters from Section 3.1 must be given prior distributions. For the majority of the hyperparameters we use penalised complexity (PC) priors. PC-priors are proper prior distributions developed by Simpson et al. (2017). The main idea behind PC-priors is to penalise the increased complexity induced by deviating from a simple base model. One of the goals is to avoid overfitting.

The PC-prior for the precision of a Gaussian effect has density

(5)

where is a parameter that determines the penalty of deviating from the base model. The parameter

can be specified through a quantile

and probability

by , where , and . Here, is the standard deviation of the Gaussian distribution.

As the range and the marginal variance are easier to interpret than the SPDE parameters and , we parametrize the precision matrix in (4) as in our analysis, and assign prior distributions to and instead of and . For and we use the prior suggested in Fuglstad et al. (2015). This is a joint prior for the spatial range and the marginal variance constructed from PC-priors. The joint prior can be specified through

where , , and are quantiles and probabilities determined by the user.

Evaluating the predictive performance

To evaluate the predictive performance of the suggested runoff model, we use two criteria: The first criterion is the root mean squared error (RMSE). The RMSE measures the difference between a point prediction and the value actually observed by

where is the total number of pairs of predictions and observations. We use the posterior mean as point prediction. The second criterion is the continuous ranked probability score (CRPS). The CRPS is defined as

where is the predictive cumulative distribution and is the actual observation (Gneiting and Raftery, 2007)

. The CRPS takes the whole posterior predictive distribution into account, not only the posterior mean, and is penalized if the actual observation falls outside the posterior predictive distribution. Both RMSE and CRPS are negatively orientated, and a smaller value indicates a better prediction.

4 Statistical Model for Annual Runoff

In this section we present a LGM for annual runoff which is suitable for combining observations of different spatial support. Inference is computational feasible due to the SPDE approach to spatial modeling.

4.1 Spatial model for runoff

Let the spatial process denote the runoff generating process at point location in the spatial domain in year . The true runoff generation at point location is modeled as

(6)

Here, the parameter is an intercept common for all years , while is a spatial effect common for all years. These two model components represent the runoff generation caused by the climate in the study area. Further, we include a year specific intercept and a year specific spatial effect to model the runoff generation due to annual discrepancy from the climate. Both spatial effects are modeled as GRFs with zero mean and Matérn covariance functions given the model parameters; with range parameter and marginal variance , and with range and marginal variance . The spatial fields , j,…,r, are assumed to be independent realizations, or replicates of the underlying GRF. The same applies for the year specific intercepts which are assumed to be independent and identically distributed as given the parameter with being independent realizations of this Gaussian distribution.

The true mean runoff generated inside a catchment in year can be expressed as

(7)

where is the area of catchment . By interpreting catchment runoff as an integral of point referenced runoff , we obtain a mathematically consistent model where the water-balance is conserved for any point in the landscape.

4.2 Observation model

Annual precipitation and evaporation are observed at locations for . The observed annual runoff generation at point location , year , is modeled as the difference between the observed annual precipitation and annual evaporation ,

(8)

where is the true annual point runoff from Equation (6). The error terms are independent and identically distributed as and independent of the other model components. The measurement uncertainties for precipitation and evaporation are assumed to increase with the magnitude of the observed value, and we want to include this assumption in the model. This is done by scaling the precision parameter of the error terms with a fixed factor , which is further described in Section 4.3.

Runoff at catchment level is observed through streamflow data from catchments denoted . We use the following model for the annual runoff observed in catchment in year

(9)

where is the true annual areal runoff from (7). The measurement errors are independent and identically distributed as and independent of the other model components. As for the point referenced observations, the precision parameter of the error terms is scaled with a fixed factor which is further described in the next subsection.

4.3 Prior distributions

The variance of the measurement error of a point referenced observation from precipitation gauge , year , is given by where is a hyperparameter and is a deterministic value that scales the variance based on expert knowledge from NVE about the measurement errors of precipitation and evaporation. The scales allow each observation to have its own prior uncertainty.

The precipitation data are obtained by observing the amount of water or snow that falls into a bucket, and in particular for windy snow events the buckets can fail to catch a large proportion of the actual precipitation (Neff, 1977; Groisman and Legates, 1994; Wolff et al., 2015). Based on this, the standard deviation of the observation uncertainty for precipitation is assumed to be of the observed value . The evaporation data are obtained from satellite observations and process-models, and are more uncertain than the precipitation data. We assume that the standard deviation for evaporation is of the observed value . The prior knowledge about the point data is used to specify the scale for a point observation at location and year as follows

Here, the covariance between the observed precipitation and evaporation is estimated by

where is the Pearson correlation between all available observations of precipitation and evaporation at precipitation gauge . Further, we assign the PC-prior from Equation (5) to the precision with and . With this prior for the precision

, the prior 95 % credible interval for the standard deviation

of the measurement error for point runoff becomes approximately (0.002-30)% of the corresponding observed value . This interval corresponds well to what we know about the uncertainty of precipitation and evaporation observations.

The same approach is used to assign a prior to the variance of the measurement error for the areal referenced observations . The precision is given a PC-prior with and , while the scale for catchment , year is given by

(10)

For the streamflow data, information about the variance of the observations is directly available through the dataset provided by NVE. These data are inserted into (10). With the suggested prior, the prior 95 % credible interval for the standard deviation of an areal observation, is approximately (0.002,4.0) of the corresponding observed value . This is an informative prior that just covers the range of values suggested by NVE. A low prior standard deviation with narrow prior confidence bands contributes to put more weight on the runoff observations than to the point observations. There are only 5 areal observations available for each year in the dataset, but 15 point observations, and the aim is to avoid that the more unreliable point data dominate over the areal data.

For the spatial ranges and the marginal variances of the spatial fields and , the joint PC-prior from Fuglstad et al. (2015) is used. The PC-priors for , , and are specified through the following probabilities and quantiles:

The percentages and quantiles are chosen based on knowledge about the spatial variability in the area. The study area is approximately 80 km 80 km, and it is reasonable to assume that there is a significant correlation between two locations that are less than 10 km apart. The spatial variability in the study area is large, and we can observe runoff values from 0.8 m/year to 3.2 m/year within the same year. However, it is reasonable to assume that the marginal standard deviation of the runoff generating process does not exceed 2 m/year. The parameters of the climatic GRF and the annual GRF are given the same prior as it is difficult to identify if the spatial variability mainly comes from climatic processes or from annual variations. We also want the data to decide which of the two effects that dominates in the study area.

As described in Section 4.1, the year specific intercept is for all years . Its precision is given the PC-prior from (5) with and . This is a weakly informative wide prior with a prior interval (0.002,40.5) for the standard deviation of . Finally, the climatic intercept is given a normal prior, . This gives a prior confidence interval of (1.0,3.0) m/year for which covers all reasonable mean values of annual runoff around Voss.

4.4 Inference

In order to make the model computational feasible, some simplifications of the suggested model are necessary. In Section 4.1 the annual runoff for a catchment was modeled as the integral of point referenced runoff over the catchment area. In practice, the integral in Equation (7) is calculated by a finite sum over a discretization of the target catchment. In particular, let denote the discretization of catchment . The total annual runoff in catchment in year is approximated as

(11)

where is the total number of grid nodes in and is the point runoff at grid node . It is important that a subcatchment shares grid-nodes with the main catchment in order to preserve the water-balance. The discretization used in this analysis has 1 km spacing and is shown in Figure 1(a).

The model suggested for annual runoff, is a latent Gaussian model with the structure described in Section 3.1

. Modeling annual runoff as a LGM is convenient because it allows us to use Integrated Nested Laplace approximations (INLA) to make inference and predictions. INLA can be used for making Bayesian inference on LGMs and is a fast and approximative alternative to MCMC algorithms. The approach is based on approximating the marginal distributions by using Laplace or other analytic approximations, and on numerical integration schemes. The main computational tool is the sparse matrix calculations described in

Rue and Held (2005), such that in order to work fast, the latent field of the LGM should be a GMRF with a sparse precision matrix. In our case, sparsity is obtained by utilizing the SPDE approach from Section 3.3 to express the GRFs and through the finite basis-function representation in Equation (3) on a triangulation of the spatial domain. The triangulation used is shown in Figure 1(b).

The R-package r-inla was utilized to make inference and predictions on the suggested model. This package provides a user-friendly interface for applying INLA and the SPDE approach to spatial modeling without requiring that the user has deep knowledge about SPDEs. See r-inla.org or Blangiardo and Cameletti (2015) and Krainski et al. (2017) for tutorials and examples. In particular, Moraga et al. (2017) is recommended for a description of how a model with point and areal data can be implemented in r-inla.

(a) Discretisation of catchments. We use a regular grid with 1 km spacing.
(b) Triangulation of the spatial domain. The red points are the locations of the precipitation gauges.
Figure 2:

5 Case study of annual runoff in Voss

The model presented in Section 4 is utilized to explore the predictability of annual runoff in the Voss area. The predictability is investigated through four tests that are presented in the following subsection. The tests are created to gain knowledge about the runoff generating process around Voss and are inspired by common applications in hydrology. In Section 5.2 the results from the tests are presented and discussed.

5.1 Model evaluation

To explore the predictability of runoff around Voss, we compare three observation designs: An observation design where only point referenced observations are included in the likelihood (), an observation design where only areal referenced observations are included in the likelihood () and an observation design where all available observations are included in the likelihood (). Utilizing only areal observations () corresponds to what is typically done in hydrological applications, and we want to explore if we can improve the predictability of runoff by also including point observations in the likelihood (). Including as an observation design gives insight into what influence the point data have on the predictions. The three observation designs are evaluated according to four tests that are described as follows:

T1 - Inference: The model from Section 4 is fitted to all available observations between 1988 and 1997. In the likelihood there are observations of runoff from five catchments and/or observations of evaporation and precipitation from 15 gauges from the hydrological years 1988-1997, i.e the number of point observations is 15, the number of areal observations is 5 and the number of replicates is 10. The test is performed to gain knowledge about how , and affect the posterior estimates of the parameters.

T2 -Spatial predictions in ungauged catchments: In hydrologic applications, the main interest is in estimating runoff at catchment level, not at point level. Motivated by this, we perform spatial predictions of annual runoff in each of the five catchments in the dataset. The data from the target catchment is left out, i.e the catchment of interest is treated as ungauged which is the typical setting in hydrology. Runoff predictions are done for the target catchment for 1988-1997 and are based on observations from the remaining 4 catchments and/or point data from 1988-1997, i.e we do predictions in space, not in time. The predictive performance is assessed by computing the RMSE and CRPS for each catchment based on the 10 years of predictions, where the posterior mean runoff is used as point predictor for the RMSE-calculations. We use the average RMSE or CRPS over all catchments as a summary score.

T3u- Future predictions in ungauged catchments: In T2 we gain knowledge about the runoff that has been generated in ungauged catchments in the past. However, gaining knowledge about the annual runoff we can expect in the future is more interesting for most hydrological applications. In T3u we therefore estimate annual runoff for a future year, i.e for a year for which there are no observations of runoff, precipitation or evaporation. For an unobserved year () the posterior means of the year specific effects and are zero. Thus, the posterior predicted future runoff is given by the posterior means of the climatic effects and . However, all model components contribute to the predictive uncertainty.

In T3u the catchment of interest is treated as ungauged and left out of the dataset, and we use the remaining observations from 1988-1997 to predict annual runoff for 1998-2014 in . The predictive performance is evaluated by computing the RMSE and CRPS based on predictions of runoff in all 5 catchments for 17 future years.

T3g: Future predictions in partially gauged catchments: We predict annual runoff in for a future year as in T3u. However, we allow the observation likelihood to contain 1 to 10 years of runoff observations from the catchment in which we want to predict runoff. This way, we explore the model’s ability to exploit short records of runoff.

The test is carried out by drawing observations between 1988-1997 randomly from the target catchment. Next, these observation are used together with the other point and/or areal observations of , and from 1988-1997 to predict the annual runoff in 1998-2014 for this particular catchment. The experiment is repeated 10 times such that we include different years of observations from the target catchment from one experiment to the next. The predictive performance is evaluated by computing the RMSE and CRPS based on predictions of runoff in for the 17 future years, and we report the average over the 10 randomly drawn configurations of years. The above procedure is carried out with increasing number of data in the short record, i.e. for .

5.2 Results from the case study

(a) Posterior mean [m/year].
(b) Posterior standard deviation [m/year].
Figure 3: Posterior mean and standard deviation for annual runoff for a future unobserved year when all available observations of point observations (, left), areal (, middle) and both point and areal observations (, right) from 1988-1997 are used, i.e all catchments are treated as gauged for and , but ungauged for (test T1).

Table 1 shows the posterior medians and the 0.025 and 0.975 quantiles for the hyperparameters for (point observations), (areal observations) and (point and areal observations) when all respective available observations from 1988-1997 are utilized to make inference (T1). In general, gives lower runoff values with posterior median of the climatic intercept equal to 1.87 m/year compared to giving equal to m/year. Furthermore, the posterior median of the marginal standard deviation of the climatic GRF is considerably larger for with m/year compared to and which give posterior medians 0.63 m/year and 0.76 m/year respectively. The posterior median of the range of the climatic GRF is also larger for with 70.1 km compared to values around 20 km for and .

The spatial pattern of the posterior mean and posterior standard deviation corresponding to these parameter values is shown in Figure 3. These figures show the posterior mean and standard deviation for runoff for an unobserved, future year. We see that larger values for and lead to a more prominent spatial pattern for with large runoff values in the western part of the study area and lower values in the eastern part. A large climatic range also leads to a lowering of the posterior predictive uncertainty in a larger part of the study area for , as can be seen in Figure 2(b). The maps show that the choice of observation scheme has a large impact on the resulting predictions of annual runoff in terms of posterior mean and/or posterior standard deviation.

Parameter Posterior median (0.025 quantile, 0.975 quantile)
P A P+A
236.4 (148, 380) 104.5 (32.4,262.4) 101.9 (40.9,248.5)
0.27 (0.20, 0.38) 0.34 (0.18,0.56) 0.29 (0.19, 0.44)
70.1 (29.7,180.4) 25.1 (8.8, 73.8) 20.1 (9.1, 45.6)
0.97 (0.56,1.79) 0.63 (0.34,1.34) 0.76 (0.53, 1.1)
1.87 (1.1, 2.7) 2.21 (1.6, 2.8) 1.96 (1.4, 2.5)
4.3 (3.0, 6.2) 7.5 (3.2, 17.9)
0.079 (0.036, 0.18) 0.04 (0.02, 0.07)
19.3 (2.1, 479.1) 3.0 (0.83, 8.3) 4.8 (1.5, 14.2)
Table 1: Posterior median (0.025 quantile, 0.975 quantile) when all available point (P), areal (A) and both point and areal (P+A) referenced observations from 1988-1997 are utilized for making inference (test T1).

In T2 we perform spatial predictions of annual runoff in 1988-1997 in a catchment that is left out of the dataset. The predictive performance for spatial predictions is summarized in Figure 4. For four out of five catchments, gives the lowest RMSE and CRPS, or a RMSE and CRPS that is approximately as for or .

(a) RMSE.
(b) CRPS.
Figure 4: Predictive performance for spatial predictions of runoff in 1988-1997 when the target catchment is treated as ungauged (test T2). Dashed lines mark the averages performance over all catchments.

In particular, Catchment 3 is an interesting case. For Catchment 3, gives an RMSE around 0.9, while gives an RMSE around 0.4. The point observations close to Catchment 3 have mean values lower than the mean annual runoff in Catchment 3 and lead to underestimation of annual runoff as we can see in Figure 5. Also the areal observations lead to an underestimation of the annual runoff. Intuitively, we would expect that combining and for this catchment also would lead to predictions that underestimate the true observed annual runoff. Instead, we get a large improvement in the predictions when and are combined, with an RMSE around 0.1. The predictions for Catchment 3 also turn out to be larger than any of the nearby observations.

The result can be explained by the nested structure of the catchments in the dataset. Catchment 4 and Catchment 5 cover Catchments 3 and put constraints on the total runoff in this area. As Figure 1 shows, there are two precipitation gauges inside Catchment 5 for which the point runoff generated is lower than the mean annual runoff in the surrounding two catchments. To preserve the water balance, the predicted annual runoff in the remaining parts of Catchment 4 and Catchment 5 has to be larger than any of the values that are observed in the surrounding area. This interaction between nested areal observations and point observations makes the model able to correctly identify Catchment 3 as a wetter catchment than any of the nearby catchments, and shows that we have a geostatistical model that does more than smoothing.

                                                                

Figure 5: The posterior mean for spatial predictions in Catchment 3 (test T2) with corresponding 95 posterior prediction intervals for observation design (left), (middle) and (right).

This does not mean that the interaction between point and areal observations always lead to improved predictions (see e.g Catchment 5 in Figure 4). However, overall the results in Figure 4 show that on average we benefit from utilizing all available data () in the analysis when making spatial predictions, and that utilizing only point observations provides poor predictions. performs considerably worse than and for three of the catchments (Catchment 3, 4 and 5)

                                                                         

Figure 6: The posterior mean for spatial predictions in ungauged catchments for (left), (middle) and (right) compared to the corresponding observed value (T2).

The scatterplots in Figure 6 compare the spatial predictions from 1988-1997 (T2) to the actual observations for each (ungauged) catchment. Overall, observation design and provide predictions that are symmetric around the corresponding observed runoff. However, if we look more closely at the predictions for each catchment, we see that and tend to either overestimate or underestimate the annual runoff within a catchment. This is seen most clearly for Catchment 1 where the annual runoff is overestimated for and , and for Catchment 2 where the runoff is underestimated for . The results shows that the same systematic prediction error typically is done each year for a specific catchment. This is also seen for and in Figure 5. The biases are however small enough that the actual observations are covered by the corresponding posterior prediction intervals for and for most catchments. This can be seen in Table 2.

For the situation is different: Figure 6 shows that the annual runoff is underestimated for all catchments. In addition, the posterior standard deviation for runoff is typically unrealistically small for contributing to narrow posterior prediction intervals. Large biases combined with small posterior standard deviations lead to low coverage for the spatial predictions for , and on average the coverage of a posterior prediction interval is as low as . For , neither the posterior mean nor the posterior variance reflect the properties of the underlying process.

All
1 0.5 0.5 0.1 0 0.42
1 0.7 1 1 1 0.94
1 0.7 1 1 0.30 0.80
Table 2: The proportion of the observations that falls into the corresponding posterior prediction interval for spatial predictions of runoff (T2) in catchment for 1988-1997 when the target catchment is treated as ungauged.

In tests T3g and T3u annual runoff was predicted for unobserved future years (1998-2014) when 0-10 observations from the target catchment from 1987-1998 were included in the likelihood in addition to observations of and/or from other locations and catchments. The resulting predictive performance is visualized in Figure 7. As for the spatial predictions, gives the lowest RMSE and CRPS on average. For ungauged catchments (0 years of observations from the target catchment are utilized), and perform considerably worse than . However, when we include some years of observations from the target catchment, we see a large drop in the RMSE and CRPS for and . The posterior mean for a future year is given by the posterior mean of , i.e the plots show that we get a large change in the climatic part of the model when we include information from a new location or catchment. This indicates that the climatic part of the model has a large impact on the resulting predictions. Only a couple of observations from the target catchment was needed in order to make a large improvement in the predictive performance.

This result can be understood from the results of the parameter estimation in T1. Most of the spatial variability can be explained by the climatic GRF as the posterior median of its marginal standard deviation is approximately twice as large as the median of the marginal standard deviation for the annual GRF for all observation designs (Table 1).

The consequence of having a strong climate, is that an accurate prediction of annual runoff is characterized by an accurate estimation of , and if we lack important information about in parts of the spatial domain, systematic errors in the predictions are likely to occur for catchments in these areas every year. Such systematic errors were seen earlier in the results from the spatial predictions in Figure 5 and Figure 6. The model tends to either over- or underestimate the annual runoff which is a direct consequence of the climatic effect’s dominance over the annual effect. Another consequence of a strong climate, is that adding new data points to the observation likelihood can contribute to a large change or improvement in the resulting predictions as shown in Figure 7.

(a) RMSE.
(b) CRPS.
Figure 7: Predictive performance for future runoff (1998-2014) in catchment when 0-10 years of observations from the target catchment between 1988 and 1997 are included in the observation likelihood together with other observations of and/or (test T3).

6 Simulation study

For the real dataset, there is only one realization of the climate and the results indicated that this was the dominating part of the model. The strong climate can explain the systematic under- and overestimation that occurred for the runoff predictions in the case study. However, mismatch between the data and the model can be another explanation. In order to investigate this further, a simulation study was carried out. In the simulation study, data is simulated from the suggested model, and we explore if systematic under- and overestimation occur also for simulated data.

6.1 Set-up

In the simulation study, we simulate from the model described in Section 4 with , , , , and . These values are similar to the posterior medians in Table 1 for . Furthermore, we assume that the measurement errors of the point observations are normal distributed with standard deviation of the corresponding observed value. The measurement errors of the areal observations are normal distributed with standard deviation of the corresponding observation. Annual runoff was simulated for the points and the areas in Figure 1 such that we have the same setting as for the case study. In total 100 dataset were generated, i.e we have 100 climates, and each dataset has 10 years of observations as for the Voss dataset (1988-1997).

To explore the predictability of annual runoff for different climates, test T2 described in Section 5.1 was repeated for simulated data, i.e spatial predictions of annual runoff were performed for 10 years (1988-1997) in a catchment that was left out of the dataset. This was done for each of the five catchments in our study area for all 100 simulated climates.

6.2 Results from the simulation study

(a) RMSE.
(b) CRPS.
Figure 8: Predictive performance for spatial predictions of runoff in ungauged catchments for 1988-1997 for simulated data when the target catchment is treated as ungauged (T2). Dashed lines show the averages over all catchments.

The RMSE and CRPS for the spatial predictions (1998-1997) in five catchments and 100 climates are summarized in Figure 8. The results are as expected for simulated data: For all catchments performs best on average. performs better than , possibly because there are more point observations available (15 each year) than areal observations (4 each year). The coverage of the posterior prediction intervals are also as expected as shown in Table 3, and the predictions were unbiased with respect to the true runoff value. Over 100 different climates the model performs as it should asymptotically.

All
P 0.97 0.89 0.97 0.90 0.89 0.92
A 0.93 0.93 0.90 0.93 0.91 0.92
P+A 0.96 0.89 0.97 0.93 0.89 0.93
Table 3: The proportion of the true runoff values that falls into the corresponding posterior prediction interval for spatial predictions of runoff (T2) for simulated data.

Now, assume that we consider spatial predictions in one of the five catchments for one specific simulated climate. The fact that the spatial predictions are unbiased over 100 different climates doesn’t necessarily mean that the predictions are unbiased within a specific climate and catchment. In T2 we perform 10 runoff predictions (1988-1997) for each of the five catchments in the dataset, where the target catchment is treated as ungauged. The resulting posterior distributions give 95 % posterior prediction intervals for the 10 years of runoff predictions. The intuitive interpretation of these posterior prediction intervals is that conditional on the data and the model, there is a 95 probability that the simulated true runoff value is covered by its 95

posterior prediction interval. If this assumption is valid, we are dealing with a binomial distribution, i.e the probability that

out of 10 of the simulated true values are inside the corresponding posterior prediction interval for a catchment is binomial with =10 and .

(a) Ungauged catchment: The target catchment is entirely left out of the dataset.
(b) Partially gauged catchment: One observation from the target catchment is included in the likelihood.
Figure 9: The probability that 0-10 of the true values of annual runoff from 1988-1997 are inside the corresponding posterior prediction interval for spatial predictions within a specific catchment and climate (T2), estimated based on predictions in 5 catchments for 100 simulated climates (i.e 500 data points). Both point and area observations are utilized (), but similar results are obtained for and . The computed probabilities are compared to the binomial distribution with =10 and =0.95.

To gain knowledge about the predictability of annual runoff, we computed the proportion of times out of 10 of the simulated true values were inside the corresponding 95 posterior prediction interval for the 5 catchments and 100 climates. These empirical probabilities were compared to the binomial distribution and the results are shown in Figure 8(a). The results show clearly that the binomial assumptions are not valid for this setting. Considering the binomial distribution, the probability that all 10 runoff values are inside the corresponding posterior prediction interval is equal to 59.9 %. For the simulation study, this probability was as high as 82 % for . Further, the probability that 3 values or fewer are inside the prediction interval is close to zero for the binomial distribution. This was far more probable for the simulation study, giving 2.6 % 7.2 % and 3.6 % for , and respectively.

The empirical probabilities in Figure 8(a) indicate strong dependencies between the different years of predictions or observations: If we obtain an accurate prediction one year, we typically obtain accurate predictions for other years as well, and 10 out of 10 of the true runoff values are covered by the corresponding posterior prediction interval. Similarly, if we obtain a poor prediction one year for a specific catchment, we obtain poor predictions for all years and almost all values are outside the corresponding prediction interval.

To get an indication of the biases involved in the spatial predictions, we calculate the proportion of cases for which all 10 years of predictions for a particular climate and catchment were either below (underestimated) or above (overestimated) the true simulated runoff value. This turned out to be a common event and occurred for of the cases for , in of the cases for and in of the cases for . Thus, systematic over- and underestimation occur for the simulated data as for the real data in Section 5.2.

From a statistical point of view, the above results are intuitive: The large dependencies between years occur because the climatic effect that is common for all years dominates in the model. Different years of predictions for a particular catchment only differ by an annual effect that was shown to be small. This dependency structure make systematic underestimation or overestimation of runoff likely. The consequence of this result, is that the posterior prediction intervals for runoff in ungauged catchments should be interpreted with care in areas with a stable hydrological regime. If we consider all ungauged catchments on the globe, we can expect that of the true runoff values are inside the corresponding posterior prediction intervals on average, but if we consider predictions for individual catchments, a large proportion of the predictions will be biased in one direction or the other.

To further investigate the impact the climate has on the predictability of annual runoff, another experiment with spatial predictions was carried out. This time we perform test , but treat the target catchment as partially gauged: One single runoff observation from the target catchment from a random chosen year was included in the observation likelihood together with the other observations from and/or . The empirical probabilities that of the true runoff values were inside the corresponding 95 % posterior prediction interval was re-computed, and the results are shown in Figure 8(b).

Figure 8(b) shows that this time, the empirical probabilities are close to the binomial probabilities. A large reduction in prediction bias was also detected: The proportion of cases for which all 10 predictions for a particular climate and catchment were either below or above the true runoff value was found to be around for , and , a considerably smaller proportion than what was obtained when the catchments were treated as totally ungauged (around ).

The latter results show that in order to increase the predictability of runoff in an area with a stable hydrological regime as in Voss, there is much to gain from adding only one observation from the catchment of interest, a similar result to what we found in Figure 7 for the case study. Since the hydrological regime is stable over time, observing the target catchment for one year together with nearby catchments with long time series provides much information about the target catchment also for the other years. The binomial assumptions are more valid for the partially gauged case, meaning that the 95 posterior prediction intervals become easier to interpret for individual catchments when the catchments are partially gauged.

7 Discussion

The objective of this paper was to present a model for annual runoff that consistently combines data of different spatial support. A geostatistical model consisting of a climatic effect and an annual effect was presented, and we have demonstrated that we have a flexible model that can be used for for making runoff predictions in ungauged catchments, to gain knowledge about the annual runoff we can expect in the future and that is able to exploit short records of data.

The focus of the study was on exploring the model’s predictability, and there are three key findings: 1) On average we benefit from including all available observations in the likelihood, both point and areal data. performed better than and in terms of RMSE, CRPS and the coverage of the 95 posterior prediction intervals, both for the case study and for simulated data. 2) The suggested model is particularly suitable for modeling the nested structure of catchments. The case study showed that the model was able to identify Catchment 3 as a wetter catchment than any of the surrounding catchments and precipitation stations. This was a consequence of utilizing information from two overlapping catchments to constrain and distribute the annual runoff correctly. The interaction between the point and nested areal observations gives a geostatistical model that does more than smoothing. 3) The climate has a large influence on the predictive power of the suggested model. As the climatic effects are dominating over the annual effects, an accurate prediction of annual runoff implies an accurate representation of the underlying climate, and systematic biases typically occur if this representation is poor.

The fact that performed better than for most catchments around Voss, indicates that the point and areal observations of runoff were sufficiently compatible for most catchments, i.e that evaporation subtracted from precipitation was a valid approximation of point runoff. This interpretation of point runoff is reasonable in areas like Voss where the annual precipitation is considerably larger than the annual evaporation. The evaporation data are uncertain and should not make a large impact on the resulting predictions. In many areas of the world, the observed annual evaporation is more than 50 of the annual observed precipitation. In such areas, our framework could provide negative point observations and results that are hard to interpret.

Precipitation observations are often avoided as an information source when performing interpolation of runoff in hydrological applications, but the results presented here show that the point observations can contain valuable information when utilized together with areal observations. At least in data sparse areas with few streamflow observations. However, there is still room for improvement in the compatibility between the two observation types: The observation designs including only point observations provided a clear underestimation of annual runoff for most catchments in the case study. It was also seen that the spatial precipitation field was smoother than a spatial runoff field (Figure 2(a)), which is a typical result: The increase in spatial variability from precipitation to runoff is mainly explained by small scale variability introduced by the soils and vegetation (Skøien et al., 2003b). Consequently, if the point data are allowed to dominate over the areal data, the point data can result in a runoff field that is too smooth, which affects both the posterior mean and standard deviation negatively.

Furthermore, it is worth mentioning that all of the available precipitation gauges are located at a lower elevation than the mean elevation of the five catchments in the dataset. This is a common problem. Precipitation gauges are often located at low elevations, close to settlements where the gauges are easy to maintain. It is known that the amount of precipitation typically increases with elevation. Consequently, there is a lack of information about precipitation at high elevations in the data. Adding the fact that the precipitation gauges often fail to catch all precipitation, in particular snow combined with strong winds (Kochendorfer et al., 2017), essential information about the precipitation and runoff field could be lost. To solve the compatibility issues, elevation was considered as a covariate in a preliminary study (Ødegård, 2017), but this did not lead to significant improvements and the results are not included here.

Elevation is also known to be a factor that affects the spatial dependency structure of precipitation, and Voss is a mountainous area. The spatial range is typically larger in lowlands and decreases with elevation. A non-stationary model similar to the one presented in Ingebrigtsen et al. (2014) with a range and a marginal variance that changes with elevation could be considered. This can easily be implemented within the INLA-SPDE framework. However, in this case the dataset is small and the complexity of the spatial variability large. We also have a model with only one replication of the climatic spatial effect which was the dominating component. A non-stationary model would probably be too complicated and lead to identifiability issues (Ingebrigtsen et al., 2015).

Regardless of the increased complexity in an extended model, it is reasonable to believe that an accurate representation of climatic conditions is crucial when predicting annual runoff and other climate related variables. Systematic under- and overestimation of annual runoff will occur if we fail to characterize the underlying climate. In areas where most of the spatial variability can be explained by the climate, predictions from different years are strongly dependent as we in practice estimate the same underlying realization of the climate each year. This insight provides important knowledge about the interpretation of the posterior prediction intervals for runoff in ungauged catchments.

In spite of the large biases documented in this study, a strong climate also has positive consequences. In this article a model with a climatic component was suggested. The climatic component included a spatial effect that was common for all years of observations. The climatic component of the model made it relatively simple to exploit short records of data and the runoff predictions could easily be improved by including a few observations from the target catchment. Time series from several years are not needed, because one or two observations from a new catchment updates the climatic component that makes a large contribution to the final model. An alternative model with only annual components does not have this property. One new observation of annual runoff would only improve the predictions for that particular year, and not influence predictions for other years. For practitioners, a model with the described properties can be useful in situations where there exists one or few observations from a catchment of interest. This is a common setting for hydrological datasets. Short duration runoff observations are quite common, e.g. from planned short duration missions for water resources assessments, or from gauging stations that are closed after a revision of the gauging network. Large infrastructure projects measuring a few years of annual runoff for a relevant catchment is also achievable.

References

  • Beldring et al. (2003) S. Beldring, K. Engeland, L. A. Roald, N. R. Sælthun, and A. Voksø. Estimation of parameters in a distributed precipitation-runoff model for norway. Hydrology and Earth System Sciences, 7(3):304–316, 2003.
  • Blangiardo and Cameletti (2015) M. Blangiardo and M. Cameletti. Spatial and Spatio-temporal Bayesian Models with R-INLA. Wiley, 1st edition, 2015.
  • Brenner and Scott (2008) S. Brenner and L. Scott. The Mathematical Theory of Finite Element Methods, 3rd Edition. Vol. 15 of Texts in Applied Mathematics. Springer, 2008.
  • Cressie (1993) N. Cressie. Statistics for spatial data. J. Wiley & Sons, 1993.
  • Fiering (1963) M.B. Fiering. Use of correlation to improve estimates of the mean and variance. USGS Publications Warehouse, 1963.
  • Fuglstad et al. (2015) G.-A. Fuglstad, D. Simpson, F. Lindgren, and H. Rue. Interpretable Priors for Hyperparameters for Gaussian Random Fields, 2015.
  • Gelman et al. (2004) A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin. Bayesian Data Analysis. Chapman and Hall/CRC, 2nd ed. edition, 2004.
  • Gneiting and Raftery (2007) T. Gneiting and A. E. Raftery. Strictly Proper Scoring Rules, Prediction, and Estimation. Journal of the American Statistical Association, 102:359–378, 2007.
  • Gottschalk (1993) L. Gottschalk. Interpolation of runoff applying objective methods. Stochastic Hydrology and Hydraulics, 7:269–281, 1993.
  • Groisman and Legates (1994) P.Y. Groisman and D. R. Legates. The Accuracy of United States Precipitation Data. Bulletin of the American Meteorological Society, 75:215–227, 1994.
  • Guttorp and Gneiting (2006) P. Guttorp and T. Gneiting. Studies in the history of probability and statistics XLIX on the matérn correlation family. Biometrika, 93(4):989–995, 2006.
  • Ingebrigtsen et al. (2014) R. Ingebrigtsen, F. Lindgren, and I. Steinsland. Spatial models with explanatory variables in the dependence structure. Spatial Statistics, 8:20 – 38, 2014.
  • Ingebrigtsen et al. (2015) R. Ingebrigtsen, F. Lindgren, I. Steinsland, and S. Martino. Estimation of a non-stationary model for annual precipitation in southern Norway using replicates of the spatial field. Spatial Statistics, 14:338 – 364, 2015.
  • Kochendorfer et al. (2017) J. Kochendorfer, R. Rasmussen, M. Wolff, B. Baker, M. E. Hall, T. Meyers, S. Landolt, A. Jachcik, K. Isaksen, R. Brækkan, and R. Leeper. The quantification and correction of wind-induced precipitation measurement errors. Hydrology and Earth System Sciences, 21(4):1973–1989, 2017.
  • Krainski et al. (2017) E. Krainski, F. Lindgren, D. Simpson, and H. Rue. The R-INLA tutorial on SPDE models, 2017. URL http://www.r-inla.org/examples/tutorials/spde-tutorial.
  • Laaha and Blöschl (2005) G. Laaha and G. Blöschl. Low flow estimates from short stream flow records—a comparison of methods. Journal of Hydrology, 306(1):264 – 286, 2005.
  • Lindgren et al. (2011) F. Lindgren, H. Rue, and J. Lindström. An explicit link between Gaussian fields and Gaussian markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73:423–498, 2011.
  • (18) T.A McMahon, G. Laaha, J. Parajka, M.C Peel, H.H.G. Savenije, M. Sivapalan, J. Szolgay, S.E Thompson, A. Viglione, R.A Woods, and D. Yang. Runoff Prediction in Ungauged Basins: Synthesis across Processes, Places and Scales. Camebridge University press.
  • Merz and Blöschl (2005) R. Merz and G. Blöschl. Flood frequency regionalisation - spatial proximity vs. catchment attributes. Journal of Hydrology, 302:283 – 306, 2005.
  • Moraga et al. (2017) P. Moraga, S. M. Cramb, K. L. Mengersen, and M. Pagano. A geostatistical model for combined analysis of point-level and area-level data using INLA and SPDE. Spatial Statistics, 21:27 – 41, 2017.
  • (21) Q. Mu, F.A Heinsch, M. Zhao, and S.W Running. Remote Sensing of Environment, 111:519–536.
  • Neff (1977) E. L. Neff. How much rain does a rain gage gage? Journal of Hydrology, 35:213–220, November 1977.
  • Reitan and Petersen-Øverleir (2009) T. Reitan and A. Petersen-Øverleir. Bayesian methods for estimating multi-segment discharge rating curves. Stochastic Environmental Research and Risk Assessment, 23:627–642, 2009.
  • Rue and Held (2005) H. Rue and L. Held. Gaussian Markov Random Fields: Theory and Applications, volume 104 of Monographs on Statistics and Applied Probability. Chapman & Hall, London, 2005.
  • Rue et al. (2009) H. Rue, S. Martino, and N. Chopin. Approximate Bayesian inference for latent Gaussian models using integrated nested Laplace approximations, volume 71. 2009.
  • Sauquet et al. (2000) E. Sauquet, L. Gottschalk, and E. Lebois. Mapping average annual runoff: A hierarchical approach applying a stochastic interpolation scheme. Hydrological Sciences Journal, 45(6):799–815, 2000.
  • Simpson et al. (2017) D. Simpson, H. Rue, A. Riebler, T. G. Martins, and S. H. Sørbye. Penalising Model Component Complexity: A Principled, Practical Approach to Constructing Priors. Statistical Science, 32:1–28, 2017.
  • Skøien et al. (2006) J. O. Skøien, R. Merz, and G. Blöschl. Top-kriging - geostatistics on stream networks. Hydrology and Earth System Sciences Discussions, 10:277–287, 2006.
  • Skøien et al. (2003a) J. O. Skøien, G. Blöschl, and A. W. Western. Characteristic space scales and timescales in hydrology. Water Resources Research, 39, 2003a.
  • Skøien et al. (2003b) J. O. Skøien, G. Blöschl, and A. W. Western. Characteristic space scales and timescales in hydrology. Water Resources Research, 39(10), 2003b.
  • Stohl et al. (2008) A. Stohl, C. Forster, and H. Sodemann. Remote sources of water vapor forming precipitation on the Norwegian west coast at 60 °N - a tale of hurricanes and an atmospheric river. Journal of Geophysical Research: Atmospheres, 113(D5), 2008.
  • Viglione et al. (2013) A. Viglione, J. Parajka, M. Rogger, J L. Salinas, G. Laaha, M. Sivapalan, and G Blöschl. Comparative assessment of predictions in ungauged basins - Part 3: Runoff signatures in Austria. Hydrology and Earth System Sciences Discussions, 10:449–485, 2013.
  • Wang et al. (2018) C. Wang, M.A. Puhan, and R. Furrer. Generalized spatial fusion model framework for joint analysis of point and areal data. Spatial Statistics, 23:72 – 90, 2018.
  • Whittle (1954) P. Whittle. On stationary processes in the plane. Biometrika, 41:434–49, 1954.
  • Whittle (1963) P. Whittle. Stochastic processes in several dimensions. Bulletin of the International Statistical Institute, 40:974–994, 1963.
  • WMO (1992) WMO. International meteorological vocabulary. 1992.
  • Wolff et al. (2015) M.A Wolff, A. Petersen-Øverleir, K. Ødemark, T. Reitan, and R. Brækkan. Derivation of a new continuous adjustment function for correcting wind-induced loss of solid precipitation: Results of a Norwegian field study. Hydrology and Earth System Sciences, 19, 2015.
  • Zhang et al. (2009) K. Zhang, J. S. Kimball, Q. Mu, L. A. Jones, S. J. Goetz, and S. W. Running. Satellite based analysis of northern ET trends and associated changes in the regional water balance from 1983 to 2005. Journal of Hydrology, 379:92 – 110, 2009.
  • Zhang et al. (2010) K. Zhang, J. S. Kimball, R. R. Nemani, and S. W. Running. A continuous satellite-derived global record of land surface evapotranspiration from 1983 to 2006. Water Resources Research, 46(9), 2010.
  • Ødegård (2017) J. Ødegård. Spatial non-stationary models for precipitation with elevation in the dependency structure - a case study of annual precipitation in hordaland. Master thesis, Norwegian University of Science and Technology, 2017.