Stochastic Local Interaction Model with Sparse Precision Matrix for Space-Time Interpolation

02/20/2019 ∙ by Dionissios T. Hristopulos, et al. ∙ Technical University of Crete 0

The application of geostatistical and machine learning methods based on Gaussian processes to big space-time data is beset by the requirement for storing and numerically inverting large and dense covariance matrices. Computationally efficient representations of space-time correlations can be constructed using local models of conditional dependence which can reduce the computational load. We formulate a stochastic local interaction () model for regular and scattered space-time data that incorporates interactions within controlled space-time neighborhoods. The strength of the interaction and the size of the neighborhood are defined by means of kernel functions and respective local bandwidths. Compactly supported kernels lead to finite-size local neighborhoods and consequently to sparse precision matrices that admit explicit expression. Hence, the model's requirements for storage are modest and the costly covariance matrix inversion is not needed. We also derive a semi-explicit prediction equation and express the conditional variance of the prediction in terms of the diagonal of the precision matrix. For data on regular space-time lattices, the model is equivalent to a Gaussian Markov Random Field.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 12

page 18

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Space-time (ST) data are becoming available in overwhelming volumes and diverse forms due to the continuing growth of remote-sensing capabilities, the deployment of low-cost, ground-based sensor networks, as well as the increasing usage of sensors based on unmanned aerial vehicles, and crowdsourcing [1]. The ongoing data explosion has an impact in various fields of science and engineering. The modeling and processing of massive ST datasets poses conceptual, methodological, and technical challenges. Sufficiently flexible and computationally powerful solutions are not widely available to date, because most existing methods are not designed for global, high-volume, hyper-dimensional, heterogeneous and uncertain ST data. For example, classical geostatistical and machine learning methods [2, 3] are limited by the cubic dependence of the computational time on data size, which is prohibitive even for large purely spatial data.

The modeling and processing of ST data require more advanced methods and computational resources than those that are adequate for purely spatial data. For example, theories that simply extend spatial statistics by adding a separable time dimension are often inadequate for capturing realistic correlations and for analyzing massive ST data [4, 5]. Various methods have been proposed for developing non-separable covariance models [6, 7, 8]. Current methods, whether they are based on geostatistics [2], spatio-temporal statistics [9, 5], or machine learning [3] face serious scalability problems. A prevailing obstacle in the processing chain is the computationally demanding iterated inversion of large covariance (Gram) matrices [3, 10]. Hence, classical methods executed on standard desktop computers are limited to datasets with size

. Various approaches for alleviating the dimensionality problem (covariance tapering, composite likelihood, low-rank computations, stochastic partial differential equation representation, etc.) have been proposed and developed 

[10].

In the case of continuum random fields, Gaussian field theories of statistical physics provide models with local structure which is derived from the derivatives of the field [11]. Gaussian Markov random fields (GMRFs) also share the local property, since they are defined in terms of interactions that involve local neighborhoods [12]. Stochastic local interaction (SLI) models are inspired from GMRFs and Gaussian field theory. They are based on the idea that correlations are generated by interactions between neighboring sites and times. The interactions are incorporated in a precision matrix with simple parametric dependence.

We present a theoretical framework for the analysis of ST data that is based on stochastic local interaction (SLI) models [13, 14]. This formulation is useful for filling gaps by interpolation in ST datasets of environmental importance. For example, gaps in records of meteorological variables need to be reconstructed for the evaluation of renewable energy potential at candidate sites [15], while ground-based rainfall gauge networks often have missing data [16]. The main idea in SLI is that the ST correlations are determined by means of sparse precision matrices that only involve couplings between near neighbors (in the ST domain). In contrast with GMRF models that are typically defined on regular lattice data and field theories which are defined on continuum spaces, the SLI framework is suitable for direct application to scattered data and stochastic graph processes. However, it is also applicable to data on regular space-time lattices.

The remainder of this manuscript is structured as follows: In Section 2 we define the ST-SLI model and discuss its properties. In Section 3 we formulate ST prediction based on the SLI model, and in Section 4

we discuss parameter estimation from

ST data. Following the theoretical formulation, Section 5 presents an application of the SLI method to three datasets which involve simulated ST data, reanalysis temperature data, and atmospheric ozone measurements. Finally, we present our conclusions and a brief discussion in Section 6.

2 St Model based on Stochastic Local Interactions

A space-time scalar random field (STRF) where and

is defined as a mapping from the probability space

into the space of real numbers . For each ST coordinate , is a measurable function of , where is the state index [4]. The states (realizations) of the random field are real-valued functions obtained for a specific . In the following, the state index is dropped to simplify notation.

We focus on partially sampled realizations of the random field, where

is the sample size. The vector

comprises the field values at the ST point set . The point set is assumed to be quite general; it may represent a time sequence of lattice sites, randomly scattered points in space and time, or a collection of time series at random locations in space.

2.1 Energy of the exponential joint density

The SLI model is based on a joint pdf defined by the Boltzmann-Gibbsexponential distribution

(1)

where is an energy function that represents the “cost” of a specific configuration, is a vector of model parameters, and is the normalizing factor known as partition function.

We assume that satisfies the following properties for any vector and :

  1. Gaussianity: is a quadratic function of the data vector that can be expressed as

    (2)

    where is a vector of mean values and is the precision (or interaction) matrix that depends on the parameter vector . The vector incorporates trend functions and periodical variations.

  2. Positive-definiteness: for all that are not identically equal to zero. This is equivalent to the precision matrix being a positive-definite matrix.

  3. Sparseness: is a sparse matrix that incorporates the local interactions.

More specifically, we focus on the following SLI energy function which satisfies the properties of Gaussianity, positive-definiteness and sparseness:

(3)

Equation  (3) assumes for simplicity a uniform mean . The term represents a weighted average of the squared increments. However, instead of focusing on all pairs, the average defined below selects only pairs within a local neighborhood around each point .

The SLI parameter vector includes (a uniform mean value), (an overall scale parameter proportional to the variance), and (dimensionless factor that multiplies the contribution from the squares of the increments). The vector includes additional parameters that determine the local ST neighborhoods used in the average of the squared increments .

2.2 Kernel-based averaging

The weights in the average of the squared increments are defined by means of the Nadaraya-Watson equation [17, 18], i.e.,

(4)

The coefficients are defined in terms of compactly supported ST kernel functions , where for an ST kernel, for a spatial kernel, and for a temporal kernel. Kernel functions are symmetric, real-valued functions; herein they are assumed to take values in the interval without loss of generality. Moreover, we will assume spatially homogeneous and temporally stationary kernel functions, i.e., , , and . Furthermore, it will be assumed for simplicity that the kernel function depends only on the magnitude of the ST distance.

2.3 Definition of space-time distance

The space-time distance used in the kernel weights determines the structure of correlations that we impose in the space-time domain. Both separable and non-separable space-time metric distances are possible as discussed below.

Composite space-time distance: In this case the spatial and temporal coordinates are intertwined in the distance metric. For example, the differential of the space-time distance between two points using the Riemannian metric is

(5)

where

are the elements of the metric tensor

, , are the differentials of the spatial distance in the orthogonal directions, and is the time differential [19, 20].

In the Euclidean case the metric tensor , is given by

(6)

where is a parameter that controls the contribution of the time lag in the composite distance. The kernel coefficient based on the composite Euclidean metric can be expressed as

(7)

The space-time distance for a composite metric leads to an ellipsoidal neighborhood shown in the schematic of Fig. 0(a).

Separable space-time distance: The coefficients for a separable space-time neighborhood are defined as

(8)

In (8) is the spatial lag between the initial point and the target point , and is the local spatial bandwidth at . In addition, is the temporal lag between the initial and target times, and is the temporal bandwidth. The space-time distance for a separable space-time metric leads to cylindrical neighborhoods as shown in Fig. 0(b).

2.4 Definition of bandwidths

For each ST point , the spatial bandwidth is determined from the geometry of the sampling network around the spatial point , while the temporal bandwidth is based on the time neighborhood around . In general, this means that the number of bandwidth parameters scales linearly with the sampling size, leading to an under-determined estimation problem when the additional parameters are accounted for.

To simplify the bandwidth estimation we use a trick that reduces the dimensionality of the problem. We assign to each point a bandwidth which is proportional to the spatial distance between this point and its -nearest neighbor in the point set . Thus, it holds that , where typically , and is a dimensionless spatial bandwidth parameter to be estimated from the data.

In the case of the composite space-time distance the temporal bandwidths are determined from the and the additional parameter .

For a separable ST distance metric, the temporal bandwidths are determined by means of , where is the order of the temporal neighbor and is a dimensionless temporal bandwidth parameter. This definition of the temporal bandwidth in the case of uniform time step implies uniform bandwidths for all except the initial and final times, where the bandwidth is automatically increased to account for the missing left and right neighbors respectively.

(a) Composite
(b) Separable
Figure 1: Schematics of kernel-based neighborhoods for composite (left) and separable (right) space-time structures.

2.5 Properties of kernel weights

The kernel-average of the squared increments (4) can be expressed in terms of normalized weights as follows

(9a)
(9b)

Normalization: The definition (9b) of the kernel weights implies that

Asymmetry: The definition (9b) of the bandwidths is based on the local ST neighborhood. This implies that the spatial weights are in general asymmetric, i.e., if , since the sampling density around the point can be quite different than around the point .

Non-separability: The kernel weights are non-separable for both the composite and the separable ST distance metrics. In the first case this is obvious from the definition (7). In the second case, even though the are separable, the normalized weights are non-separable functions of space and time due to the kernel summation in the denominator of (4).

Robustness with respect to general distance metrics: Regardless of the distance metric used, the kernel-based weights are non-negative. This implies that the SLI energy function (3) is positive, and consequently the precision matrix is positive definite. Hence, general distance metrics, e.g., Manhattan (also known as city block and taxicab) distance, can be used in the SLI model.

In the following we develop the SLI formalism for a separable space-time metric structure.

2.6 Squared increments for separable space-time metric

In this section we formulate the average squared increments for separable space-time kernel functions using matrix operations.

First, we define the square kernel matrices of dimension and of dimension as follows

(10a)
(10b)

Then, the matrix of ST kernel weights is given by the following Kronecker product (denoted by ):

(10c)

For compactly supported kernel functions the matrix given by (10c) is sparse.

The matrix of the normalized kernel weights is then defined by means of

(11a)
where the denominator represents the entry-wise norm of the matrix and is given by
(11b)

In terms of the above matrices, the average squared increment (9) is expressed as follows

(12)

where is the vector of ones, and denotes the Hadamard product, i.e., .

The computational complexity of the operations in (12) is , if the sparsity of the matrix is not taken into account. However, the numerical complexity can be improved using sparse-matrix operations.

2.7 Precision matrix formulation

The SLI energy function (3) can be transformed into a quadratic energy functional, i.e., of the form of equation (2), as follows

(13)

where is the identity matrix: if and otherwise. The matrix is derived from the average squared increments (9), and . It is thus expressed in terms of the normalized weights according to

(14)

where the normalized weights are given by (9b). Hence, the precision matrix is determined by the sampling pattern, the kernel functions, and the bandwidths.

3 St Prediction

In this section we consider ST prediction by means of the SLI model at the set of space time points , where , assuming that the model parameters are known. It is further assumed that the sets and are disjoint. For example, the set could comprise all the nodes of a regular map grid at a time instant for which measurements are not available. Alternatively, could comprise all the nodes of an irregular spatial sampling network at a time instant with no measurements.

3.1 Sli energy function including prediction set

The SLI energy function that incorporates the prediction sites is given by straightforward extension of (3). Thus, the following expression that involves block vectors of sampling and prediction sites and respective precision block matrices is obtained

(15)

where is the detrended data vector, is the fluctuation vector at the prediction points, and is the estimate of the parameter vector based on the data. Let the sets denote either of the disjoint sets or . Then, the block precision matrices are expressed as

(16)

The block sub-matrices are defined as follows:

(17a)
(17b)
(17c)
(17d)
(17e)
(17f)

3.2 Prediction based on stationary point of the energy

The Boltzmann-Gibbs pdf of the field at the prediction sites conditional on the data is given by . The prediction maximizes the pdf, which is equivalent to minimizing the energy, i.e.,

(18)

The SLI energy (15) can be further expressed in terms of the precision matrix as follows

where depends only on the data and is thus irrelevant for the prediction. The condition for a stationary point of the energy function is

(19)

The Hessian of the energy is , where the prime denotes differentiation with respect to . For the stationary point to represent a minimum of the energy (and thus a maximum of the Boltzmann-Gibbs pdf), must be positive definite. From (19) it follows that . Since the SLI precision matrix is positive definite by construction, so is the Hessian as well.

Finally, the SLI prediction is given by the following equation

(20)

where is the diagonal unit matrix and the precision matrices and are defined by means of (16) and (17c)-(17f).

Note that due to the matrix product and in light of (16) the SLI prediction is independent of the scale parameter . This property is analogous to the independence of the kriging prediction on the variance, since the scale parameter sets the magnitude of the variance.

3.3 Prediction intervals

Since the precision matrix of the SLI model is known, it is straightforward to obtain the conditional variance at the prediction sites using the result known in Markov random field theory [12]. Hence,

(21)

where is the -th diagonal entry of the precision matrix which is determined from (16) and (17f).

Based on the above, prediction intervals at the site can be constructed as follows

where is a specified level (e.g., ), and

is the respective quantile of the standard normal distribution.

4 Parameter Estimation

For the estimation of the SLI model parameters we use leave-one-out (LOO) cross validation. The SLI parameter vector is defined as follows

The orders of the spatial and temporal neighbors and are set in advance to a low integer value larger than one. This does not have a serious impact on the results, since the bandwidth parameters compensate for the choice of the neighbor order.

The SLI model parameters that are estimated by LOO cross validation involve the reduced parameter vector

(22)

The cost function optimized with respect to the parameters is the root mean square error (RMSE) of the predictions. The latter are based on the SLI prediction equation (20); for each sampling point the prediction is based on the remaining points. Hence, the RMSE is given by

(23)

where denotes the data vector from which the point is removed. Note that the dependence of the cost function on the bandwidth controlling parameters and is highly nonlinear, due to the presence of the latter in kernel function sums both in the numerator and the denominator of the normalized kernel weights (11a).

The optimal parameter vector is then determined by means of

Finally, the value of the scale parameter is selected to set equal to the value that maximizes the likelihood of the SLI model given the other parameters, i.e., [13]

(24)

where

5 Application to Data

We investigate SLI-based interpolation for synthetic (simulated) data, reanalysis ST data (temperature in degrees Celsius), and ozone measurements over France. The data are used to provide proof of concept for the ST-SLI method.

5.1 Synthetic data

We generate an ST realization from a stationary random field with mean and variance using the R package “RandomFields” [21, 22]. The random field has an exponential covariance model with correlation lengths and in space and time respectively. The realization is sampled at random locations over a spatial domain at times, i.e., , for where . The resulting time series at 25 spatial locations are shown in Fig. 1(a), while the spatial configurations for the first 16 time slices are shown in Fig. 1(b).

(a) Time series
(b) Time slices
Figure 2: Time series of synthetic data (left) at 25 locations and spatial linear-interpolation maps for the first 16 time slices (right) . The open circles represent the sampling locations. Each time slice corresponds to one simulated day.

5.1.1 Sli model estimation

The orders of the spatial and temporal neighbors are set to . The initial values for the reduced parameter set are given by

(25)

The optimization is conducted in the Matlab programming environment using the constrained optimization function fmincon, which allows constraining the SLI parameters to positive values. The optimization parameters include a maximum of iterations and function evaluations, as well as tolerance equal to for the cost function and changes in the optimization variables. The optimization is run using the default interior-point method and terminates after 20 iterations at a local minimum. The value of the cost function (root mean square error) for the optimal SLI parameters is . The values of the optimal SLI parameters are listed in Table 1. The RMSE is , a value higher than the “optimal” value of obtained in parameter estimation. This is not surprising since parameter estimation was based on LOO cross validation, while the interpolation performance test involves removing an entire slice, i.e., points at a time.

The logarithmic absolute values of the precision matrix elements, evaluated with the optimal SLI parameters, are shown in Fig. 3

for a fraction of the domain. The non-zero matrix entries are shown as colored dots. The four diagonal bands (two above and two below the main diagonal) comprise nearest and next-nearest temporal neighbors. The sparsity of the precision matrix is evident in the plot; the sparsity index is

corresponding to 39 132 non-zero elements.

Figure 3: Logarithm of the absolute value of the SLI precision matrix for the synthetic ST data.
Initial values 10 300 3 2.5 N/A
Based on LOO CV 9.74 0.5009 1.012 430.02
Table 1: SLI parameters for the synthetic ST data based on leave-one-out cross validation. The scale parameter is determined from (24). The value of the cost functional (RMSE) at the optimum is equal to . An initial value for is not needed.

5.1.2 Sli model performance

To test the performance of the estimated SLI model we remove and subsequently predict all the values for one time slice using the sample values at the remaining time slices. We repeat this experiment by removing sequentially all the time slices, one at a time. The scatter plot of the predictions (for all points) versus the sample values is shown in Fig. 3(a) and exhibits overall good agreement between the two sets. The histogram plots of the predicted versus the sample values, shown in Fig. 3(b), demonstrate that the SLI predictions tend to cluster around the center of the distribution more than the sample values.

(a) Scatter plot
(b) Histogram
Figure 4: Left: scatter plot of the predictions versus the sample values for the synthetic space-time data. Right: histograms of the sample (yellow) and SLI-predicted (grey) values.

Various cross validation measures are presented in Table 2. ME stands for the mean error (bias), MAE is the mean absolute error, MARE is the mean absolute relative error (), RMSE is the root mean square error, RMSRE is the root mean square relative error (), is the linear correlation coefficient () and the Spearman correlation coefficient (also ). The validation measures indicate overall good performance of the SLI model with small bias and very good correlation . The RMSE is , a value higher than the “optimal” value of obtained during parameter estimation. This is not surprising since parameter estimation was based on LOO cross validation, while the present test involves removing an entire slice, i.e., points at a time.

ME MAE MARE RMSE RMSRE
0.0126 0.6383 0.0721 0.8011 0.1046 0.9387 0.9373
Table 2: Cross validation (CV) interpolation performance for the synthetic data. The CV measures are calculated by comparing the true values of each time slice (from 1 to 50) and the predicted values using the SLI model. The SLI predictions are based on time slices that exclude the predicted slice.

5.2 Reanalysis data

We use ERA5 reanalysis temperature data (degrees Celsius) downloaded from the Copernicus Climate Change Service [23]. The dataset includes 39 000 points that correspond to hourly values for five consecutive days(January 1-5, 2017) at the nodes of a spatial grid around the island of Crete (Greece) as shown in Fig 4(a). The average spatial resolution is degrees (grid cell size km). The data are displayed as time series in Fig. 4(b).

(a) ERA5 sites around Crete
(b) Temperature
Figure 5: Left: Spatial grid for the ERA5 temperature data (degrees Celsius) around the island of Crete (Greece). Right: Time series of the temperature at the ERA5 grid sites involved in the study.

The SLI parameter estimation and the performance assessment are carried out as in the example with the synthetic data. The parameter estimates are shown in Table 3. The precision matrix has a sparsity index , corresponding to non-zero entries out of entries. The estimated is very close to its initial value, which indicates that the optimization routine may be trapped in a local optimum. Nonetheless, this does not seem to severely impact the performance of the model as discussed below.

The scatter plot of the predictions (for all points) is shown in Fig. 5(a) and exhibits good agreement between the predictions and the data. The histogram plots of the predicted versus the sample values (Fig. 5(b)) also show that SLI predictions have lower dispersion than the sample values, as was the case for the synthetic data. The cross validation measures (obtained by sequentially removing each of the 120 hourly time slices) are shown in Table 4 and confirm the interpolation performance for the SLI model.

Initial values 13.5846 300 3 2.5 N/A
Based on LOO CV 13.5846 300.0009 0.6667 0.7119 4.448
Table 3: SLI parameters for the ERA5 temperature data based on leave-one-out cross validation. The scale parameter is determined from (24). The value of the cost functional (RMSE) at the optimum is equal to . An initial value for is not needed.
ME MAE MARE RMSE RMSRE
0.0003 0.097 0.0081 0.1660 0.0205 0.9971 0.9960
Table 4: Cross validation (CV) interpolation performance for the ERA5 temperature data. The CV measures are calculated by comparing the true temperature values of each hourly time slice (from 1 to 120) and the SLI predictions. The predictions are based on time slices that exclude the predicted slice.
(a) Scatter plot
(b) Histogram
Figure 6: Left: scatter plot of the predictions versus the sample values for the ERA5 temperature data. Right: histograms of the sample (yellow) and SLI-predicted (grey) values.

5.3 Ozone data

This dataset includes ozone (O3) hourly concentration data (measured in ) for five consecutive days (July 1-5, 2014), downloaded from the French GEOD’AIR database. The data are collected at 335 scattered stations distributed around France. The time series at 107 stations that reported data at all times are shown in Fig. 6(a). Linear interpolation maps for the first 16 time slices are shown in Fig. 6(b).

(a) Time series
(b) Time slices
Figure 7: Left: Time series of ozone O3 hourly data () at 107 French stations and spatial linear-interpolation maps for the first 16 time slices. Right: Linear interpolation maps of ozone concentration for different time slices (hours of the day). The red line indicates the convex hull of the station network.

The SLI parameter estimation and performance assessment are conducted as explained above. The parameter estimates are shown in Table 5. MARE and RMSRE are infinite because the dataset includes zero values. The precision matrix has a sparsity index , i.e., it includes non-zero entries out of entries.

The scatter plot of the predictions (for all points) versus the sample values is shown in Fig. 7(a) and exhibits overall good agreement between the data and the predictions. The histogram plots of the predicted versus the sample values, shown in Fig. 7(b) also show that SLI predictions have lower dispersion than the sample values as in the case of synthetic data. The cross validation performance measures (obtained by sequentially removing each of the 120 hourly time slices) are shown in Table 6, and they demonstrate very good interpolation performance for the SLI model. Note that , which implies that the spatial bandwidth is very small. On the other hand, and imply that the temporal bandwidth is where hour; for the first and last time slices the bandwidth is . These results imply that the SLI model predictions for this dataset are essentially based on nearest-neighbor (forward and backward) temporal correlations.

Initial values 63.2961 5 3 2.5 N/A
Based on LOO CV 62.6511 622.2286 0.6604 0.0257 2.5293
Table 5: SLI parameters for the French hourly ozone concentration data () based on leave-one-out cross validation. The scale parameter is determined from (24). The value of the cost functional (RMSE) at the optimum is equal to . An initial value for is not necessary.
ME MAE MARE RMSE RMSRE
0.0091 3.9316 Inf 5.5799 Inf 0.9841 0.9837
Table 6: Cross validation (CV) interpolation performance measures for the ozone concentration values. The CV measures compare the true ozone concentration values () of each hourly time slice (from 1 to 120) with the SLI predictions. The latter are based on time slices that exclude the predicted slice.
(a) Scatter plot
(b) Histogram
Figure 8: Left: scatter plot of the predictions versus the sample values for the French hourly ozone data. Right: histograms of the sample (yellow) and SLI-predicted (grey) values.

6 Discussion and Conclusions

We present a theoretical framework for constructing ST

models based on exponential Boltzmann-Gibbs joint probability density functions. The

ST-SLI model presented herein exploits an energy function with local interactions which imposes sparse structure on the precision matrix. The local interactions are implemented by means of compactly supported kernel functions that compensate for the lack of a structured lattice. However, the model is also applicable to regular lattice data. In this case the SLI model is equivalent to a Gauss Markov random field with a specific precision matrix structure. To our knowledge, this structure that involves kernel-matrix weights has not been used before in models of real-valued ST data. The ST-SLI model extends the purely spatial SLI model [13] to the space-time domain. The SLI approach shares the reliance on kernel functions with kernel-based reconstruction of graph signals [24]. In the case of environmental ST data, the graph topology is not given a priori, but it is determined from the data by optimizing a specified cost function; herein we use the RMSE of the SLI predictions in the leave-one-out cross validation framework.

The SLI model presented features a Gaussian energy functional with a sparse precision matrix. Explicit expressions are given for ST prediction and the conditional variance at the prediction sites. The sparse precision matrix representation allows computationally efficient implementation of parameter estimation and prediction procedures. The computational efficiency stems from the fact that in SLI it is not necessary to store and invert large and dense covariance matrices.

We based the SLI parameter estimation on the leave-one-out cross validation (LOO-CV) approach by taking advantage of the semi-explicit prediction equation (20). The root mean square error of the prediction was used as the LOO-CV cost function. The optimization was based on the interior-point algorithm which terminates at local minima of the cost function. The landscape of the cost function should be further investigated in order to understand the patterns of local minima. It is also possible to run a global optimization algorithm to search for the global optimum of the cost function. On the other hand, experience with the purely spatial SLI model [13] shows that local minima of the cost function provide parameter estimates that are sufficient for interpolation purposes.

The formulation presented herein can be extended to multivariate random fields by judicial selection of the energy function. In addition, it is possible to include anisotropic and more general (e.g., geodesic) spatial distance metrics in the kernel functions, periodic patterns (in space and in time) by adding shifted averaged squared increments, and spatial dependence of the coefficients and . Such extensions will enhance the flexibility of the SLI model at the cost of some loss in computational efficiency.

The SLI model was applied herein without attempting to model temporal or spatial trend functions or periodic variations. For example, the ozone data exhibit a cyclical behavior in the time domain which was not explicitly modeled. The temperature reanalysis data also exhibit an increasing trend over the time period that was studied. Such features can definitely be taken into account and are expected to further improve the performance of the model. The simplest approach is to subtract trends and periodic functions from the data and apply the SLI model to the residuals.

Acknowledgments

This work was funded by the Operational Program “Competitiveness, Entrepreneurship and Innovation 2014-2020” (co-funded by the European Regional Development Fund) and managed by the General Secretariat of Research and Technology, Ministry of Education, Research and Religious Affairs under the project DES2iRES (T3EPA-00017) of the ERAnet, ERANETMED_NEXUS-14-049. This support is gratefully acknowledged.

We also thank Prof. Valerie Monbet (Université de Rennes) for suggesting and helping with the acquisition of the ERA5 reanalysis data. Finally, we thank Dr. Denis Allard (INRA) for bringing to our attention the French ozone data collected by the Laboratoire Central de Surveillance de la Qualité de l’Air.

References

References

  • [1] National Research Council et al. Frontiers in Massive Data Analysis. National Academies Press, Washington, DC, 2013.
  • [2] J. P. Chilès and P. Delfiner. Geostatistics: Modeling Spatial Uncertainty. Wiley, New York, 2nd edition, 2012.
  • [3] C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, Massachusetts Institute of Technology, 2006.
  • [4] G. Christakos. Random Field Models in Earth Sciences. Academic Press, San Diego, 1992.
  • [5] N. Cressie and C. L. Wikle. Statistics for Spatio-temporal Data. John Wiley and Sons, New York, 2011.
  • [6] S. De Iaco, D. E. Myers, and D Posa. Nonseparable space-time covariance models: some parametric families. Mathematical Geology, 34(1):23–42, 2002.
  • [7] A. Kolovos, G. Christakos, D.T. Hristopulos, and M. L. Serre. Methods for generating non-separable spatiotemporal covariance models with potential environmental applications. Advances in Water Resources, 27(8):815–830, 2004.
  • [8] E. A. Varouchakis and D. T. Hristopulos. Comparison of spatiotemporal variogram functions based on a sparse dataset of groundwater level variations. Spatial Statistics, 2017.
  • [9] T. Gneiting, M. G. Genton, and P. Guttorp. Geostatistical space-time models, stationarity, separability, and full symmetry. In B. Finkelstádt, L. Held, and V. Isham, editors, Statistical Methods for Spatio-Temporal Systems, volume 107, pages 151–175. Chapman & Hall, 2006.
  • [10] Y. Sun, B. Li, and M. G. Genton. Geostatistics for large datasets. In E. Porcu, J.–M. Montero, and M. Schlather, editors, Advances and Challenges in Space-time Modelling of Natural Events, Lecture Notes in Statistics, pages 55–77. Springer Berlin Heidelberg, 2012.
  • [11] G. Mussardo. Statistical Field Theory. Oxford University Press, Oxford, 2010.
  • [12] H. Rue and L. Held. Gaussian Markov Random Fields: Theory and Applications. Chapman and Hall/CRC, Boca Raton, FL, 2005.
  • [13] D. T. Hristopulos. Covariance functions motivated by spatial random field models with local interactions. Stochastic Environmental Research and Risk Assessment, 29:739––754, 2015.
  • [14] D. T. Hristopulos and I. C. Tsantili. Space–time covariance functions based on linear response theory and the turning bands method. Spatial Statistics, 22:321–337, 2017.
  • [15] E. Koutroulis and D. Kolokotsa. Design optimization of desalination systems power-supplied by PV and W/G energy sources. Desalination, 258(1-3):171–181, 2010.
  • [16] A. Bárdossy and G. Pegram. Infilling missing precipitation records–a comparison of a new copula-based method with other techniques. Journal of hydrology, 519:1162–1170, 2014.
  • [17] E. A. Nadaraya. On estimating regression. Theory of Probability and its Applications, 9(1):141–142, 1964.
  • [18] G. S. Watson.

    Smooth regression analysis.

    Sankhya Ser. A, 26(1):359–372, 1964.
  • [19] G. Christakos, D. T. Hristopulos, and P. Bogaert. On the physical geometry concept at the basis of space/time geostatistical hydrology. Advances in Water Resources, 23(8):799–810, 2000.
  • [20] G. Christakos. Spatiotemporal Random Fields: Theory and Applications. Elsevier, Amsterdam, Netherlands, 2017.
  • [21] M. Schlather, A. Malinowski, M. Oesting, D. Boecker, K. Strokorb, S. Engelke, J. Martini, F. Ballani, O. Moreva, J. Auel, P. J. Menck, S. Gross, U. Ober, C. Berreth, K. Burmeister, J. Manitz, P. Ribeiro, R. Singleton, B. Pfaff, and R Core Team. RandomFields: Simulation and Analysis of Random Fields, 2017. R package version 3.1.50.
  • [22] M. Schlather, A. Malinowski, P. J. Menck, M. Oesting, and K. Strokorb. Analysis, simulation and prediction of multivariate random fields with package RandomFields. Journal of Statistical Software, 63(8):1–25, 2015.
  • [23] Copernicus Climate Change Service C3S. ERA5: fifth generation of ECMWF atmospheric reanalyses of the global climate, 2018. Data retrieved from: https://cds.climate.copernicus.eu/cdsapp#!/home.
  • [24] D. Romero, M. Ma, and G. B. Giannakis. Kernel-based reconstruction of graph signals. IEEE Transactions on Signal Processing, 65(3):764–778, 2017.