Estimation and prediction of Gaussian processes using generalized Cauchy covariance model under fixed domain asymptotics

We study estimation and prediction of Gaussian processes with covariance model belonging to the generalized Cauchy (GC) family, under fixed domain asymptotics. Gaussian processes with this kind of covariance function provide separate characterization of fractal dimension and long range dependence, an appealing feature in many physical, biological or geological systems. The results of the paper are classified into three parts. In the first part, we characterize the equivalence of two Gaussian measures with GC covariance functions. Then we provide sufficient conditions for the equivalence of two Gaussian measures with Matérn (MT) and GC covariance functions and of two Gaussian measures with Generalized Wendland (GW) and GC covariance functions. In the second part, we establish strong consistency and the asymptotic distribution of the maximum likelihood estimator of the microergodic parameter associated with the GC covariance model, under fixed domain asymptotics. The third part studies optimal prediction with the GC model; specifically, we give conditions for asymptotically efficient prediction and asymptotically correct estimation of the mean square error using a misspecified GC, MT or GW model, under fixed domain asymptotics. Our findings are illustrated through two simulation studies. The first compares the finite-sample behavior of the maximum likelihood estimator of the microergodic parameter of the GC model with the given asymptotic distribution. The second compares the finite-sample behavior of the prediction and its associated mean square error when the true model is GC and the prediction is performed using the true model and a misspecified GW model.


1 Introduction

Two fundamental steps in geostatistical analysis are estimating the parameters of a Gaussian stochastic process and predicting the process at new locations. In both steps, the covariance function plays a central role. For instance, mean square error optimal prediction at an unobserved site depends on knowledge of the covariance function. Since a covariance function must be positive definite, practical estimation generally requires the selection of a parametric family of covariances and the estimation of the corresponding parameters.

The maximum likelihood (ML) estimation method is generally considered the best method for estimating the parameters of covariance models. Nevertheless, the study of the asymptotic properties of ML estimation is complicated by the fact that more than one asymptotic framework can be considered when observing a single realization (Zhang and Zimmerman, 2005). The increasing domain asymptotic framework corresponds to the case where the sampling domain increases with the number of observed data and where the distance between any two sampling locations is bounded away from 0. The fixed domain asymptotic framework, sometimes called infill asymptotics (Cressie, 1993), corresponds to the case where more and more data are observed in some fixed bounded sampling domain.

General results for the asymptotic properties of the ML estimator, under the increasing domain asymptotic framework and some mild regularity conditions, are given in Mardia and Marshall (1984) and Bachoc (2014). Specifically, they show that ML estimates are consistent and asymptotically Gaussian, with asymptotic covariance matrix equal to the inverse of the Fisher information matrix.

Under fixed domain asymptotics, no general results are available for the asymptotic properties of ML estimation. Yet, some results have been obtained when assuming that the covariance belongs to the MT (Matérn, 1960) or GW (Gneiting, 2002) models. Both families allow for a continuous parameterization of the smoothness of the underlying Gaussian process, the GW family being additionally compactly supported (Bevilacqua et al., 2017). Specifically, when the smoothness parameter is known and fixed, not all parameters can be estimated consistently when $d=1,2,3$, with $d$ the dimension of the Euclidean space. Instead, the ratio of variance and scale (to the power of a function of the smoothing parameter), sometimes called the microergodic parameter, is consistently estimable. This follows from results given in Zhang (2004) for the MT model and Bevilacqua et al. (2017) for the GW model.

Asymptotic results for ML estimation of the microergodic parameter of the MT model can be found in Zhang (2004), Du et al. (2009) and Wang and Loh (2011) when the scale parameter is assumed known and fixed. Kaufman and Shaby (2013) give strong consistency and the asymptotic distribution of the ML estimator of the microergodic parameter when estimating jointly the scale and the variance parameters and, by means of a simulation study, they show that the asymptotic approximation is considerably improved in this case, even for large sample sizes. Similar results for the microergodic parameter of the GW model can be found in Bevilacqua et al. (2017).

In terms of prediction, under fixed domain asymptotics, Stein (1988, 1990) provides conditions under which optimal predictions under a misspecified covariance function are asymptotically efficient, and mean square errors converge almost surely to their targets. Stein's conditions translate into the fact that the true and the misspecified covariances must be compatible, that is, the induced Gaussian measures are equivalent (Skorokhod and Yadrenko, 1973; Ibragimov and Rozanov, 1978). A weaker condition, based on the ratio of spectral densities, is given in Stein (1993).

In this paper we study ML estimation and prediction of Gaussian processes, under fixed domain asymptotics, using the GC covariance model. The GC family of covariance models was proposed in Gneiting and Schlather (2004) and studied in depth in Lim and Teo (2009). It is particularly attractive because Gaussian processes with such a covariance function allow for any combination of fractal dimension and Hurst coefficient, an appealing feature in many physical, biological or geological systems (see Gneiting et al. (2012), Gneiting and Schlather (2004) and the references therein).

In particular, we offer the following results. First, we characterize the equivalence of two Gaussian measures with covariance functions belonging to the GC family and sharing the same smoothness parameter. A consequence of this result is that, as in the MT and GW covariance models, when the smoothness parameter is known and fixed, not all parameters can be estimated consistently under fixed domain asymptotics. Then we give sufficient conditions for the equivalence of two Gaussian measures where the state of truth is represented by a member of the MT or GW family and the other Gaussian measure has a GC covariance model.

We then assess the asymptotic properties of the ML estimator of the microergodic parameter associated to the GC family. Specifically, for a fixed smoothness parameter, we establish strong consistency and asymptotic distribution of the microergodic parameter assuming the scale parameter fixed and known. Then, we generalize these results when jointly estimating with ML the variance and the scale parameter.

Finally, using results in Stein (1988) and Stein (1993), we study the implications of our results on prediction, under fixed domain asymptotics. One remarkable implication is that when the true covariance belongs to the GC family, asymptotically efficient prediction and asymptotically correct estimation of the mean square error can be achieved using a compatible compactly supported GW covariance model.

The remainder of the paper is organized as follows. In Section 2 we review some results about the MT, GW and GC covariance models. In Section 3 we first characterize the equivalence of Gaussian measures under the GC covariance model. Then we give sufficient conditions for the equivalence of two Gaussian measures with MT and GC covariance models and of two Gaussian measures with GW and GC covariance models. In Section 4 we establish strong consistency and the asymptotic distribution of the ML estimator of the microergodic parameter of the GC model, under fixed domain asymptotics. Section 5 discusses the consequences of our results in terms of prediction, under fixed domain asymptotics. Section 6 provides two simulation studies: the first shows how well the given asymptotic distribution of the microergodic parameter applies to finite sample cases, when estimating a GC covariance model with ML under fixed domain asymptotics. The second compares the finite-sample behavior of the prediction when using two compatible GC and GW models, when the true model is GC. The final Section provides a discussion on the consequences of our results and open problems for future research.

2 Matérn, Generalized Wendland and Generalized Cauchy covariance models

This section describes the main features of the three covariance models involved in the paper. We denote by $\{Z(s), s\in D\}$ a zero mean Gaussian stochastic process on a bounded set $D$ of $\mathbb{R}^d$, with stationary covariance function $C:\mathbb{R}^d\to\mathbb{R}$. We consider the family $\Phi_d$ of continuous mappings $\phi:[0,\infty)\to\mathbb{R}$ with $\phi(0)>0$, such that

$$\mathrm{cov}(Z(s),Z(s'))=C(s'-s)=\phi(\|s'-s\|),$$

with $s,s'\in D$, and $\|\cdot\|$ denoting the Euclidean norm. Gaussian processes with such covariance functions are called weakly stationary and isotropic.

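Schoenberg (1938) characterized the family $\Phi_d$ as being scale mixtures of the characteristic functions of random vectors uniformly distributed on the spherical shell of $\mathbb{R}^d$, with any probability measure $F$:

$$\phi(r)=\int_0^\infty \Omega_d(r\xi)\,F(d\xi),\qquad r\ge 0,$$

with $\Omega_d(r)=\Gamma(d/2)\,(2/r)^{(d-2)/2}J_{(d-2)/2}(r)$, where $J_\nu$ is the Bessel function of the first kind of order $\nu$. The family $\Phi_d$ is nested, with the inclusion relation $\Phi_1\supset\Phi_2\supset\cdots\supset\Phi_\infty$ being strict, and where $\Phi_\infty:=\bigcap_{d\ge 1}\Phi_d$ is the family of mappings $\phi$ whose radial version is positive definite on any $d$-dimensional Euclidean space.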

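The MT function, defined as:

$$\mathcal{M}_{\nu,\alpha,\sigma^2}(r)=\sigma^2\,\frac{2^{1-\nu}}{\Gamma(\nu)}\left(\frac{r}{\alpha}\right)^{\nu}\mathcal{K}_{\nu}\!\left(\frac{r}{\alpha}\right),\qquad r\ge 0,$$

is a member of the family $\Phi_\infty$ for any positive values of $\alpha$ and $\nu$. Here, $\mathcal{K}_\nu$ is a modified Bessel function of the second kind of order $\nu$, $\sigma^2$ is the variance and $\alpha$ a positive scaling parameter.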

We also define $\Phi_d^{b}$ as the family that consists of members of $\Phi_d$ being additionally compactly supported on a given interval $[0,b]$, $b>0$. Clearly, their radial versions are compactly supported over balls of $\mathbb{R}^d$ with radius $b$. The GW function is defined as (Bevilacqua et al., 2017; Gneiting, 2002):

$$\varphi_{\mu,\kappa,\beta,\sigma^2}(r)=\begin{cases}\dfrac{\sigma^2}{B(2\kappa,\mu+1)}\displaystyle\int_{r/\beta}^{1}u\,\bigl(u^2-(r/\beta)^2\bigr)^{\kappa-1}(1-u)^{\mu}\,du, & 0\le r/\beta<1,\\[2ex] 0, & r/\beta\ge 1,\end{cases}\tag{2.1}$$

where $B$ denotes the beta function, $\sigma^2$ is the variance and $\beta>0$ is the compact support. Equivalent representations of (2.1) in terms of the Gauss hypergeometric function or Legendre polynomials are given in Hubbert (2012). Closed form solutions of the integral (2.1) can be obtained when $\kappa=k$ with $k\in\mathbb{N}$, the so-called original Wendland functions (Wendland, 1995), and, using some results in Schaback (2011), when $\kappa=k+1/2$, the so-called missing Wendland functions.
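As a numerical illustration of (2.1), the following minimal sketch evaluates the GW function by composite Simpson quadrature and compares it with the closed form obtained by integrating (2.1) directly for $\kappa=1$, namely $\sigma^2(1-r/\beta)^{\mu+1}\{1+(\mu+1)r/\beta\}$ on $[0,\beta)$. The function names `gw_cov` and `wendland_kappa1` and the parameter values are hypothetical, and the quadrature is only meant for $\kappa\ge 1$, where the integrand has no endpoint singularity.

```python
import math

def beta_fn(a, b):
    """Beta function B(a, b), computed through log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def gw_cov(r, mu, kappa, beta, sigma2, n_panels=2000):
    """GW function (2.1) by composite Simpson quadrature.

    Assumption of this sketch: kappa >= 1, so the integrand is bounded.
    n_panels must be even.
    """
    x = r / beta
    if x >= 1.0:
        return 0.0  # outside the compact support

    def integrand(u):
        return u * (u * u - x * x) ** (kappa - 1.0) * (1.0 - u) ** mu

    h = (1.0 - x) / n_panels
    total = integrand(x) + integrand(1.0)
    for i in range(1, n_panels):
        total += (4.0 if i % 2 else 2.0) * integrand(x + i * h)
    return sigma2 * (total * h / 3.0) / beta_fn(2.0 * kappa, mu + 1.0)

def wendland_kappa1(r, mu, beta, sigma2):
    """Closed form of (2.1) for kappa = 1 (direct integration of the formula)."""
    x = r / beta
    if x >= 1.0:
        return 0.0
    return sigma2 * (1.0 - x) ** (mu + 1.0) * (1.0 + (mu + 1.0) * x)

# Hypothetical parameter values, chosen only for illustration.
mu, beta, sigma2 = 3.5, 2.0, 1.5
for r in (0.0, 0.7, 1.9, 2.5):
    print(r, gw_cov(r, mu, 1.0, beta, sigma2), wendland_kappa1(r, mu, beta, sigma2))
```

For $0<\kappa<1$ the factor $(u^2-(r/\beta)^2)^{\kappa-1}$ is singular at the lower limit, so a different quadrature rule or one of the hypergeometric representations mentioned above would be needed.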

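Arguments in Gneiting (2002) and Zastavnyi (2006) show that, for a given $\kappa>0$, $\varphi_{\mu,\kappa,\beta,\sigma^2}\in\Phi_d^{\beta}$ if and only if $\mu\ge (d+1)/2+\kappa$. Note that $\varphi_{\mu,0,\beta,\sigma^2}$ is not defined because $\kappa$ must be strictly positive. In this special case we consider the Askey function (Askey, 1973)

$$\mathcal{A}_{\mu}(r)=(1-r)_{+}^{\mu}=\begin{cases}(1-r)^{\mu}, & 0\le r<1,\\ 0, & r\ge 1,\end{cases}$$

where $(\cdot)_{+}$ denotes the positive part. Arguments in Golubov (1981) show that $\mathcal{A}_{\mu}\in\Phi_d^{1}$ if and only if $\mu\ge (d+1)/2$ and, following Bevilacqua et al. (2017), we define $\varphi_{\mu,0,\beta,\sigma^2}(r):=\sigma^2\mathcal{A}_{\mu}(r/\beta)$.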

The parameters $\nu$ and $\kappa$ are crucial for the differentiability at the origin and, as a consequence, for the degree of differentiability of the associated sample paths in the MT and GW models. In particular, for a positive integer $k$, the sample paths of a Gaussian process are $k$ times differentiable if and only if $\nu>k$ in the MT case and if and only if $\kappa>k-1/2$ in the GW case.

The smoothness of a Gaussian process can also be described via the Hausdorff or fractal dimension of a sample path. The fractal dimension is a measure of roughness for non-differentiable Gaussian processes, with higher values indicating rougher surfaces. For a given covariance function $\phi\in\Phi_d$, if $\phi(0)-\phi(r)\sim r^{\chi}$ as $r\to 0$ for some $\chi\in(0,2]$, then the sample paths of the associated random process have fractal dimension $D=d+1-\chi/2$. Here $\chi$ is the so-called fractal index that governs the roughness of sample paths of a stochastic process.

In the case of the MT model $\chi=2\nu$, so $D=d+1-\nu$ if $0<\nu<1$ and $D=d$ otherwise (Adler, 1981; Gneiting et al., 2012). Thus the MT model permits the full range of allowable values for the fractal dimension. In the case of the GW family $\chi=2\kappa+1$, so that in this case $D=d+1/2-\kappa$ if $0\le\kappa<1/2$ and $D=d$ otherwise. Thus the GW model does not cover the full range of allowable values for the fractal dimension.

Long-memory dependence can be defined through the asymptotic behavior of the covariance function at infinity. Specifically, for a given covariance function $\phi$, if the power law $\phi(r)\sim r^{-\lambda}$ as $r\to\infty$ holds for some $\lambda\in(0,1)$, the stochastic process is said to have long memory with Hurst coefficient $H=1-\lambda/2$. The MT and GW covariance models do not possess this feature.

A celebrated family of members of $\Phi_\infty$ is the GC class (Gneiting and Schlather, 2004), defined as:

$$\mathcal{C}_{\delta,\lambda,\gamma,\sigma^2}(r)=\sigma^2\left(1+(r/\gamma)^{\delta}\right)^{-\lambda/\delta},\qquad r\ge 0,\tag{2.2}$$

where the conditions $\delta\in(0,2]$ and $\lambda>0$ are necessary and sufficient for $\mathcal{C}_{\delta,\lambda,\gamma,\sigma^2}\in\Phi_\infty$. The parameter $\delta$ is crucial for the differentiability at the origin and, as a consequence, for the degree of differentiability of the associated sample paths. Specifically, for $\delta=2$ they are infinitely differentiable, and they are not differentiable for $\delta\in(0,2)$.

The GC family represents a breaking point with respect to earlier literature based on the assumption of self similarity, since it decouples the fractal dimension and the Hurst effect. Specifically, the sample paths of the associated stochastic process have fractal dimension $D=d+1-\delta/2$ for $\delta\in(0,2)$, and if $\lambda\in(0,1)$ the process has long memory with Hurst coefficient $H=1-\lambda/2$. Thus, $D$ and $H$ may vary independently of each other (Gneiting and Schlather, 2004; Lim and Teo, 2009).
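The decoupling can be made concrete with a minimal sketch that evaluates the GC covariance (2.2) together with the fractal dimension $d+1-\delta/2$ and the Hurst coefficient $1-\lambda/2$ of Gneiting and Schlather (2004). The function names and the parameter values are hypothetical and serve only as an illustration.

```python
import math

def gc_cov(r, delta, lam, gamma, sigma2):
    """Generalized Cauchy covariance (2.2)."""
    if not (0.0 < delta <= 2.0 and lam > 0.0 and gamma > 0.0):
        raise ValueError("need 0 < delta <= 2, lam > 0 and gamma > 0")
    return sigma2 * (1.0 + (r / gamma) ** delta) ** (-lam / delta)

def fractal_dimension(delta, d):
    """Fractal dimension d + 1 - delta/2 of the sample paths, for delta in (0, 2)."""
    return d + 1.0 - delta / 2.0

def hurst(lam):
    """Hurst coefficient 1 - lam/2 when lam is in (0, 1); None if there is no long memory."""
    return 1.0 - lam / 2.0 if 0.0 < lam < 1.0 else None

# Two hypothetical models with the same roughness but different memory.
d = 2
for delta, lam in ((1.0, 0.5), (1.0, 3.0)):
    print(delta, lam, fractal_dimension(delta, d), hurst(lam))

# Power-law tail: C(r) * r**lam approaches sigma2 * gamma**lam for large r.
print(gc_cov(1e6, 1.0, 0.5, 2.0, 1.0) * (1e6) ** 0.5, 2.0 ** 0.5)
```

Changing `delta` moves only the fractal dimension, while changing `lam` moves only the tail decay, which is the separation described above.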

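Fourier transforms of radial versions of members of $\Phi_d$, for a given $d$, have a simple expression, as reported in Stein (1999) and Yaglom (1987). For a member $\phi$ of the family $\Phi_d$, we define its isotropic spectral density as

$$\hat{\phi}(z)=\frac{z^{1-d/2}}{(2\pi)^{d}}\int_0^\infty u^{d/2}J_{d/2-1}(uz)\,\phi(u)\,du,\qquad z\ge 0,\tag{2.3}$$

and throughout the paper we use the notation $\hat{\mathcal{M}}_{\nu,\alpha,\sigma^2}$, $\hat{\varphi}_{\mu,\kappa,\beta,\sigma^2}$ and $\hat{\mathcal{C}}_{\delta,\lambda,\gamma,\sigma^2}$ for the spectral densities associated with $\mathcal{M}_{\nu,\alpha,\sigma^2}$, $\varphi_{\mu,\kappa,\beta,\sigma^2}$ and $\mathcal{C}_{\delta,\lambda,\gamma,\sigma^2}$. A well-known result about the spectral density of the Matérn model is the following:

$$\hat{\mathcal{M}}_{\nu,\alpha,\sigma^2}(z)=\frac{\Gamma(\nu+d/2)}{\pi^{d/2}\Gamma(\nu)}\,\frac{\sigma^2\alpha^{d}}{(1+\alpha^2z^2)^{\nu+d/2}},\qquad z\ge 0.\tag{2.4}$$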

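Define the function ${}_1F_2$ as:

$${}_1F_2(a;b,c;z)=\sum_{k=0}^{\infty}\frac{(a)_k\,z^k}{(b)_k(c)_k\,k!},\qquad z\in\mathbb{R},$$

which is a special case of the generalized hypergeometric functions (Abramowitz and Stegun, 1970), with $(q)_k=\Gamma(q+k)/\Gamma(q)$ for $k\in\mathbb{N}\cup\{0\}$ being the Pochhammer symbol. The spectral density of $\varphi_{\mu,\kappa,\beta,\sigma^2}$ for $\kappa\ge 0$ is given in Bevilacqua et al. (2017). For instance, if $\kappa>0$, then

$$\hat{\varphi}_{\mu,\kappa,\beta,\sigma^2}(z)=\sigma^2L\beta^{d}\;{}_1F_2\!\left(\lambda;\lambda+\frac{\mu}{2},\lambda+\frac{\mu}{2}+\frac{1}{2};-\frac{(z\beta)^2}{4}\right),\qquad z\ge 0,$$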

where , and .
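The series defining ${}_1F_2$ can be summed term by term. The sketch below is a minimal, unoptimized implementation (the name `hyp1f2` and the stopping rule are assumptions of this illustration, not part of the paper). It uses the ratio of consecutive terms, $(a+k)z/\{(b+k)(c+k)(k+1)\}$, and is checked against the elementary identity ${}_1F_2(a;a,1/2;-x^2/4)=\cos x$.

```python
import math

def hyp1f2(a, b, c, z, tol=1e-15, max_terms=500):
    """Sum the 1F2 series term by term.

    Consecutive terms satisfy
        t_{k+1} = t_k * (a + k) * z / ((b + k) * (c + k) * (k + 1)),
    which follows from (q)_{k+1} = (q)_k * (q + k).
    """
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        term *= (a + k) * z / ((b + k) * (c + k) * (k + 1))
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

# Sanity check: with a = b the series reduces to 0F1(; 1/2; -x^2/4) = cos(x).
x = 2.0
print(hyp1f2(1.3, 1.3, 0.5, -x * x / 4.0), math.cos(x))
```

For large negative arguments the alternating series suffers from cancellation, so this direct summation is only suitable for moderate values of $z\beta$.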

For two given functions $g_1(x)$ and $g_2(x)$, with $g_1(x)\asymp g_2(x)$ we mean that there exist two constants $c$ and $C$ such that $0<c<C<\infty$ and $c\,|g_2(x)|\le|g_1(x)|\le C\,|g_2(x)|$ for each $x$. The next result follows from Lim and Teo (2009) and describes the spectral density of the GC covariance function and its asymptotic behaviour.

Theorem 1.

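Let $\mathcal{C}_{\delta,\lambda,\gamma,\sigma^2}$ be the function defined at Equation (2.2). Then, for $\delta\in(0,2)$ and $\lambda>0$:

1. $\hat{\mathcal{C}}_{\delta,\lambda,\gamma,\sigma^2}(z)=-\dfrac{\sigma^2\gamma^{d/2+1}z^{-d}}{2^{d/2-1}\pi^{d/2+1}}\,\mathrm{Im}\displaystyle\int_0^\infty\frac{K_{(d-2)/2}(\gamma t)}{\bigl(1+\exp(i\pi\delta/2)\,(t/z)^{\delta}\bigr)^{\lambda/\delta}}\,t^{d/2}\,dt,\quad z\ge 0$.
2. $\hat{\mathcal{C}}_{\delta,\lambda,\gamma,\sigma^2}(z)=\varrho\,z^{-(d+\delta)}-O\bigl(z^{-(d+2\delta)}\bigr)$ for $z\to\infty$,
3. $\hat{\mathcal{C}}_{\delta,\lambda,\gamma,\sigma^2}(z)\asymp z^{-(d+\delta)}$ for $z\to\infty$,

where $\varrho=\dfrac{2^{\delta-1}\,\Gamma\!\left(\frac{\delta+d}{2}\right)\Gamma\!\left(\frac{\delta}{2}\right)\sin\!\left(\frac{\pi\delta}{2}\right)}{\pi^{d/2+1}}\,\dfrac{\sigma^2\lambda}{\gamma^{\delta}}$.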

3 Equivalence of Gaussian measures with Generalized Cauchy, Matérn and Generalized Wendland covariance models

Equivalence and orthogonality of probability measures are useful tools when assessing the asymptotic properties of both prediction and estimation for stochastic processes. Denote with $P_i$, $i=0,1$, two probability measures defined on the same measurable space $\{\Omega,\mathcal{F}\}$. $P_0$ and $P_1$ are called equivalent (denoted $P_0\equiv P_1$) if $P_1(A)=1$ for any $A\in\mathcal{F}$ implies $P_0(A)=1$ and vice versa. On the other hand, $P_0$ and $P_1$ are orthogonal (denoted $P_0\perp P_1$) if there exists an event $A$ such that $P_1(A)=1$ but $P_0(A)=0$. For a stochastic process $\{Z(s), s\in\mathbb{R}^d\}$, to define the previous concepts, we restrict the event $A$ to the $\sigma$-algebra generated by $\{Z(s), s\in D\}$ where $D\subset\mathbb{R}^d$. We emphasize this restriction by saying that the two measures are equivalent on the paths of $\{Z(s), s\in D\}$.

Gaussian measures are completely characterized by their mean and covariance function. We write $P(\rho)$ for a Gaussian measure with zero mean and covariance function $\rho$. It is well known that two Gaussian measures are either equivalent or orthogonal on the paths of $\{Z(s), s\in D\}$ (Ibragimov and Rozanov, 1978).

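Let $P(\rho_i)$, $i=0,1$, be two zero mean Gaussian measures with isotropic covariance function $\rho_i$ and associated spectral density $\hat{\rho}_i$, $i=0,1$, as defined through (2.3). Using results in Skorokhod and Yadrenko (1973) and Ibragimov and Rozanov (1978), Stein (2004) has shown that, if for some $a>0$, $\hat{\rho}_0(z)z^{a}$ is bounded away from 0 and $\infty$ as $z\to\infty$, and for some finite and positive $c$,

$$\int_c^\infty z^{d-1}\left\{\frac{\hat{\rho}_1(z)-\hat{\rho}_0(z)}{\hat{\rho}_0(z)}\right\}^2dz<\infty,\tag{3.1}$$

then for any bounded subset $D\subset\mathbb{R}^d$, $P(\rho_0)\equiv P(\rho_1)$ on the paths of $Z(s)$, $s\in D$. For the remainder of the paper, we denote with $P(\mathcal{M}_{\nu,\alpha,\sigma^2})$, $P(\varphi_{\mu,\kappa,\beta,\sigma^2})$ and $P(\mathcal{C}_{\delta,\lambda,\gamma,\sigma^2})$ a zero mean Gaussian measure induced by an MT, GW and GC covariance function respectively. The following Theorem is due to Zhang (2004). It characterizes the compatibility of two MT covariance models sharing a common smoothness parameter $\nu$.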

Theorem 2.

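For a given $\nu>0$, let $P(\mathcal{M}_{\nu,\alpha_i,\sigma_i^2})$, $i=0,1$, be two zero mean Gaussian measures. For any bounded infinite set $D\subset\mathbb{R}^d$, $d=1,2,3$, $P(\mathcal{M}_{\nu,\alpha_0,\sigma_0^2})\equiv P(\mathcal{M}_{\nu,\alpha_1,\sigma_1^2})$ on the paths of $Z(s)$, $s\in D$, if and only if

$$\frac{\sigma_0^2}{\alpha_0^{2\nu}}=\frac{\sigma_1^2}{\alpha_1^{2\nu}}.\tag{3.2}$$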
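Condition (3.2) can be read off the spectral density (2.4): two MT models with the same $\nu$ have the same high-frequency behaviour exactly when $\sigma^2/\alpha^{2\nu}$ agrees. The following minimal sketch (function names and parameter values are hypothetical) evaluates (2.4) and shows that the ratio of the two spectral densities approaches 1 at high frequency when (3.2) holds, and a different constant when it does not.

```python
import math

def matern_sd(z, nu, alpha, sigma2, d):
    """Matern spectral density (2.4)."""
    const = math.gamma(nu + d / 2.0) / (math.pi ** (d / 2.0) * math.gamma(nu))
    return const * sigma2 * alpha ** d / (1.0 + (alpha * z) ** 2) ** (nu + d / 2.0)

def matched_variance(sigma2_0, alpha_0, alpha_1, nu):
    """Variance sigma2_1 such that sigma2_0 / alpha_0**(2 nu) = sigma2_1 / alpha_1**(2 nu)."""
    return sigma2_0 * (alpha_1 / alpha_0) ** (2.0 * nu)

# Hypothetical parameter values.
nu, d = 0.5, 2
sigma2_0, alpha_0, alpha_1 = 1.0, 1.0, 2.0
sigma2_1 = matched_variance(sigma2_0, alpha_0, alpha_1, nu)

for z in (1e1, 1e2, 1e4):
    compatible = matern_sd(z, nu, alpha_1, sigma2_1, d) / matern_sd(z, nu, alpha_0, sigma2_0, d)
    mismatched = matern_sd(z, nu, alpha_1, sigma2_0, d) / matern_sd(z, nu, alpha_0, sigma2_0, d)
    print(z, compatible, mismatched)
```

In the mismatched case the ratio tends to $(\alpha_0/\alpha_1)^{2\nu}\neq 1$, so the integrand in (3.1) does not vanish at infinity and the integral diverges.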

The following Theorem is a generalization of Theorem 4 in Bevilacqua et al. (2017) and it characterizes the compatibility of two GW covariance models sharing a common smoothness parameter $\kappa$. We omit the proof since the result can be obtained using the same arguments.

Theorem 3.

For a given , let , , be two zero mean Gaussian measures and let . For any bounded infinite set , , on the paths of if and only if

$$\frac{\sigma_0^2}{\beta_0^{2\kappa+1}}\,\mu_0=\frac{\sigma_1^2}{\beta_1^{2\kappa+1}}\,\mu_1.\tag{3.3}$$

The first relevant result of this paper concerns the characterization of the compatibility of two GC functions sharing a common smoothness parameter.

Theorem 4.

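For a given $\delta\in(d/2,2)$, let $P(\mathcal{C}_{\delta,\lambda_i,\gamma_i,\sigma_i^2})$, $i=0,1$, be two zero mean Gaussian measures. For any bounded infinite set $D\subset\mathbb{R}^d$, $d=1,2,3$, $P(\mathcal{C}_{\delta,\lambda_0,\gamma_0,\sigma_0^2})\equiv P(\mathcal{C}_{\delta,\lambda_1,\gamma_1,\sigma_1^2})$ on the paths of $Z(s)$, $s\in D$, if and only if

$$\frac{\sigma_0^2}{\gamma_0^{\delta}}\,\lambda_0=\frac{\sigma_1^2}{\gamma_1^{\delta}}\,\lambda_1.\tag{3.4}$$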
Proof.

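Let us start with the sufficient part of the assertion. From Theorem 1, point 3, we know that $\hat{\mathcal{C}}_{\delta,\lambda_0,\gamma_0,\sigma_0^2}(z)\,z^{d+\delta}$ is bounded away from 0 and $\infty$ as $z\to\infty$. In order to prove the sufficient part, we need to find conditions such that for some positive and finite $c$,

$$\int_c^\infty z^{d-1}\left(\frac{\hat{\mathcal{C}}_{\delta,\lambda_1,\gamma_1,\sigma_1^2}(z)-\hat{\mathcal{C}}_{\delta,\lambda_0,\gamma_0,\sigma_0^2}(z)}{\hat{\mathcal{C}}_{\delta,\lambda_0,\gamma_0,\sigma_0^2}(z)}\right)^2dz<\infty.\tag{3.5}$$

We proceed by direct construction and, using Theorem 1, point 2, we find that as $z\to\infty$,

$$\begin{aligned}\left|\frac{\hat{\mathcal{C}}_{\delta,\lambda_1,\gamma_1,\sigma_1^2}(z)-\hat{\mathcal{C}}_{\delta,\lambda_0,\gamma_0,\sigma_0^2}(z)}{\hat{\mathcal{C}}_{\delta,\lambda_0,\gamma_0,\sigma_0^2}(z)}\right| &\le z^{d+\delta}\left|\varrho_1z^{-(d+\delta)}-O\bigl(z^{-(d+2\delta)}\bigr)-\varrho_0z^{-(d+\delta)}+O\bigl(z^{-(d+2\delta)}\bigr)\right|\\ &\le\left|\varrho_1-\varrho_0+O(z^{-\delta})\right|,\end{aligned}$$

where $\varrho_i$ denotes the constant $\varrho$ of Theorem 1 evaluated at $(\lambda_i,\gamma_i,\sigma_i^2)$, with $i=0,1$.

Then we obtain

$$\int_c^\infty z^{d-1}\left(\frac{\hat{\mathcal{C}}_{\delta,\lambda_1,\gamma_1,\sigma_1^2}(z)-\hat{\mathcal{C}}_{\delta,\lambda_0,\gamma_0,\sigma_0^2}(z)}{\hat{\mathcal{C}}_{\delta,\lambda_0,\gamma_0,\sigma_0^2}(z)}\right)^2dz\le\int_c^\infty z^{d-1}\left(\varrho_1-\varrho_0+O(z^{-\delta})\right)^2dz.$$

We conclude that (3.5) is true if $\varrho_1=\varrho_0$ and $\delta>d/2$: in that case the integrand on the right is of order $z^{d-1-2\delta}$, which is integrable at infinity exactly when $\delta>d/2$. The condition $\varrho_1=\varrho_0$ is equivalent to (3.4), because $\varrho_i$ depends on $(\lambda_i,\gamma_i,\sigma_i^2)$ only through $\sigma_i^2\lambda_i/\gamma_i^{\delta}$. Moreover, since $\delta\in(0,2)$, the condition $\delta>d/2$ can be satisfied only for $d=1,2,3$. The sufficient part of our claim is thus proved. The necessary part follows the arguments in the proof of Zhang (2004).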

An immediate consequence of Theorem 4 is that, for a fixed $\delta$, the parameters $\sigma^2$, $\gamma$ and $\lambda$ cannot be estimated consistently. Nevertheless, the microergodic parameter $\sigma^2\lambda/\gamma^{\delta}$ is consistently estimable. In Section 4, we establish the asymptotic properties of the ML estimator of the microergodic parameter of the GC model.
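To illustrate what condition (3.4) means in practice, the following minimal sketch (hypothetical function names and parameter values) builds two GC models with different $\lambda$ and $\gamma$ but the same value of $\sigma^2\lambda/\gamma^{\delta}$, and checks that $\sigma^2-\mathcal{C}(r)$ has the same leading behaviour $\sigma^2(\lambda/\delta)(r/\gamma)^{\delta}$ near the origin in both models.

```python
import math

def gc_microergodic(sigma2, lam, gamma, delta):
    """Quantity sigma2 * lam / gamma**delta appearing in (3.4)."""
    return sigma2 * lam / gamma ** delta

def matched_variance(sigma2_0, lam_0, gamma_0, lam_1, gamma_1, delta):
    """Variance sigma2_1 that makes (lam_1, gamma_1) satisfy (3.4) with model 0."""
    return sigma2_0 * (lam_0 / lam_1) * (gamma_1 / gamma_0) ** delta

def gc_variogram(r, delta, lam, gamma, sigma2):
    """sigma2 - C(r) for the GC model (2.2), written to stay accurate for small r."""
    return -sigma2 * math.expm1(-(lam / delta) * math.log1p((r / gamma) ** delta))

# Hypothetical models sharing delta = 1.5.
delta = 1.5
sigma2_0, lam_0, gamma_0 = 1.0, 0.8, 0.5
lam_1, gamma_1 = 0.4, 2.0
sigma2_1 = matched_variance(sigma2_0, lam_0, gamma_0, lam_1, gamma_1, delta)

print(gc_microergodic(sigma2_0, lam_0, gamma_0, delta),
      gc_microergodic(sigma2_1, lam_1, gamma_1, delta))
for r in (1e-1, 1e-2, 1e-4):
    print(r, gc_variogram(r, delta, lam_0, gamma_0, sigma2_0)
             / gc_variogram(r, delta, lam_1, gamma_1, sigma2_1))
```

The two models differ markedly at large distances (different variances and tail exponents), yet observations packed into a bounded domain mostly inform the behaviour near the origin, which is why only the combination $\sigma^2\lambda/\gamma^{\delta}$ is identifiable in the limit.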

The second relevant result of this paper gives sufficient conditions for the compatibility of a GC and an MT covariance model.

Theorem 5.

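For given $\delta\in(d/2,2)$, let $P(\mathcal{M}_{\nu,\alpha,\sigma_0^2})$ and $P(\mathcal{C}_{\delta,\lambda_1,\gamma_1,\sigma_1^2})$ be two zero mean Gaussian measures. If $\nu=\delta/2$ and

$$\frac{\sigma_0^2}{\alpha^{2\nu}}=\left(\frac{\Gamma^2(\delta/2)\,\sin(\pi\delta/2)}{2^{1-\delta}\pi}\right)\frac{\sigma_1^2}{\gamma_1^{\delta}}\,\lambda_1,\tag{3.6}$$

then for any bounded infinite set $D\subset\mathbb{R}^d$, $d=1,2,3$, $P(\mathcal{M}_{\nu,\alpha,\sigma_0^2})\equiv P(\mathcal{C}_{\delta,\lambda_1,\gamma_1,\sigma_1^2})$ on the paths of $Z(s)$, $s\in D$.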

Proof.

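The spectral density of the MT model is given by:

$$\hat{\mathcal{M}}_{\nu,\alpha,\sigma_0^2}(z)=\frac{\Gamma(\nu+d/2)}{\pi^{d/2}\Gamma(\nu)}\,\frac{\sigma_0^2\alpha^{d}}{(1+\alpha^2z^2)^{\nu+d/2}},\qquad z\ge 0.\tag{3.7}$$

It is known that $\hat{\mathcal{M}}_{\nu,\alpha,\sigma_0^2}(z)\,z^{2\nu+d}$ is bounded away from 0 and $\infty$ as $z\to\infty$ (Zhang, 2004). In order to prove the sufficient part we need to find conditions such that for some positive and finite $c$,

$$\int_c^\infty z^{d-1}\left(\frac{\hat{\mathcal{C}}_{\delta,\lambda_1,\gamma_1,\sigma_1^2}(z)-\hat{\mathcal{M}}_{\nu,\alpha,\sigma_0^2}(z)}{\hat{\mathcal{M}}_{\nu,\alpha,\sigma_0^2}(z)}\right)^2dz<\infty.\tag{3.8}$$

Let $\varrho_2=\dfrac{\Gamma(\nu+d/2)}{\pi^{d/2}\Gamma(\nu)}\,\sigma_0^2\alpha^{-2\nu}$, so that $\hat{\mathcal{M}}_{\nu,\alpha,\sigma_0^2}(z)=\varrho_2\,(\alpha^{-2}+z^2)^{-(\nu+d/2)}$. Using the asymptotic expansion of (3.7) and Theorem 1, point 2, we have that as $z\to\infty$,

$$\begin{aligned}\left|\frac{\hat{\mathcal{C}}_{\delta,\lambda_1,\gamma_1,\sigma_1^2}(z)-\hat{\mathcal{M}}_{\nu,\alpha,\sigma_0^2}(z)}{\hat{\mathcal{M}}_{\nu,\alpha,\sigma_0^2}(z)}\right| &=\left|\varrho_2^{-1}\bigl[\varrho_1z^{-(d+\delta)}-O\bigl(z^{-(d+2\delta)}\bigr)\bigr](\alpha^{-2}+z^2)^{\nu+d/2}-1\right|\\ &=\left|\varrho_2^{-1}\bigl[\varrho_1z^{-(d+\delta)}-O\bigl(z^{-(d+2\delta)}\bigr)\bigr]z^{2\nu+d}\bigl((\alpha z)^{-2}+1\bigr)^{\nu+d/2}-1\right|\\ &=\left|\varrho_2^{-1}\bigl[\varrho_1z^{-(d+\delta)}-O\bigl(z^{-(d+2\delta)}\bigr)\bigr]z^{2\nu+d}\bigl[1+(\nu+d/2)(\alpha z)^{-2}+O(z^{-2})\bigr]-1\right|\\ &=\left|\varrho_2^{-1}\varrho_1z^{2\nu-\delta}-1+\varrho_2^{-1}\varrho_1(\nu+d/2)\alpha^{-2}z^{2\nu-\delta-2}+O(z^{2\nu-\delta-2})-O(z^{2\nu-2\delta})-O(z^{2\nu-2\delta-2})\right|\\ &\le\left|\varrho_2^{-1}\varrho_1z^{2\nu-\delta}-1\right|+\varrho_2^{-1}\varrho_1(\nu+d/2)\alpha^{-2}z^{2\nu-\delta-2}+O(z^{2\nu-2\delta})+O(z^{2\nu-2\delta-2})+O(z^{2\nu-\delta-2}).\end{aligned}$$

Then, if $\varrho_1=\varrho_2$ and $\nu=\delta/2$, we obtain

$$\int_c^\infty z^{d-1}\left|\frac{\hat{\mathcal{C}}_{\delta,\lambda_1,\gamma_1,\sigma_1^2}(z)-\hat{\mathcal{M}}_{\nu,\alpha,\sigma_0^2}(z)}{\hat{\mathcal{M}}_{\nu,\alpha,\sigma_0^2}(z)}\right|^2dz\le\int_c^\infty z^{d-1}\left((\nu+d/2)\alpha^{-2}z^{-2}+O(z^{-\delta})\right)^2dz,$$

and the right-hand side of the inequality is finite for $\delta>d/2$. Moreover, since $\delta\in(0,2)$, this condition can be satisfied only for $d=1,2,3$. Then, for a given $\delta\in(d/2,2)$, $d=1,2,3$, inequality (3.8) is true if $\varrho_1=\varrho_2$ and $\nu=\delta/2$. These last two conditions imply (3.6).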

Remark I: As expected, compatibility between GC and MT covariance models is achieved only for a subset of the parameter space that leads to non-differentiable sample paths, in particular for $\nu=\delta/2\in(d/4,1)$, $d=1,2,3$.
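As an illustration of how (3.6) might be used, the sketch below takes a GC model and a user-chosen MT scale $\alpha$ and returns the MT smoothness and variance that satisfy $\nu=\delta/2$ and (3.6). The function name and the numerical values are hypothetical. For $\delta=1$ the constant in (3.6) equals $\Gamma^2(1/2)/\pi=1$, so the condition reduces to $\sigma_0^2/\alpha=\sigma_1^2\lambda_1/\gamma_1$.

```python
import math

def gc_to_matern(sigma2_gc, lam, gamma, delta, alpha):
    """Return (nu, sigma2_mt) satisfying nu = delta/2 and condition (3.6).

    alpha is the MT scale parameter, chosen freely by the user.
    """
    nu = delta / 2.0
    const = (math.gamma(delta / 2.0) ** 2 * math.sin(math.pi * delta / 2.0)
             / (2.0 ** (1.0 - delta) * math.pi))
    sigma2_mt = const * sigma2_gc * lam / gamma ** delta * alpha ** (2.0 * nu)
    return nu, sigma2_mt

# Hypothetical GC model with delta = 1 (admissible for d = 1, since delta > d/2).
nu, sigma2_mt = gc_to_matern(sigma2_gc=2.0, lam=0.5, gamma=4.0, delta=1.0, alpha=3.0)
print(nu, sigma2_mt)

# Another hypothetical model with delta = 1.5 (admissible for d = 1, 2).
print(gc_to_matern(sigma2_gc=1.0, lam=0.8, gamma=0.5, delta=1.5, alpha=1.0))
```

Because $\alpha$ is free, there is a whole curve of MT models compatible with a given GC model, in the same way that (3.2) defines a curve within the MT family.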

The following are sufficient conditions, given in Bevilacqua et al. (2017), for the compatibility of an MT and a GW covariance model.

Theorem 6.

For given and , let and be two zero mean Gaussian measures. If , , and

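$$\frac{\sigma_0^2}{\alpha^{2\nu}}=\left(\frac{\Gamma(2\kappa+\mu+1)}{\Gamma(\mu+1)}\right)\frac{\sigma_1^2}{\beta^{2\kappa+1}}\,\mu,\tag{3.9}$$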

then for any bounded infinite set , , on the paths of .

Putting together Theorem 5 and Theorem 6, we obtain the next new result, which establishes sufficient conditions for the compatibility of a GW and a GC covariance function:

Theorem 7.

For given let and be two zero mean Gaussian measures. If , and

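$$\left(\frac{\Gamma(2\kappa+\mu+1)}{\Gamma(\mu+1)}\right)\frac{\sigma_1^2}{\beta^{2\kappa+1}}\,\mu=\left(\frac{\Gamma^2(\delta/2)\,\sin(\pi\delta/2)}{2^{1-\delta}\pi}\right)\frac{\sigma_0^2}{\gamma^{\delta}}\,\lambda,\tag{3.10}$$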

then for any bounded infinite set , , on the paths of .

Remark II: As expected, compatibility between GC and GW covariance models is achieved only for a subset of the parametric space of that leads to non differentiable sample paths and in particular , and , .
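Equation (3.10) can be solved for the GW variance once the other parameters are fixed. The sketch below does this under one explicit assumption that is not taken from the statement above: the GW smoothness is set to $\kappa=(\delta-1)/2$, which is what one gets by equating the MT smoothness $\nu=\delta/2$ of Theorem 5 with the relation $\nu=\kappa+1/2$ used for MT and GW compatibility in Bevilacqua et al. (2017). Any further admissibility condition on $\mu$ is left to the user; the function name and the values are hypothetical.

```python
import math

def gc_to_gw_variance(sigma2_gc, lam, gamma, delta, mu, beta):
    """Return (kappa, sigma2_gw) solving (3.10) for the GW variance.

    Assumption of this sketch: kappa = (delta - 1) / 2, which needs delta >= 1.
    beta is the GW compact support and mu the GW shape parameter (user-chosen).
    """
    if delta < 1.0:
        raise ValueError("this sketch needs delta >= 1 so that kappa >= 0")
    kappa = (delta - 1.0) / 2.0
    gc_side = (math.gamma(delta / 2.0) ** 2 * math.sin(math.pi * delta / 2.0)
               / (2.0 ** (1.0 - delta) * math.pi)) * sigma2_gc * lam / gamma ** delta
    gw_factor = math.gamma(2.0 * kappa + mu + 1.0) / math.gamma(mu + 1.0)
    sigma2_gw = gc_side * beta ** (2.0 * kappa + 1.0) / (gw_factor * mu)
    return kappa, sigma2_gw

# Hypothetical GC model with delta = 1, matched by an Askey-type (kappa = 0) GW model.
print(gc_to_gw_variance(sigma2_gc=2.0, lam=0.5, gamma=4.0, delta=1.0, mu=2.5, beta=5.0))
```

With $\delta=1$ and $\kappa=0$ both constants in (3.10) equal 1, so the relation reduces to $\sigma_1^2\mu/\beta=\sigma_0^2\lambda/\gamma$, which matches the slopes at the origin of the Askey and GC functions.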

4 Asymptotic properties of the ML estimation for the Generalized Cauchy model

We now focus on the microergodic parameter $\sigma^2\lambda/\gamma^{\delta}$ associated with the GC family. The following results establish the asymptotic properties of its ML estimator. In particular, we shall show that the microergodic parameter can be estimated consistently, and then assess the asymptotic distribution of the ML estimator.

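Let $D$ be a bounded subset of $\mathbb{R}^d$ and let $S_n=\{s_1,\ldots,s_n\}\subset D$ denote any set of distinct locations. Let $Z_n=(Z(s_1),\ldots,Z(s_n))'$ be a finite realization of $Z(s)$, $s\in D$, a zero mean stationary Gaussian process with a given parametric covariance function $\sigma^2\phi(\cdot;\tau)$, with $\sigma^2>0$, $\tau$ a parameter vector and $\phi$ a member of the family $\Phi_d$, with $\phi(0;\tau)=1$.

We then write $R_n(\tau)=[\phi(\|s_i-s_j\|;\tau)]_{i,j=1}^{n}$ for the associated correlation matrix. The Gaussian log-likelihood function is defined as:

$$\mathcal{L}_n(\sigma^2,\tau)=-\frac{1}{2}\left(n\log(2\pi\sigma^2)+\log(|R_n(\tau)|)+\frac{1}{\sigma^2}Z_n'R_n(\tau)^{-1}Z_n\right).\tag{4.1}$$

Under the GC model, the Gaussian log-likelihood is obtained with $\phi(\cdot;\tau)\equiv\mathcal{C}_{\delta,\lambda,\gamma,1}$ and $\tau=(\delta,\lambda,\gamma)'$. Since in what follows $\delta$ and $\lambda$ are assumed known and fixed, for notational convenience we write $\tau=\gamma$. Let $\hat{\sigma}_n^2$ and $\hat{\gamma}_n$ be the maximum likelihood estimators obtained by maximizing $\mathcal{L}_n(\sigma^2,\gamma)$ for fixed $\delta$ and $\lambda$.

In order to prove consistency and asymptotic Gaussianity of the ML estimator of the microergodic parameter, we first consider an estimator that maximizes (4.1) with respect to $\sigma^2$ for a fixed arbitrary scale parameter $\gamma$, obtaining the following estimator

$$\hat{\sigma}_n^2(\gamma)=\arg\max_{\sigma^2}\mathcal{L}_n(\sigma^2,\gamma)=Z_n'R_n(\gamma)^{-1}Z_n/n.\tag{4.2}$$

Here $R_n(\gamma)$ is the correlation matrix coming from the GC family $\mathcal{C}_{\delta,\lambda,\gamma,1}$. The following result offers some asymptotic properties of the ML estimator of the microergodic parameter, both in terms of consistency and asymptotic distribution. The proof is omitted since it follows the same steps as in Bevilacqua et al. (2017) and Wang and Loh (2011).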
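The estimator (4.2) is straightforward to compute from a Cholesky factor of the correlation matrix. The following sketch is a minimal pure-Python illustration on a regular grid of $[0,1]$ (so $d=1$), with hypothetical parameter values and a simulated sample. It evaluates $\hat{\sigma}_n^2(\gamma)$ for a few fixed, possibly wrong, values of $\gamma$ and prints the corresponding plug-in value $\hat{\sigma}_n^2(\gamma)\lambda/\gamma^{\delta}$ next to the true $\sigma_0^2\lambda/\gamma_0^{\delta}$. All function names are assumptions of this sketch.

```python
import math
import random

def gc_corr(r, delta, lam, gamma):
    """GC correlation: (2.2) with unit variance."""
    return (1.0 + (r / gamma) ** delta) ** (-lam / delta)

def cholesky(a):
    """Lower-triangular factor low with a = low low' (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def forward_solve(low, b):
    """Solve low y = b for lower-triangular low."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return y

def sigma2_hat(z, sites, delta, lam, gamma):
    """Estimator (4.2): z' R_n(gamma)^{-1} z / n, for sites on the real line."""
    corr = [[gc_corr(abs(s - t), delta, lam, gamma) for t in sites] for s in sites]
    y = forward_solve(cholesky(corr), z)  # y = low^{-1} z, so z' R^{-1} z = y'y
    return sum(v * v for v in y) / len(z)

# Hypothetical setting: d = 1, delta = 1 (> d/2), true parameters (sigma2_0, gamma_0).
random.seed(0)
delta, lam = 1.0, 1.0
sigma2_0, gamma_0 = 1.0, 0.3
n = 60
sites = [i / (n - 1) for i in range(n)]

# Simulate z = sqrt(sigma2_0) * low0 w with w standard normal.
corr0 = [[gc_corr(abs(s - t), delta, lam, gamma_0) for t in sites] for s in sites]
low0 = cholesky(corr0)
w = [random.gauss(0.0, 1.0) for _ in range(n)]
z = [math.sqrt(sigma2_0) * sum(low0[i][k] * w[k] for k in range(i + 1)) for i in range(n)]

for gamma in (0.15, 0.3, 0.6):
    s2 = sigma2_hat(z, sites, delta, lam, gamma)
    print(gamma, s2, s2 * lam / gamma ** delta)
print("true microergodic parameter:", sigma2_0 * lam / gamma_0 ** delta)
```

A single simulated sample of this size only gives a rough impression. A proper check of the asymptotic statements would repeat the simulation many times and for increasing $n$.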

Theorem 8.

Let , , be a zero mean Gaussian process with covariance function belonging to the GC family, i.e. , with and , . Suppose . For a fixed , let as defined through Equation (4.2). Then, as ,

1. and

2. .

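The second type of estimation considers the joint maximization of (4.1) with respect to $(\sigma^2,\gamma)\in(0,\infty)\times I$, where $I=[\gamma_L,\gamma_U]$ and $0<\gamma_L<\gamma_U<\infty$. The solution of this optimization problem is given by $(\hat{\sigma}_n^2(\hat{\gamma}_n),\hat{\gamma}_n)$ where

$$\hat{\sigma}_n^2(\hat{\gamma}_n)=Z_n'R_n(\hat{\gamma}_n)^{-1}Z_n/n$$

and $\hat{\gamma}_n=\arg\max_{\gamma\in I}PL_n(\gamma)$. Here $PL_n(\gamma)$ is the profile log-likelihood, obtained by substituting (4.2) into (4.1):

$$PL_n(\gamma)=-\frac{1}{2}\left(n\log(2\pi)+n\log(\hat{\sigma}_n^2(\gamma))+\log|R_n(\gamma)|+n\right).\tag{4.3}$$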
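The profile log-likelihood can be maximized numerically over the scale parameter. The sketch below is a minimal pure-Python illustration with $d=1$, hypothetical parameter values, a simulated sample and a plain grid search standing in for a proper optimizer. The constant term is written as $n\log(2\pi)$, which is what substituting (4.2) into (4.1) gives; it does not affect the maximizer. All function names are assumptions of this sketch.

```python
import math
import random

def gc_corr(r, delta, lam, gamma):
    """GC correlation: (2.2) with unit variance."""
    return (1.0 + (r / gamma) ** delta) ** (-lam / delta)

def cholesky_gc(sites, delta, lam, gamma):
    """Lower Cholesky factor of R_n(gamma) for sites on the real line."""
    n = len(sites)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            rij = gc_corr(abs(sites[i] - sites[j]), delta, lam, gamma)
            s = rij - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def quad_and_logdet(low, z):
    """Return (z' R^{-1} z, log|R|) from the Cholesky factor low of R."""
    y = []
    for i in range(len(z)):
        y.append((z[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    logdet = 2.0 * sum(math.log(low[i][i]) for i in range(len(z)))
    return sum(v * v for v in y), logdet

def loglik(sigma2, gamma, sites, z, delta, lam):
    """Gaussian log-likelihood (4.1) under the GC model."""
    n = len(z)
    quad, logdet = quad_and_logdet(cholesky_gc(sites, delta, lam, gamma), z)
    return -0.5 * (n * math.log(2.0 * math.pi * sigma2) + logdet + quad / sigma2)

def profile_loglik(gamma, sites, z, delta, lam):
    """(4.1) evaluated at sigma2 = z' R_n(gamma)^{-1} z / n."""
    n = len(z)
    quad, logdet = quad_and_logdet(cholesky_gc(sites, delta, lam, gamma), z)
    return -0.5 * (n * math.log(2.0 * math.pi) + n * math.log(quad / n) + logdet + n)

# Hypothetical setting and simulated data.
random.seed(1)
delta, lam, gamma_0, sigma2_0 = 1.0, 1.0, 0.3, 1.0
n = 40
sites = [i / (n - 1) for i in range(n)]
low0 = cholesky_gc(sites, delta, lam, gamma_0)
w = [random.gauss(0.0, 1.0) for _ in range(n)]
z = [math.sqrt(sigma2_0) * sum(low0[i][k] * w[k] for k in range(i + 1)) for i in range(n)]

# Grid search over candidate scales in (0, 2].
grid = [0.05 * k for k in range(1, 41)]
gamma_hat = max(grid, key=lambda g: profile_loglik(g, sites, z, delta, lam))
quad_hat, _ = quad_and_logdet(cholesky_gc(sites, delta, lam, gamma_hat), z)
sigma2_hat = quad_hat / n
print(gamma_hat, sigma2_hat, sigma2_hat * lam / gamma_hat ** delta)
```

The profile likelihood is often quite flat in $\gamma$ for data on a bounded domain, which is consistent with the fact that $\gamma$ alone is not consistently estimable. The plug-in value of $\hat{\sigma}_n^2\lambda/\hat{\gamma}_n^{\delta}$ is the quantity of interest.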

We now establish the asymptotic properties of the sequence of random variables $\hat{\sigma}_n^2(\hat{\gamma}_n)\lambda/\hat{\gamma}_n^{\delta}$ in a special case. The following two Lemmas are needed in order to establish consistency and asymptotic distribution.

Lemma 1.

For any , if then is a non-decreasing function of .

Proof of Lemma 1.

Using Theorem 1 point 1:

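$$\gamma^{\delta}\,\hat{\mathcal{C}}_{\sigma^2,\lambda,\delta,\gamma}(z)=-\frac{\sigma^2\gamma^{\delta+\lambda}z^{-d/2+1}}{2^{d/2-1}\pi^{d/2+1}}\,\mathrm{Im}\int_0^\infty\frac{K_{(d-2)/2}(zt)}{\bigl(\gamma^{\delta}+\exp(i\pi\delta/2)\,t^{\delta}\bigr)^{\lambda/\delta}}\,t^{d/2}\,dt,$$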

and if , we obtain:

 γδˆCσ2,δ,δ,γ(z)=σ2z−d/2+12d/2−1πd/2+1∫∞0K(d−2)/2(zt)td/2+δγ2δsin(πδ/2)|γδ+exp(i