Recently, considerable attention has been paid to statistical inference for spatial regression models in geostatistical data analysis, in economic and scientific fields such as spatial econometrics, ecology, and seismology. In this paper, we consider the following spatial regression model.
where and is a scalar for each ,
are independent and identically distributed (i.i.d.) random variables, and are observations. The functions and are the mean and conditional variance functions, respectively. Lu and Tjøstheim (2014) proposed a scheme of domain-expanding and infill (DEI) asymptotics and studied nonparametric density estimation for spatial data under this sampling scheme. In this paper, we extend their results to spatial regression models in a relatively simple setting under the same sampling scheme. As mentioned in their paper, the DEI sampling scheme is natural in many applications because, owing to physical constraints, measurement stations usually cannot be placed on a regular grid in space. Precisely, let be the Euclidean norm on and define
Then we assume that as . Some papers use sampling schemes related to the DEI asymptotics. Hall and Patil (1994) investigated nonparametric estimation of the spatial covariance function based on observations generated by a probability distribution. Matsuda and Yajima (2009) work with a similar sampling scheme and focus on nonparametric and parametric estimation of the spectral density.
Recent contributions to the literature on statistical inference for spatial (or random field) models include Jenish and Prucha (2009, 2012), Machkouri (2011), and Machkouri et al. (2013), which investigate limit theorems for statistical inference on spatial processes observed on (irregularly spaced) lattices. In the literature on semi-parametric spatial (or spatio-temporal) regression models, we mention Gao, Lu and Tjøstheim (2006), Lu et al. (2009), Yan et al. (2014) and Al-Sulami et al. (2017) as recent key references. An overview of recent developments in semi-parametric spatial models can be found in Robinson (2008). Robinson (2011) and Jenish (2012) study nonparametric inference on spatial regression models under a dependence structure that differs from our mixing-type conditions. They derive central limit theorems for mean functions, and the former discusses an application of the results to spatial data observed on a lattice. We also refer to Hallin et al. (2004), Machkouri and Stoica (2008), Hallin et al. (2009), Li (2016) and Machkouri et al. (2017), which study nonparametric inference and estimation for the mean function of spatial regression models based on random fields on a lattice. However, those papers do not derive limit theorems for variance functions.
The goal of this paper is to derive multivariate central limit theorems for the mean and variance functions of the model (1.1). From a technical point of view, because we work with the DEI sampling scheme, we cannot use the small-block and large-block argument due to Bernstein (1926), which is a well-known tool for nonparametric inference with regularly observed time series data. Although the blocking argument applies to regularly spaced data, there is no practical guidance for constructing small and large blocks under an asymptotic framework in which the distance between observations goes to zero. Therefore, to avoid this problem, we use another approach due to Bolthausen (1982), which is based on the convergence of the characteristic functions of the estimators. As a result, this paper contributes to the literature on nonparametric inference for spatial regression models, and to the best of our knowledge, this is the first paper to derive limit theorems for the mean and variance functions of the model (1.1) under the DEI asymptotics.
The rest of the paper is organized as follows. In Section 2, we state the conditions under which the limit theorems in this paper are derived. In Section 3, we give multivariate central limit theorems for the mean and variance functions and propose a method to construct confidence bands for the estimators of those functions on finite intervals included in . In Section 4, we propose a data-driven method for bandwidth selection and report simulation results on the finite-sample performance of the central limit theorems and the proposed confidence bands. All proofs are collected in Appendix A.
For any non-empty set and any (complex-valued) function on , let , and for , let for .
For convenience, in this section we summarize the assumptions used in the proofs of the limit theorems given in Section 3.
(A1): Assumption on spatial process
is a strictly stationary spatial process satisfying the α-mixing property that there exist a function such that as , and a function symmetric and increasing in each of its two arguments, such that
where , is the Borel σ-field generated by , and for each and .
For some constant and some ,
as , where .
(A2): Assumption on bandwidths and sampling scheme
, , and , .
(A3): Assumption on kernel function
The kernel function is bounded, symmetric, and has bounded support. Let .
(A4): Assumption on regression models
Let , , , and be density functions of , , and with pairwise distinct , respectively. For some and , denotes an -enlargement of , that is, .
and for some compact set and some and .
, , and , , , and are bounded uniformly with respect to pairwise distinct where .
Here, for , is the set of functions having bounded derivatives on up to order , and is the set of continuous functions on .
, , and . Here, is the constant which appears in Assumption (A1) (ii).
and are independent.
The random field is called strongly mixing if the condition (2.1) holds with . The same or similar conditions are used in Hallin et al. (2004) and Lu and Tjøstheim (2014). The condition can be seen as an extension of strong mixing conditions for continuous-time stochastic processes and time series models. It is known that many stochastic processes and (nonlinear) time series models are strongly mixing. The conditions (2.1) and (iii) in Assumption (A1) are satisfied by many spatial processes. We refer to Rosenblatt (1985) and Guyon (1987) for a detailed discussion of strong mixing conditions for random fields.
3. Multivariate central limit theorems of the mean and variance functions
In this section we give central limit theorems for the marginal density, mean, and variance functions of the nonparametric spatial regression model (1.1).
3.1. Limit theorems for mean and variance functions
Let be a kernel function with , and are bandwidths with and as . We estimate and by
The estimator is a jackknife version of . Although, from a theoretical point of view, it is sufficient to assume in Assumption (A4) (ii) to derive limit theorems on , we use this estimator instead of in the definition of in order to remove the effect of its asymptotic bias and to improve its finite-sample performance. In fact, under Assumptions (A3) and (A4) (four-times continuous differentiability of on ), we can show that and for each .
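To make the roles of the kernel, the two bandwidths, and the bias-reduced mean estimator concrete, the following is a minimal sketch and not the paper's definition, since the defining displays are not reproduced in this text. It assumes a Nadaraya-Watson form for the mean estimator, the Epanechnikov kernel used later in the simulation study, and one common jackknife combination, twice the fit at bandwidth h minus the fit at bandwidth √2·h. All function names are hypothetical.

```python
import math
from typing import Sequence


def epanechnikov(u: float) -> float:
    """Epanechnikov kernel: bounded, symmetric, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def nw_mean(x0: float, xs: Sequence[float], ys: Sequence[float], h: float) -> float:
    """Nadaraya-Watson estimate of E[Y | X = x0] with bandwidth h (assumed form)."""
    weights = [epanechnikov((x - x0) / h) for x in xs]
    total = sum(weights)
    if total == 0.0:
        return float("nan")  # no observation falls inside the kernel window
    return sum(w * y for w, y in zip(weights, ys)) / total


def jackknife_mean(x0: float, xs: Sequence[float], ys: Sequence[float], h: float) -> float:
    """Bias-reduced fit 2*m_h - m_{sqrt(2)*h}; this combination is an assumption."""
    return 2.0 * nw_mean(x0, xs, ys, h) - nw_mean(x0, xs, ys, math.sqrt(2.0) * h)


def nw_variance(x0: float, xs: Sequence[float], ys: Sequence[float],
                h_mean: float, h_var: float) -> float:
    """Smooth the squared residuals from the bias-reduced mean fit around x0."""
    sq_resid = [(y - jackknife_mean(x, xs, ys, h_mean)) ** 2 for x, y in zip(xs, ys)]
    return nw_mean(x0, xs, sq_resid, h_var)
```

With a second-order kernel, this particular combination cancels the leading bias term of order h squared, which is the usual motivation for plugging a bias-reduced mean fit into the variance estimator.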
Now we give limit theorems for and . First we give a multivariate extension of Theorem 1 in Lu and Tjøstheim (2014).
Under Assumptions (A1), (A2), (A3) and (A4), for , we have that
where is the identity matrix.
Next we give a general limit theorem for nonparametric spatial regression models. The following result is used to prove the multivariate central limit theorems for and .
Assume that and are functions such that
for some .
with , and .
Then, under Assumptions (A1), (A2), (A3) and (A4), for we have that
as , where .
By using Proposition 3.2, we can finally derive the following two theorems.
Under Assumptions (A1), (A2), (A3) and (A4), for , we have that
Under Assumptions (A1), (A2), (A3) and (A4), and , for , we have that
The mixing condition could be relaxed to a more general near epoch dependence (at least for random fields observed at (irregularly spaced) lattice points). For example, we refer to Lu and Linton (2007) and Li, Lu and Linton (2012), which study nonparametric inference for and estimation of time series regression models, respectively. However, their analysis is based on regularly spaced time series, and this assumption is essential in those papers. Recently, new techniques have been developed for Gaussian approximation of time series, based on strong approximation (e.g. Liu and Li (2009)) or on combinations of Slepian's smart path interpolation (Röllin (2011)) to the solution of Stein's partial differential equation, Stein's leave-one(-block)-out method (Stein (1986)) and other analytical techniques (e.g. see Zhang and Wu (2017) and Zhang and Cheng (2017) under different physical dependence, and Chernozhukov, Chetverikov and Kato (2013) under -mixing sequences). If we work with domain-expanding but not infill asymptotics ( and ), we would be able to relax our mixing condition and also provide a method to construct (asymptotically) uniform confidence bands, as high-dimensional extensions of our theorems to the case where the number of design points in an interval increases as the sample size goes to infinity (i.e. , as ), by using the techniques in those papers (see Kurisu (2018) for the time series case). However, under DEI asymptotics, achieving such results would need careful treatment of the dependence among observations, and the author believes that it requires substantial work.
3.2. Confidence bands for mean and variance functions
Based on the multivariate central limit theorems in the previous section, we propose a method to construct confidence bands for mean and variance functions on a finite interval included in . We estimate by
instead of the naive estimator to improve finite sample performance. Let be i.i.d. standard normal random variables, and let satisfy for . Then, are joint asymptotic % confidence intervals of , , and when , and , respectively.
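The critical value in these joint intervals has a closed form under one reading of the construction. The sketch below assumes, because the defining display is not reproduced in this text, that the critical value q solves P(max over j of |ξ_j| > q) = τ for L i.i.d. standard normal variables ξ_1, ..., ξ_L, one per design point. Under that assumption, independence gives (2Φ(q) − 1)^L = 1 − τ, so q = Φ^{-1}((1 + (1 − τ)^{1/L}) / 2). The function and argument names are hypothetical.

```python
from statistics import NormalDist


def max_abs_normal_quantile(tau: float, num_points: int) -> float:
    """Return q with P(max_{j <= L} |xi_j| > q) = tau for i.i.d. N(0, 1) xi_j.

    Assumed definition of the critical value; L = num_points design points.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if num_points < 1:
        raise ValueError("num_points must be at least 1")
    # Independence: P(max |xi_j| <= q) = (2 * Phi(q) - 1) ** L = 1 - tau.
    p = (1.0 + (1.0 - tau) ** (1.0 / num_points)) / 2.0
    return NormalDist().inv_cdf(p)
```

For a single design point this reduces to the usual two-sided normal critical value, and the value grows with the number of design points, which is why joint intervals are wider than pointwise ones.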
In this section we present simulation results to assess the finite-sample performance of the central limit theorems and the confidence bands proposed in Section 3.
4.1. Simulation framework
To generate the locations irregularly positioned in , first we set a lattice with and for where and . Next we select locations randomly from the lattice as the irregular locations with , and set . As a data generating process, we consider the following spatial moving average process.
where are independent and identically distributed standard normal random variables, is the -component of the matrix
We also consider and as the mean and variance functions, respectively, and use i.i.d. standard Gaussian random variables as noise variables . In our simulation study, we use the Epanechnikov kernel and set the sample size to 750. Note that Assumption (A1) on the spatial process is satisfied by the definition of the spatial moving average process (4.1).
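The location-sampling step described at the start of this subsection can be sketched as follows. This is only an illustration: the side length and lattice spacing used here are hypothetical placeholders, since the values used in the study are not reproduced in this text, and the function name is ours. Only the sample size of 750 is taken from the text.

```python
import random
from typing import List, Tuple


def sample_irregular_locations(side: float, spacing: float, n: int,
                               seed: int = 0) -> List[Tuple[float, float]]:
    """Draw n distinct sites at random from a square lattice on [0, side]^2."""
    rng = random.Random(seed)
    steps = int(round(side / spacing))
    # Fine regular lattice from which the irregular locations are selected.
    lattice = [(i * spacing, j * spacing)
               for i in range(steps + 1) for j in range(steps + 1)]
    if n > len(lattice):
        raise ValueError("the lattice has fewer points than requested locations")
    # Sampling without replacement keeps the selected sites pairwise distinct.
    return rng.sample(lattice, n)


# Hypothetical domain size and spacing; n = 750 matches the sample size above.
locations = sample_irregular_locations(side=10.0, spacing=0.2, n=750, seed=1)
```

Selecting a random subset of a fine lattice yields irregularly spaced sites while keeping a known lower bound on the distance between any two of them.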
4.2. Bandwidth selection
Now we discuss bandwidth selection for the construction of confidence bands on a finite interval . Let be a finite interval, be design points with for , and let and be grids of bandwidths. We use a data-driven method similar to that proposed in Kurisu (2018). From a theoretical point of view, we have to choose bandwidths that are of smaller order than the optimal rate for estimation under the loss functions (or a “discretized version” of -distances) and for our confidence bands to work. At the same time, choosing too small a bandwidth results in too wide a confidence band. Therefore, we should choose a bandwidth “slightly” smaller than the optimal one that minimizes those loss functions. We employ the following rule for bandwidth selection of . We also choose a bandwidth of in a similar manner.
Set a pilot bandwidth and make a list of candidate bandwidths for .
Choose the smallest bandwidth such that the adjacent value is smaller than for some .
For the bandwidth selection of , we first choose a bandwidth () of by the proposed rule. We then plug into and finally choose a bandwidth by the proposed rule. In our simulation study, we choose , and . As long as the threshold value is reasonably chosen, this rule would choose a bandwidth slightly smaller than the one that is intuitively optimal for the estimation of and .
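The two-step rule above can be sketched in code under stated assumptions. Because the displays defining the loss and the threshold are not reproduced in this text, the sketch assumes that the "adjacent value" at a candidate bandwidth is the maximum absolute difference, over the design points, between the estimates at that bandwidth and at the next smaller one, and that the threshold is a constant kappa > 1 times the smallest adjacent value. The names `sup_distance`, `select_bandwidth`, and `kappa` are hypothetical.

```python
from typing import Sequence


def sup_distance(fit_a: Sequence[float], fit_b: Sequence[float]) -> float:
    """Discretized sup-distance between two fits evaluated at the same design points."""
    return max(abs(a - b) for a, b in zip(fit_a, fit_b))


def select_bandwidth(bandwidths: Sequence[float],
                     adjacent_losses: Sequence[float],
                     kappa: float) -> float:
    """Pick the smallest bandwidth whose adjacent value is below kappa * min.

    bandwidths are sorted in increasing order; adjacent_losses[k] compares the
    fits at bandwidths[k] and bandwidths[k + 1] (assumed definition).
    """
    if len(adjacent_losses) != len(bandwidths) - 1:
        raise ValueError("need exactly one adjacent loss per consecutive pair")
    if kappa <= 1.0:
        raise ValueError("kappa must exceed 1")
    smallest = min(adjacent_losses)
    threshold = kappa * smallest
    for k, loss in enumerate(adjacent_losses):
        if loss < threshold:
            return bandwidths[k + 1]
    # Only reached when the smallest adjacent value is exactly zero.
    return bandwidths[list(adjacent_losses).index(smallest) + 1]
```

Scanning from the smallest candidate upward and stopping at the first value below the threshold is what makes the chosen bandwidth tend to sit slightly below the one at which the fits stabilise.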
Figure 1 depicts five realizations of the loss functions (left) and (right) with different bandwidth values and , . Figure 2 depicts five realizations of the loss function (left) and (right). We observe that the shape of partly mimics that of . The same can be said of . By using the proposed rule and the visual information in Figure 1, we set to draw Figures 4 and 5.
In practice, it is also recommended to make use of visual information on how and behave as increases when determining the bandwidth.
(right). The red line is the density of a standard normal distribution. The number of Monte Carlo iterations is 250 for each case. From these figures we can see that the central limit theorems for each design point (Theorems 3.1 and 3.2) hold.
Figure 6 shows (dark gray), (gray), and (light gray) confidence bands of (left) and (right) on . We set , as design points.
We also investigated finite sample properties of and for different sample sizes. Figure 7 depicts five realizations of the loss functions (black line) and (red line) with different bandwidth values. We set (left), (center) and (right), and , . The red lines tend to lie below the black lines. This implies that tends to have smaller bias than as the sample size increases.
This work is partially supported by Grant-in-Aid for Research Activity Start-up (18H05679) from the JSPS.
Appendix A Proofs
Proof of Proposition 3.1.
We already have a pointwise central limit theorem for by Theorem 1 in Lu and Tjøstheim (2014). A multivariate central limit theorem then also holds by the Cramér-Wold device. Therefore, it suffices to show the asymptotic independence of estimators at different design points. For this, it is sufficient to show that covariances between estimators at different design points are asymptotically negligible. For , we have that
For , by the assumption on and a change of variables, we have that
as . We can also show that as by the same argument as in the proof of Theorem 1 in Lu and Tjøstheim (2014). This completes the proof. ∎
Proof of Proposition 3.2.
Let . For the proof of the central limit theorem, it is sufficient to show the following conditions by Lemma 2 in Bolthausen (1982).
and (b) for any .
The condition (a) immediately follows from the definition of , since by a change of variables and the dominated convergence theorem, we have that
To show the condition (b), we use a technique from Lu and Tjøstheim (2014).
where , , and . We first give a lemma used in the proof of the condition (b).
We have that
Proof of Lemma A.1.
We show the case when in the first result. The other case can be shown in the same way. If , as , we have that
Now we show the second result. If and , by Proposition 2.5 in Fan and Yao (2003), we have that
This completes the proof. ∎
We decompose as follows.