1 Introduction
Spatial point processes are models for sets of random locations of possibly interacting objects. Background on spatial point processes can be found in Møller and Waagepetersen (2004), Illian et al. (2008) or Baddeley et al. (2015) which gives both an accessible introduction as well as details on implementation in the R package spatstat
. Moments of counts of objects for spatial point processes are typically expressed in terms of so-called joint intensity functions or Papangelou conditional intensity functions which are defined via the Campbell or Georgii-Nguyen-Zessin equations (see the aforementioned references or the concise review of intensity functions and Campbell formulae in Section
2). In this paper we consider a third type of equation called variational equations.A key feature of variational equations compared to Campbell and Georgii-Nguyen-Zessin equations is that they are formulated in terms of the gradient of the log intensity or conditional intensity function rather than the (conditional) intensity itself. Variational equations were introduced for parameter estimation in Markov random fields by Almeida et al. (1993)
. The authors suggested the terminology ‘variational’ due to the analogy between the derivation of their estimating equation and the variational Euler-Lagrange equations in partial differential equations. The resulting equation consisted in an equilibrium equation involving the gradient of the log conditional probability of the Markov random field. Later,
Baddeley and Dereudre (2013) obtained variational equations for Gibbs point processes and exploited them to infer a log-linear parametric model of the conditional intensity function. Coeurjolly and Møller (2014) established a first-order variational equation for general spatial point processes and used it to estimate parameters in a log-linear parametric model for the intensity function.The first contribution of this paper is to establish second-order variational equations. The second-order properties of a spatial point process are characterized by the so-called pair correlation function which is a normalized version of the second-order joint intensity function. We assume that the pair correlation function is translation invariant and also consider the case when it is isotropic. Since the new variational equations are based on the gradient of the log pair correlation function, they take a particularly simple form for pair correlation functions of log-linear form.
Our second contribution is to propose a new non-parametric estimator of the pair correlation function. The classical approach is to use a kernel estimator, see e.g. Møller and Waagepetersen (2004). More recently, Jalilian et al. (2019) investigated the estimation of the pair correlation function using an orthogonal series expansion. In the setting of their simulation studies, the orthogonal series estimator was shown to be more efficient than the standard kernel estimator. One drawback, however, is that the orthogonal series estimator is not guaranteed to be non-negative. We therefore propose to use our second-order variational equation to estimate coefficients in an orthogonal series expansion of the log pair correlation function. This ensures that the resulting pair correlation function estimator is non-negative. We compare our new estimator with the previous ones in a simulation study and also illustrate its use on real datasets.
2 Background and main results
2.1 Spatial point processes
Throughout this paper we let be a spatial point process defined on . That is, is a random subset of with the property that the intersection of with any bounded subset of is of finite cardinality. The joint intensity functions , , are characterized (when they exist) by the Campbell formulae (equations) (see e.g. Møller and Waagepetersen, 2004): for any (with the non-negative real numbers)
(1) |
More intuitively, for any pairwise distinct points , is the probability that for each , has a point in an infinitesimally small region around with volume . The intensity function corresponds to the case , i.e. . The pair correlation function is obtained by normalizing the second-order joint intensity :
(2) |
for pairwise distinct and where is set to 0 if or is zero. Intuitively, [] means that presence of a point at increases [decreases] the probability of observing a further point at and vice versa. We assume that is observed on some bounded domain with volume and without loss of generality we assume that for all (otherwise we just replace by provided the latter set has positive volume).
We will always assume that is second-order intensity reweighted stationary (Baddeley et al., 2000), meaning that its pair correlation function is invariant by translations. We then, with an abuse of notation, write for for any . We will also consider the case of an isotropic pair correlation function in which case depends only on the distance .
For the presentation of the second-order variational type equation in the next section some additional notation is needed. For a function which is differentiable on , we denote by
the gradient vector with respect to the
coordinates. The inner product is denoted by a ‘’ and for , a multivariate function such that each component is differentiable on , we define the divergence operator by2.2 Second-order variational equations
In this section, we present our new second-order variational equations. The proofs of the results are given in the Appendices.
Theorem 1.
Assume is second-order intensity reweighted stationary. Let be a componentwise continuously differentiable function on . Assume that is continuously differentiable on , that , and that there exists a sequence of increasing bounded domains such that as , with piecewise smooth boundary and such that
(3) |
where stands for the outer normal measure to . Then
(4) |
where denotes the function for any and where denotes the domain translated by .
We note that condition (3) is in particular satisfied if the function is compactly supported.
We next consider the case where the pair correlation function is isotropic, i.e. for any there exists such that .
Theorem 2.
Assume is second-order intensity reweighted stationary with isotropic pair correlation function . Let be continuously differentiable on . Assume that is continuously differentiable on and that either
(5) |
or
(6) |
Then we have the two following cases. If (5) is assumed,
(7) |
where for any . Instead, if (6) is assumed,
(8) |
2.3 Sensitivity matrix
In the next section we use empirical versions of (7) and (8) to construct estimating functions for a parametric model of an isotropic pair correlation function depending on a -dimensional parameter , . We here investigate the expression for the associated sensitivity matrices.
Consider functions all fulfilling (5) and possibly depending on . By stacking the equations obtained by applying these functions for in (7) we obtain the estimating function
(9) |
where and are vector functions with components and . The sensitivity matrix is obtained as the expectation of the negated derivative (with respect to ) of (9). After applying (7) once again after differentiation we obtain the sensitivity matrix
Applying the Campbell theorem and converting to polar coordinates, we obtain
where is the surface area of the -dimensional unit ball. In case of (8) we obtain a similar expression,
By choosing for some real function , becomes at least positive semi-definite.
3 Estimation of log linear pair correlation function
We now consider the estimation of an isotropic pair correlation function of the form
(10) |
where the functions , are known. Following Section 2.3, the idea is to apply Theorem 2 times to functions , , of the form where the function will be justified and specified later. It is then remarkable that we obtain a simple estimating equation of the form . The sensitivity matrix discussed in Section 2.3 is . Provided is invertible we obtain the explicit solution
(11) |
The matrix and the vector are specified in the following corollary.
Corollary 1.
Let . Assume that and () are respectively continuously differentiable and twice continuously differentiable on . Assume either that
(12) |
or
(13) |
If (12) is assumed, we define the matrix and the vector by
(14) | ||||
(15) |
where again the edge effect factor is for any . Instead, in case of (13), we define
(16) | ||||
(17) |
Then, the equation
(18) |
is an unbiased estimating equation.
Proof.
The proof consists in applying Theorem 2 with for and in noticing that . ∎
We note that if is compactly supported in , then (12) or (13) are always valid assumptions. Another special case is also interesting: let and , then (13) is true if for any , and . This simple condition is for instance satisfied if the ’s’ are exponential covariance functions.
The results above are e.g. applicable to the case of a pair correlation function for a log Gaussian Cox process with covariance function given by a sum of known correlation functions scaled by unknown variance parameters. Assuming a known correlation function is on the other hand quite restrictive. However, any log pair correlation function can be approximated well on a finite interval using a suitable basis function expansion so that we can effectively represent it as a log linear model. We exploit this in Section
4 where we consider the case where the functions are basis functions on a bounded real interval.Remark 1.
Remark 2.
In applications of (14)-(15) for or (16)-(17) for the division by or may lead to numerical instability for pairs of close points and . This can be mitigated by a proper choice of the function . In the spatial case of we propose to define for some . With this choice of the divisors cancel out preventing very large or infinite variances of (14)-(17).
Remark 3.
The quantities (14)-(17) depend on the unknown intensity function. If the intensity function is constant equal to we can multiply (18) by whereby the resulting estimating equation no longer depends on . Thus can be estimated without estimating . Otherwise, the intensity function has to be estimated first, for instance in a parametric way, see Guan et al. (2015), and plugged into (14)-(17).
4 Variational orthogonal series estimation of the pair correlation function
In this section we consider the estimation of an isotropic pair correlation function on a bounded interval , and , using a series expansion of . Let denote an orthonormal basis of functions on with respect to some weight function , i.e. . Provided is square integrable (with respect to ) on , we have the expansion
(20) |
where the coefficients are defined by .
We propose to approximate by truncating the infinite sum up to some and obtain estimates using (18). The resulting estimate thus becomes
In the sequel this estimator is referred to as the variational (orthogonal series) estimator (VSE for short). The approach is related to Zhao (2018) who also considers an estimating equation approach to estimate a pair correlation function of the form (20) but for a number of independent point processes on . The approach in Zhao (2018) further does not yield closed form expressions for the estimates of the coefficients.
Orthogonal series estimators have already been considered by Jalilian et al. (2019) who expand instead of . They propose very simple unbiased estimators of the coefficients but the resulting estimator of , referred to as the OSE in the sequel, is not guaranteed to be non-negative.
4.1 Implementation of the VSE
Examples of orthogonal bases include the cosine basis with , and , . Another example is the Fourier-Bessel basis with and
where , is the Bessel function of the first kind of order and is the sequence of successive positive roots of . In the context of the variational equation (18) we need that the basis functions have non-zero derivatives in order to estimate . This is not the case for of the cosine basis. We therefore consider in the following the Fourier-Bessel basis.
Let , . The mean integrated squared error (MISE) for of the VSE over the interval is
(21) | ||||
Jalilian et al. (2019) chose by minimizing an estimate of the MISE for . We have, however, not been able to construct a useful estimate of (21). Instead we choose by maximizing a composite likelihood cross-validation criterion
where is the estimate of obtained using all pairs of points in except and . This is a simplified version of the cross-validation criterion introduced by Guan (2007a) in the context of non-parametric kernel estimation of the pair correlation function.
For computational simplicity and to guard against overfitting we choose inspired by Jalilian et al. (2019) the first local maximum of larger than or equal to two rather than looking for a global maximum. Note that when and in (18) have been obtained for one value of , then we obtain the and for by just adding one new row/column to the previous and one new entry to the previous .
4.2 Simulation study
We study the performance of our variational estimator using simulations of point processes with constant intensity 200 on or . We consider the case of a Poisson process for which the pair correlation function is constant equal to one, a Thomas process (parent intensity
, dispersal standard deviation
and offspring intensity ), a variance Gamma cluster process (parent intensity , shape parameter , dispersion parameter and offspring intensity ), and a determinantal point process (DPP) with exponential kernel and . The pair correlation functions for the four point process models are shown in Figures 2 and 3 in the usual scale as well as in the log scale. The Thomas and variance Gamma processes are clustered with pair correlation functions bigger than one while the DPP is repulsive with pair correlation function less than one. In all cases we consider and we let for Poisson, Thomas, and variance Gamma. For the DPP the log pair correlation function is not well-defined for and we therefore use in case of the DPP. We use (14) and (15) for computing and and referring to Remark 2 we let . For each point process we generate 500 simulations.4.2.1 Estimates of coefficients
Equations (14) and (15) are derived from (7) in which is the true pair correlation function. In practice, when considering a truncated version of (20), the estimating equation (18) is not unbiased which results in bias of the coefficient estimates. This is exemplified in case of the Thomas process in the left plot of Figure 1 which shows boxplots of the first two coefficient estimates when (20) is truncated to . In the right plot, (20) is truncated to which means that the truncated version of (20) is very close to the Thomas pair correlation function. Accordingly, the bias of the estimates is much reduced. However, the estimation variance increases when is increased. This emphasizes the importance of selecting an appropriate trade-off between bias an variance. The plots in Figure 1 also show how the variance of the coefficient estimates decreases when the observation window is increased from to .
4.2.2 Comparison of estimators
In addition to our new VSE, we also for each simulation consider the OSE proposed by Jalilian et al. (2019)
(using the Fourier-Bessel basis and their so-called simple smoothing scheme) and a standard non-parametric kernel density estimate (KDE) with bandwidth chosen by cross-validation
(Guan, 2007b; Jalilian and Waagepetersen, 2018). Figures 2 and 3 depict means of the simulated OSE and VSE estimates of and as well as 95% pointwise envelopes. The variational estimator has larger variability whereas the bias can be smaller or larger than the OSE depending on the model.Table 1 summarizes the root MISE (square root of (21)) for the three estimators across the four models.
Window | OSE | VSE | KDE | |
---|---|---|---|---|
Poisson | 0.027 (2.1) | 0.051 (2.2) | 0.093 | |
0.012 (2.0) | 0.024 (2.2) | 0.037 | ||
Thomas | 0.0995 (3.7) | (2.7) | 0.111 | |
0.044 (4.2) | 0.063 (2.9) | 0.053 | ||
Variance Gamma | 0.099 (6.5) | 0.148 (3.8) | 0.110 | |
0.050 (9.6) | 0.072 (5.3) | 0.057 | ||
DPP |
NA (3) | 0.1622 (3.6) | NA | |
NA (4.1) | 0.1582 (5.2) | NA |
: in this setting one replication produced an outlier and is omitted in the root MISE estimation)
The root MISEs are larger for the variational estimator than for the OSE and the KDE except in the Poisson case where the KDE has larger MISE than the VSE. Table 1 also reports the average of the selected for the variational estimator and the OSE. The averages of the selected ’s are pretty similar for the Poisson and DPP models while the OSE tends to select higher than the variational method for the Thomas and variance Gamma point processes.
We have also compared the computing time to evaluate the OSE and VSE. The OSE is generally cheaper except when the number of points and are large, see also the case of Capparis in Section 4.3.
4.3 Data example
We apply the three estimators to the data example considered in Jalilian et al. (2019). That is, we consider point patterns of locations of Acalypha diversifolia (528 trees), Lonchocarpus heptaphyllus (836 trees) and Capparis frondosa (3299 trees) species in the 1995 census for the Barro Colorado Island plot (Hubbell and Foster, 1983; Condit et al., 1996; Condit, 1998). The intensity functions for the point patterns are estimated as in Jalilian et al. (2019)
using log-linear regression models depending on various soil and topographical variables. The estimated pair correlation functions are shown in Figure
4.In case of Capparis and Lonchocarpus, VSE and OSE are quite similar while the kernel estimate is markedly different from the other estimates for small lags. For Acalypha, all estimates differ for small lags. The three estimates are very similar for large spatial lags for all species. The selected number for the VSE are 3, 9 and 5 for Acalypha, Capparis, and Lonchocarpus, while OSE selects for all species. In the case of Capparis, the computation time (4200 seconds) is higher for the OSE than for the VSE (1244 seconds) due to the high number of points for this species.
5 Discussion
In this paper we derive variational equations based on second order properties of a spatial point process. It is remarkable that in case of log-linear parametric models for the pair correlation function, it is possible to derive variational estimating equations which have closed form solutions for the unknown parameters. We exploit this to construct new variational orthogonal series type estimators for the pair correlation function. In contrast to previous kernel and orthogonal series estimators, our new estimate is guaranteed to be non-negative. For large data sets, the new estimator is further computationally faster than the previous orthogonal series estimate. However, in terms of accuracy as measured by MISE, the new estimator does not outperform the previous estimators. In the data example, the new estimator and the OSE gave similar results.
We believe there is further scope for exploring variational equations. For instance, one could use non-orthogonal bases for expanding the log pair correlation function instead of the orthogonal Fourier-Bessel basis used in this work. One might e.g. in future work consider so-called frames (Christensen, 2008) or spline bases.
Acknowledgments
Rasmus Waagepetersen’s and Francisco Cuevas-Pachecho’s research was supported by The Danish Council
for Independent Research — Natural Sciences, grant DFF – 7014-00074 ”Statistics for point processes in space and beyond”, and by the ”Centre for Stochastic Geometry and
Advanced Bioimaging”, funded by grant 8721 from the Villum Foundation. The research of J.-F. Coeurjolly is funded by the Natural Sciences and Engineering Research Council of Canada.
The BCI forest dynamics research project was made possible by National Science Foundation grants to Stephen P. Hubbell: DEB-0640386, DEB-0425651, DEB-0346488, DEB-0129874, DEB-00753102, DEB-9909347, DEB-9615226, DEB-9615226, DEB-9405933, DEB-9221033, DEB-9100058, DEB-8906869, DEB-8605042, DEB-8206992, DEB-7922197, support from the Center for Tropical Forest Science, the Smithsonian Tropical Research Institute, the John D. and Catherine T. MacArthur Foundation, the Mellon Foundation, the Celera Foundation, and numerous private individuals, and through the hard work of over 100 people from 10 countries over the past two decades. The plot project is part of the Center for Tropical Forest Science, a global network of large-scale demographic tree plots.
The BCI soils data set were collected and analyzed by J. Dalling, R. John, K. Harms, R. Stallard and J. Yavitt with support from NSF DEB021104, 021115, 0212284, 0212818 and OISE 0314581, STRI and CTFS. Paolo Segre and Juan Di Trani provided assistance in the field. The covariates dem, grad, mrvbf, solar and twi were computed in SAGA GIS by Tomislav Hengl (http://spatial-analyst.net/).
References
- Almeida et al. [1993] M.P. Almeida, , and B. Gidas. A variational method for estimating the parameters of MRF from complete or incomplete data. Annals of Applied Probability, 3(1):103–136, 1993.
- Baddeley and Dereudre [2013] A. Baddeley and D. Dereudre. Variational estimators for the parameters of Gibbs point process models. Bernoulli, 19(3):905–930, 2013.
- Baddeley et al. [2015] A. Baddeley, E. Rubak, and R. Turner. Spatial point patterns: methodology and applications with R. Chapman and Hall/CRC, 2015.
- Baddeley et al. [2000] A. J. Baddeley, J. Møller, and R. Waagepetersen. Non- and semi-parametric estimation of interaction in inhomogeneous point patterns. Statistica Neerlandica, 54:329–350, 2000.
- Christensen [2008] O. Christensen. Frames and Bases - an introductory course. Applied and numerical analysis. Birkhäuser, Basel, 2008.
- Coeurjolly and Møller [2014] J.-F. Coeurjolly and J. Møller. Variational approach for spatial point process intensity estimation. Bernoulli, 20(3):1097–1125, 2014.
- Condit [1998] R. Condit. Tropical Forest Census Plots. Springer-Verlag and R. G. Landes Company, Berlin, Germany and Georgetown, Texas, 1998.
- Condit et al. [1996] R. Condit, S. P. Hubbell, and R. B. Foster. Changes in tree species abundance in a neotropical forest: impact of climate change. Journal of Tropical Ecology, 12:231–256, 1996.
- Evans and Gariepy [1992] L.C. Evans and R.F.. Gariepy. Measure theory and fine properties of functions. Studies in Advanced Mathematics. CRC Press, Boca Raton, FL, 1992.
- Guan [2007a] Y. Guan. A composite likelihood cross-validation approach in selecting bandwidth for the estimation of the pair correlation function. Scandinavian Journal of Statistics, 34(2):336–346, 2007a.
- Guan [2007b] Y. Guan. A least-squares cross-validation bandwidth selection approach in pair correlation function estimations. Statistics & Probability Letters, 77(18):1722–1729, 2007b.
- Guan et al. [2015] Y. Guan, A. Jalilian, and R. Waagepetersen. Quasi-likelihood for spatial point processes. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 77(3):677–697, 2015.
- Hubbell and Foster [1983] S. P. Hubbell and R. B. Foster. Diversity of canopy trees in a neotropical forest and implications for conservation. In S. L. Sutton, T. C. Whitmore, and A. C. Chadwick, editors, Tropical Rain Forest: Ecology and Management, pages 25–41. Blackwell Scientific Publications, Oxford, 1983.
- Illian et al. [2008] J. Illian, A. Penttinen, H. Stoyan, and D. Stoyan. Statistical analysis and modelling of spatial point patterns, volume 70. John Wiley & Sons, 2008.
- Jalilian and Waagepetersen [2018] A. Jalilian and R. Waagepetersen. Fast bandwidth selection for estimation of the pair correlation function. Journal of Statistical Computation and Simulation, 88(10):2001–2011, 2018.
- Jalilian et al. [2019] A. Jalilian, Y. Guan, and R. Waagepetersen. Orthogonal series estimation of the pair correlation function of a spatial point process. Statistica Sinica, 2019. To appear. Available at arXiv:1702.01736.
- Møller and Waagepetersen [2004] J. Møller and R. Waagepetersen. Statistical inference and simulation for spatial point processes. CRC Press, 2004.
- Zhao [2018] C. Zhao. Estimating equation estimators for the pair correlation function. Open access dissertations 2166, University of Miami, 2018.
Appendix A Proof of Theorem 1
Proof.
Using the Campbell theorem (1) and since , we start with
Using first the invariance by translation of and , second Fubini’s theorem, and third a change of variables, this reduces to
By assumption, we have using the dominated convergence theorem,
We can now use the standard trace theorem (see e.g. Evans and Gariepy [1992]) and obtain
From (3), we deduce from the dominated convergence theorem that
Finally, using successively a change of variable and the Campbell theorem we get
which proves (4). ∎
Appendix B Proof of Theorem 2
Proof.
Both (7) and (8) are proved similarly. We focus only on (8) and follow the proof of Theorem 1. Using the Campbell theorem (1), the fact and finally a change to polar coordinates, we have
Using the dominated convergence theorem, partial integration and (6) we have
A change to polar coordinates and the Campbell theorem again lead to
∎