A Spatially Correlated Auto-regressive Model for Count Data

05/21/2018
by   Nicholas J. Clark, et al.
0

The statistical modeling of multivariate count data observed on a space-time lattice has generally focused on using a hierarchical modeling approach where space-time correlation structure is placed on a continuous, unobservable, process. The count distribution is then assumed to be conditionally independent given the latent process. However, in many real-world applications, especially in the modeling of criminal or terrorism data, the conditional independence between the count distributions is inappropriate. In the absence of spatial correlation, the Integer Auto-Regressive Conditionally Heteroskedastic (INGARCH) process could be used to capture this data model dependence however this model does not allow for any unexplained spatial correlation in the data. In this manuscript we propose a class of models that extends the INGARCH process to account for small scale spatial variation, which we refer to as a SPINGARCH process. The resulting model allows both data model dependence as well as dependence in a latent structure. We demonstrate how second-order properties can be used to differentiate between models in this class. Finally, we apply Bayesian inference for the SPINGARCH process demonstrating its use in modeling the spatio-temporal structure of burglaries in Chicago from 2010-2015.

READ FULL TEXT VIEW PDF

Authors

page 1

page 2

page 3

page 4

05/18/2022

Dependent Latent Class Models

Latent Class Models (LCMs) are used to cluster multivariate categorical ...
05/15/2018

Bayesian hierarchical modelling of sparse count processes in retail analytics

The field of retail analytics has been transformed by the availability o...
04/16/2018

Separating diffuse from point-like sources - a Bayesian approach

We present the starblade algorithm, a method to separate superimposed po...
12/01/2018

A Nonstationary Designer Space-Time Kernel

In spatial statistics, kriging models are often designed using a station...
10/21/2020

Improved inference for areal unit count data using graph-based optimisation

Spatial correlation in areal unit count data is typically modelled by a ...
02/23/2020

Bayesian analysis of count-valued, binary-valued, and continuous-valued responses using unknown transformations

Consider the situation where an analyst has a Bayesian statistical model...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

The modeling of count data where each observation takes place on a space-time lattice arises in multiple disciplines. In the disease literature, the number of infected patients is often aggregated over geographic areas and discrete times to protect the confidentiality of patients. In the modeling of terrorism or criminal acts, that we consider in this manuscript, data is often presented aggregated over time and space for security reasons. Even for spatial continuous and temporally continuous data, the analysis is often performed aggregated over fixed spatial and temporal domains as a matter of convenience. The challenge, then, is how to appropriately model the relationship between observations. Assumptions on either the spatial relationship or the temporal relationship between observations are necessary if any statistical analysis is to be performed. In this paper we present a novel approach for structuring space-time dependency for count data through a combination of spatial dependence in a latent process model and temporal dependence in the data model.

In the spatial statistics literature, an early attempt at structuring spatial relationships for count data was made in Besag (1974) where the data model distribution was conditionally specified given a fixed spatial neighborhood. However, as shown in Besag (1974), this results in a statistical model that only allows negative correlation. More recently, Kaiser and Cressie (1997) demonstrated how modifications could be made to the statistical model that allowed both negative and positive correlation. A similar methodology was employed in Augustin et al. (2006) to address the spatial dynamics of seed count data in agricultural models. The critical assumption in these classes of models is the distribution of the observed counts can be conditionally specified from the observed counts at spatial neighbors, a Markov assumption in space.

Advances in computation and Bayesian inference have also allowed for modeling through spatial hierarchical models similar to the Poisson log-Normal approach of Aitchison and Ho (1989). Letting be a discrete spatial location and be discrete time, spatial-temporal dependence can be introduced by assuming the existence of a latent process, that has a spatial-temporal structure characterized by . The data model is then assumed to be independent given the latent state. The idea was extended to incorporate spatial dynamics in Besag et al. (1991). Here a spatial Markov assumption was still made, but it was made in a latent, unobserved, continuous process. The spatial observations were then assumed to be independent given the latent process. This idea was also used in Wolpert and Ickstadt (1998), who used a Poisson-gamma model with spatial dependence in the latent, gamma, structure.

The concept of allowing the spatial dependence to exist only in a latent field overcomes the difficulty of only negative spatial correlation that arises in the auto-Poisson model of Besag (1974). Although this form of modeling only allows for limited dependence in the data as demonstrated in Aitchison and Ho (1989), it has become commonplace in literature. For example in Goicoa et al. (2016) mortality rates are studied using latent conditional effects for space, time, and age. This approach also has given rise to specialized analytical techniques and software to conduct efficient inference, for example Integrated Nested Laplace Approximations of Rue et al. (2009) or spBayes of Finley et al. (2007). However, this approach assumes that the dependence in the data is due to a latent, unobservable process which does not capture repeat victimization that is believed to exist in crime and violence. Repeat victimization, explained for example in Polvi et al. (1990), is the belief that an observed crime or violent act increases the likelihood of a future crime occurring at that exact same spot or against the exact same person and can be modeled assuming a data model dependence or as an observation driven process.

While count data in the spatial statistics literature has predominately been addressed through structure in a latent process, in the time series literature it has evolved quite differently. For example, the INGARCH model of Ferland et al. (2006) and Heinen (2003) is a time series model for counts where the data model is Poisson with expectation that is a function of both previous counts and previous expectations. Specifically, if we let be a time series of counts and be the -field generated by , the INGARCH model is

(1)

This results in a time series model that is a function of both the data model and a deterministic process model. Ferland et al. (2006) demonstrated how the INGARCH(1,1), given as , is analogous to an ARMA(1,1) for counts. In Fokianos et al. (2009) it was shown that a perturbed INGARCH(1,1) model was geometrically ergodic giving a unique stationary distribution and asymptotic normality of the roots of the likelihood equations. The stationary distribution of the INGARCH(1,1) process is also equivalent to a stochastic process given in Hawkes (1971), often called a self-exciting point process, where

(2)

when the process is sampled at discrete times and . In (2), is often referred to as the background intensity and is referred to as the excitation function, see Laub et al. (2015).

While the INGARCH model was motivated to model univariate time series data, there has been some recent effort to apply it to multivariate count data. Heinen and Rengifo (2007) used copulas to model the contemporaneous correlation. However there are issues with using copulas for count data and it is generally less reliable and identifiable than the use of copulas for continuous data, as explained in Genest and Nešlehová (2007). Liu (2012) allows for a spatial lag dependency through treating

as a vector and replacing

and

with a series of matrices. The author then allows for contemporaneous correlation through the bivariate Poisson. These models, though, do not naturally extend previous spatial models to the INGARCH class nor do they capture how criminologists and others believe crime actually evolves over space and time. Furthermore, as we will show, the variance to mean ratio for the INGARCH process dictates the range for the allowable autocorrelation limiting its practical use.

In this manuscript, we introduce a class of Self-Exciting Spatio-Temporal models for count data we refer to as Spatially Correlated Integer Auto-Regressive Conditionally Heteroskedastic, or SPINGARCH, models. These models maintain many of the stationarity properties of the INGARCH process while allowing for spatial correlation through the addition of a latent log-Gaussian spatially correlated process. We will demonstrate how the models arise from assumptions on how crime and violence evolves over space and time, how they retain the same stationarity properties as the INGARCH model and how they can be differentiated through assessment of second order characteristics. The SPINGARCH model also allows a much wider range of second order properties affording the modeler more flexibility in describing the autocorrelation and variance to mean ratio of the data. We will further show how to conduct inference and model assessment to differentiate between models within this class using burglaries in Chicago as an exemplar.

2 General Model

A self-exciting spatio-temporal model is characterized through the existence of a latent process that is allowed to have spatial correlation as well as a data process that has self-excitation, or a positive feedback mechanism. To model this, first denote as the data model with as fixed spatial locations and as discrete points in time. We next introduce

as a latent random variable defined on

. Finally we define a spatial set: . Note that is normally a vector in , often times representing a fixed geographic region. For example a county in the state of Iowa may be given a single representative .

Now, letting denote the history of the data process at location up until time period , the SPINGARCH process, is conditionally Poisson and has a mass function of

(3)

where

(4)

The SPINGARCH structure consists of combining (3), which is Markovian in time and (4) which is Markovian in space. The observation is conditioned on the entire past and current latent process is modeled through the use of full conditional distribution in space at each point in time. Markov assumptions reduce the conditioning to one time step and spatial neighbors.

If we let be the neighborhood matrix such that entry if and are neighbors and restrict where is the

th largest eigenvalue of

, the resulting latent process model, at each time , is a Conditional Auto-Regressive or CAR model used in Cressie and Wikle (2015).

Letting

, the CAR model has joint distribution

where is an matrix with entries if or otherwise. is a diagonal matrix with entries . For notational convenience we define . We further take for , but these terms could be further modeled as, for example, functions of spatially varying covariates. To simplify notation, we let represent the vector of parameters, and as the latent covariance matrix. We further write the covariance between location and as with the understanding that this value will depend on whether is in the neighborhood of or not. Similarly, we let all the diagonal elements of (generally assumed to be equal) be expressed as . Alternatively, as explained in Section 2.1, under some circumstances it may be reasonable to assume that the latent process could be constant across time resulting in, for each location , a single that is still Markov in space

This model extends the INGARCH(1,1) model given in (1) by taking the leading term in that model to itself be a log Gaussian spatial process. We refer to model (3) as the Spatially Correlated Integer Auto-Regressive Conditionally Heteroskedastic, or SPINGARCH, model.

2.1 Model Motivation and Relationship to Mathematical Models of Crime

The SPINGARCH model can be written as

(5)

where . Here it is clear that

is a Markov chain on state space

. From this representation, we can see the role that the various parameters play in the model. At each spatial location, , the change in expectation, , is

(6)

If this process was used to model the number of violent events that occur at space-time location () we can, similar to the mathematical model for crime given in Short et al. (2008), think of as representing the tension at location at time . The change in tension is then a function of potentially spatially varying exogenous factors, that manifest slightly differently at each point in time according to . The term captures the expected change that is due to repeat or near-repeat actions, a characteristic of violence that has been shown to exist in the social science literature, see e.g. Polvi et al. (1990) and Pease et al. (1998) whereas captures decay in tension absent repeat victimization. The , then, can be thought of as the baseline change in tension in the absence of any endogenous factors in the model.

Alternatively, the baseline tension could be envisioned to be time invariant, that is

(7)

where

The difference between these two formulations can be seen using the recursive definitions where in (5) we can write

(8)

while in (7)

(9)

both representations are clearly examples of (2) with exponential excitement functions and differing background rates. The choice of (6) or (7) will reflect the belief as to whether the exogenous effects manifest differently in time or not. To differentiate (6) from (7) we will refer to (7) as a time invariant SPINGARCH process.

If the exogenous effects in (6) were not stochastic, however, the choice of (6) or (7) would lead to models with equivalent stationary distributions. In this case the change in tension is expressed as

(10)

and is the same model as the INGARCH(1,1) process.

3 Model Properties

Here we derive the model properties for the SPINGARCH formulation given in (5) noting that similar properties hold for the time invariant SPINGARCH process. We begin by first noting that the SPINGARCH process is geometrically ergodic and converges to a unique stationary distribution. To show this, note that the INGARCH(1,1) model, when perturbed with a random sequence that has a density on the positive real line, is geometrically ergodic and subsequently converges to a unique stationary distribution as , as proved in Fokianos et al. (2009). Now, a similar argument can be made for the SPINGARCH process as is a Markov chain on the state space and is perturbed by the multivariate log-Gaussian density. However, unlike the proof in Fokianos et al. (2009)

further perturbation is not needed as the log Gaussian spatially correlated errors allow the chain to always have a positive probability for visiting any set of positive Lebesgue measure.

Proposition 1.

Under the parameter space restriction, and

, the SPINGARCH process is geometrically ergodic and admits a unique stationary distributions that has finite first two moments.

A complete proof of Proposition 1 follows closely the development in Fokianos et al. (2009) relying on Markov chain theory and is given in Appendix A. As a result of proposition 1, we can use the stationary distribution to derive first and second order properties for the SPINGARCH (1,1) model. The second order properties will enable us to differentiate between the INGARCH (1,1) and the SPINGARCH as well as between different SPINGARCH formulations.

3.1 First Order Properties

To derive the expectation for data from the Self-Exciting Poisson CAR model, we first note that

has a multivariate log-normal distribution. Therefore, as the natural parameter is linked exponentially with the linear predictor, using properties of the Poisson distribution, we have

(11)

which, at stationarity, yields, . The existence of self-excitation within the data model, or , increases the marginal expectation for the data model in a manner similar to the INGARCH(1,1) model.

3.2 Second Order Properties

The SPINGARCH model allows for flexible modeling of variances. In particular, the variance to mean ratio can be manipulated without impacting the autocorrelation in time, which distinguishes it from an INGARCH(1,1) model. In addition, spatial structure may be modeled through a combination of large-scale and small-scale spatial processes.

3.2.1 Variance

To see how the variance to mean ratio can be adjusted under the SPINGARCH model we can first compute the marginal variance of . To find this value we exploit the independence of and yielding

(12)
(13)
(14)

Under the conditions in Proposition 1 we have second order temporal stationarity and subsequently,

(15)

Therefore, similar to the INGARCH(1,1) process, the SPINGARCH process allows for the modeling of overdispersion. Furthermore, as we will show below, overdispersion can be accounted for without impacting the range of possible autocorrelation.

3.2.2 Temporal Covariance

To see the impact of adjusting the mean to variance ratio on the temporal covariance we find the lag-one autocorrelation by relying on the second order stationarity implicit in Proposition 1. As derived in Appendix B the autocovariance under the SPINGARCH model is

(16)

In particular, if in (5), the lag-one autocorrelation for the process is and in general, the lag- autocorrelation is .

The significance of this is that it allows a great deal of flexibility in capturing second order properties of the data. The SPINGARCH process with , for example, has a lag-one auto correlation of , and a variance to mean ratio of . Therefore, through manipulating the we can manipulate the variance to mean ratio without impacting the autocorrelation. If, for example, we desire data that has a variance to mean ratio of 2, the SPINGARCH process with could have a lag-1 autocorrelation between 0 and 0.707. The INGARCH (1,1), on the other hand, with a variance to mean ratio of 2, must have an autocorrelation between 0.5 and 0.707.

3.2.3 Spatial Covariance and Correlation

The SPINGARCH model also allows for limited spatial correlation Recalling that is the marginal covariance between and , the spatial covariance between and is

(17)

A proof of this is is given in Appendix B. From (17) it is clear that the spatial covariance is zero if the marginal covariance between and is zero. However, if there is a non-zero marginal covariance between the spatial locations, and influence the spatial correlation in the data. As can be either positive or negative, the spatial covariance, unlike the temporal covariance, can be either positive or negative.

The spatial correlation is therefore

(18)

The spatial correlation, as seen in (18), only depends on and through the expectation of , however this is a potential limitation as this implies that the range of correlations depends on values of parameters other than the parameters of the CAR process.

4 Bayesian Inference

Bayesian analysis of the SPINGARCH model can be accomplished through application of the techniques suggested by Joseph (2016). Letting the generically represent a density and generically represent a conditional density, the joint prior distribution of the parameters in the model can be expressed as where we assume independence in our priors except for and due to the restriction that . Now, the full condition of is

(19)

and the full conditional of is

(20)

In order to do any form of Markov Chain Monte Carlo inference we must sample from the density of the full latent state, which requires evaluations of

(21)

Note that as our neighborhood structure is assumed to be constant is the same for all time periods so we can write as the full space-time covariance matrix . The sparsity of the covariance structure means that the only computations of that need to occur are for spatial neighbors. Therefore, the most difficult part of the computation of the log-density is the computation of the determinant, . However, the complicated notation of the covariance structure belies the fact that the precision matrix is both block diagonal and extremely sparse that greatly simplify computations of (21). The specific structure for allows us to follow Jin et al. (2005). In particular, we have where is the neighborhood or adjacency matrix. Letting be the spectral decomposition of we have where are the eigenvalues of the neighborhood matrix. Also, as is block diagonal with each block being size and having structure , it follows that

(22)
(23)

The greatest advantage of this approach is that the eigenvalues of the neighborhood matrix, depend only on the neighborhood structure and do not depend on any parameters, therefore they can be computed ahead of time. This means that we never need to deal with matrices of the size of , rather we just need to find the eigenvalues for the neighborhood matrix. This allows the model to be fit relatively quickly using a Gibbs algorithm combined with Hamiltonian Monte Carlo or similar methods currently implemented in the software package Stan Carpenter et al. (2016).

To conduct model assessment under the above framework we rely on posterior predictive P values, see e.g.

Gelman et al. (1996). This technique samples new data sets after sampling parameters from the posterior distribution. Statistics are calculated on the new data and compared to the statistics from the original dataset. For each data set we calculate , the spatial Moran’s I statistic, , the log of the variance to mean ratio, , the variance of and , the average sample lag-one Auto-regression. and capture the spatial and temporal structure in the data while and capture the variability and roughness at each location. is intended to quantify how smooth the process is over time with a higher value of corresponding to a rougher process. As we will see in Section 5, no one of these statistics suffices to differentiate between the SPINGARCH, the time invariant SPINGARCH, and the INGARCH model. Rather, the examination of all four posterior predictive checks is generally needed in order to properly differentiate the models from one another and values of the checks much greater than or less than 0.5 should cast doubts on the fit model.

5 Simulation

The purpose of the simulations is to understand the impact of misspecifying the statistical model and to demonstrate how the above mentioned posterior predictive checks can assist in conducting model selection between the SPINGARCH, the time invariant SPINGARCH, and the INGARCH process. The spatial domain we use in the simulations is a 20 20 grid wrapped on a torus using a four nearest neighbor structure and used 100 time points.

We first assume that the generating mechanism is

(24)

The model given in (24) assumes that the change in expectation at each time period is due to the previous counts, previous expectation, and a spatially varying factor that does not change over time. This is an example of the formulation given in (7). After simulating from this model we fit a SPINGARCH model, (5), a time invariant SPINGARCH model, eqrefeq:SpaceOnly, and an INGARCH(1,1) model, (10

). Credible intervals for the parameters of each fit are given in Table

1.

SPINGARCH
Time Invariant
SPINGARCH
INGARCH
(-7.9, -6.5) (-1.0, 0.34) (-1.8,-1.7)
(0.08,0.09) (0.17,0.21) (0.38,0.40)
(0.90,0.92) (0.23,0.36) (0.56,0.58)
(0.79,1.00) (0.36,0.69) -
(0.248,0.249) (0.238,0.249) -
Table 1:

95% Credible intervals for parameter estimates using (

24) as generating mechanism.

From Table 1 clearly the true model, the time invariant SPINGARCH, covers the generating parameters while both the SPINGARCH and the INGARCH model inflate . Further, the SPINGARCH process inflates and . We next used the posterior distribution to calculate posterior predictive checks given in Table 2

Posterior Predictive
Checks
SPINGARCH
Time Invariant
SPINGARCH
INGARCH
Moran’s I 0.37 0.32 0
Var Mean Ratio 0.26 0.38 0
Var of Difference 0.2 0.58 0.04
AR(1) 0.90 0.58 1
Table 2: Posterior predictive checks simulating from the posterior distribution.

From Table 2 we see that when the data are fit to the true model the resultant posterior distribution is able to replicate data that has similar characteristics as the original data. However, this is not the case if the model is misspecified. While the INGARCH is unable to replicate any of the second order properties of the original data, the SPINGARCH does well for all except the AR(1) statistic. While in this case the SPINGARCH and the time invariant SPINGARCH seem to be able to generate very similar second order properties, we next show that when is near the edge of the parameter space this is no longer the case.

We next assume that there is very little self-excitation, and a lower spatial correlation . This generating mechanism is

(25)
SPINGARCH
Time Invariant
SPINGARCH
INGARCH
(-9.0, -8.7) (-0.34,0.58) (-2.3,-2.2)
(0.01,0.02) (0.04,0.06) (0.26,0.28)
(0.98,0.99) (0.25,0.34) (0.69,0.70)
(0.69,0.85) (0.47,0.62) -
(0.249,0.249) (0.219,0.248) -
Table 3: 95% Credible intervals for parameter estimates using (25) as generating mechanism.
Posterior Predictive
Checks Values
SPINGARCH
Time Invariant
SPINGARCH
INGARCH
Moran’s I 0.10 0.32 0
Var to Mean Ratio 0.04 0.38 0
Var of Difference 0.02 0.43 0.02
AR(1) 0.91 0.61 1
Table 4: Posterior predictive checks simulating from the posterior distribution.

The results shown in Tables 3 and 4 demonstrate that even though very little self-excitation is present in the generating mechanism, if the incorrect INGARCH model was used for inference, self-excitation would appear to be present in the data.

Finally, we consider a smaller, 10 10 spatial domain and assume that the generating mechanism is the full SPINGARCH model,

(26)

The results to fitting all three models are shown in Tables 5 and 6.

SPINGARCH
Time Invariant
SPINGARCH
INGARCH
(-0.02, 0.19) (0.67,0.83) (0.66,0.76)
(0.17,0.23) (0.35,0.37) (0.35,0.39)
(0.25,0.33) (0.01,0.07) (0.03,0.08)
(0.43,0.53) (0.01,0.03) -
(0.243,0.246) (0.09,0.249) -
Table 5: 95% Credible intervals for parameter estimates using (24) as generating mechanism.
Posterior Predictive
Checks
SPINGARCH
Time Invariant
SPINGARCH
INGARCH
Moran’s I 0.38 0 0
Var to Mean Ratio 0.43 0 0
Var of Difference 0.56 0 0
AR(1) 0.54 0.32 0.36
Table 6: Posterior predictive checks simulating from the posterior distribution.

As shown in Table 5 there is vary little difference between the time invariant SPINGARCH and the INGARCH model if the true data generating mechanism is akin to (26). Interestingly in this case, if we used the average Auto-regressive (1) coefficient for our posterior predictive checks, the INGARCH and time invariant SPINGARCH had values of 0.30-0.35 suggesting that if all we care about is our ability to replicate the AR(1) properties of our data then all three formulations would suffice.

The three simulations make the following points clear:

  • Posterior predictive checks need to account for all second order properties in order to properly distinguish between the considered models

  • Care needs to be taken prior to attributing meaning to parameters such as claiming represents repeat-victimization. As seen in fitting the INGARCH model to (25) can increase to attempt to capture other characteristics of the data.

  • Fitting the incorrect model often inflates the parameters such as when the data are generated from model (24) and fit to the SPINGARCH both and are greatly inflated

6 Burglaries in Chicago

As a case study we consider a statistical model for burglaries in the south side of Chicago during 2010-2015 using crime data from the city of Chicago. As Chicago is one of the most racially and socio-economically segregated cities in America, we consider only the southside, a relatively racial and socio-economic homogeneous region depicted in figure 1. We aggregated the number of burglaries by Census block group and by month. Within the south side of Chicago there are 552 census block groups resulting in a spatial domain of and temporal domain of . The neighborhood structure we used in the analysis was and were considered neighbors of the census blocks associated with the locations shared a border.

Figure 1: 552 Census block groups in South Chicago. This area of Chicago is relatively racially and economically homogeneous

While the racially homogeneity eliminates some sources of socio-economic variability in the data, it does not eliminate all of it. To account for this, we further consider covariates that address unique socio-economic and population characteristics for each region. Unemployment, for example, has long been shown to have a relationship with crime, see e.g. Britt (1994) and Raphael and Winter-Ebmer (2001), the later showing property crime in particular has a strong relationship with unemployment. All potential covariates were obtained from the U.S. Census Tiger data available at https://www.census.gov/geo/maps-data/data/tiger-data.html. The maximum number of burglaries in a month in a census block for this subset of the city is 17. The variance to mean ratio in the data is 1.8, suggesting there is some overdispersion in the data. There is both temporal and spatial clustering as evident by the average lag-one autocorrelation, .32, and the Moran’s I statistic of .20. There is a clear seasonality trend in the data as well as a general downward trend from 2010-2015. In order to account for this we preprocessed the data to remove the seasonality effect and the trend prior to estimating the impact of the spatial covariates and the process covariates.

6.1 Self-Exciting Spatially Correlated Model for Chicago Burglaries

In order to model this data we consider a SPINGARCH model structured as,

(27)
(28)

If we again conceptualize as the tension at location and month, , this model says that the change in tension is due to five components. The first is a baseline tension at location that can be explained through exogeneous covariates, . How manifests itself at each time point is due to some small scale spatial structure captured in as well as residual variance captured in . The change in tension is also due to repeat victimization, , as well as a natural decay over time captured in .

To model the baseline tension at location using available data from the US Census bureau, letting , we set

(29)

As we are using a spatial structure based on Census block groups we must account for the fact that each location has a different number of spatial neighbors. To adjust for this we use the weighted CAR model of Besag et al. (1991) for . This spatial process assumes that the latent conditional variance for each location in the CAR model is . This process has a joint density given in (28) letting be a diagonal matrix with entry equal to the number of neighbors of location . Recalling that is the matrix that has entries in position if and are geographically adjacent, the parameter space of under this formulation is . The term, then, captures the location specific variability common at all times. To account for the fact that along much of the parameter space of the model is nearly unidentifiable we fix near the edge of the parameter space () (see Wall (2004) for more issues on identifiability of the spatial parameter in a CAR model).

To complete the Bayesian inference we further need to place priors on all parameters in the model, except for which we have fixed as above. In order to minimize the impact of the prior selection on the posterior densities we select diffuse proper priors for and and conducted sensitivity analysis to determine that the choice of prior had minimal impact on the results.

6.2 Alternate Self-Exciting Model for Chicago Burglaries

An alternate model that describes the spread of burglaries is similar to the model of Short et al. (2008). We might assume that the change in the rate of burglaries at a location is a function of a base attractiveness due to unique geographical features at that location, as described in (29), a natural decay over time, , and repeat victimization, . In that model, the authors further considered a spatial spread, parameterized by and motivated through a reaction-diffusion difference equation. These assumptions lead to the (stochastic) difference equation

(30)

However, in when applied to the Chicago dataset, was found to be zero, potentially due to the choice in how we aggregated time and space. Letting (30) is clearly an INGARCH(1,1) model with .

6.3 Model Fit and Comparison

95% credible intervals found from fitting (27) using the procedure outlined in Section 5 and a similar Bayesian inference for (30) are given in Table 7. Note that the SPINGARCH took approximately 13 hours to fit using 3 Markov chains of 7000 iterations each using the rStan package of Carpenter et al. (2016). and visual examination of the chains indicated no evidence that they had not converged. Divergence transitions for the Markov chains were also checked and eliminated.

Parameter Model (27) INGARCH(1,1)
(-3.3,-1.0) (-4.2,-3.4)
(0.11,0.34) (0.33,0.46)
(-0.75,0.17) (0.06,0.09)
(0.05,0.16) (-0.04,0.01)
(0.006,0.07) (0.002,0.03)
(0.04,0.07) (0.22,0.24)
(0.31,0.39) (0.44,0.48)
(0.40,0.54) -
(0.40,0.47) -
Table 7: 95% Credible Intervals for parameters of the SPINGARCH model given in (27) and INGARCH(1,1) models applied to the Chicago burglary data.

The SPINGARCH parameters suggest that the process considered manifests itself differently at each unique spatial-temporal location even when residual spatial variation is accounted for in the CAR model. In other words, there may exist small-scale spatial effects that are captured in the CAR model as well as unique characteristics of each location that the CAR model does not fully explain.

Model (27) INGARCH(1,1)
- Moran’s I Statistic 0.52 0
- Variance to Mean Ratio 0.67 0
- Variance of 0.60 0.10
- AR(1) Value 0.56 0.23
Table 8: Posterior Predictive Checks for Chicago Burglary Data

In order to examine the impact of including the spatial structure we again calculate posterior predictive P-values given in Table 8. As seen in the posterior predictive P values, the INGARCH process given in (30) is not able to capture the spatial structure nor the variance to mean ratio in the data. Therefore, the large scale effects considered inadequately capture the total spatial structure in the data. The model given in (27) does a much better job of capturing the spatial and temporal correlation as well as the variance to mean ratio. To see how well the SPINGARCH model given in (27) replicates other characteristics of the data we also conducted posterior predictive checks of the maximum observed value (p=0.78 vs p=0 for INGARCH) and the number of zeros in the data set (p=0.40 vs p=0.20 for INGARCH). Finally, we note that if the only posterior predictive check that was performed was on the AR(1) statistic, both the SPINGARCH and the INGARCH models perform adequately (p=0.56 and p=0.23). Overall, while it is plausible that a model akin to (27) was the true generating mechanism, it is highly unlikely that the INGARCH model given in (29) generated the data.

Similarly to what was demonstrated in the second simulation, the potential largest implication of the above analysis is that if the INGARCH(1,1) model was fit, one would be tempted to conclude that there exists significant self-excitement due to the high value which is not present in the SPINGARCH formulation. This may lead to the potentially erroneous conclusion that there is repeat victimization present in the data. However, since the SPINGARCH model appears to fit the data better, the apparent self-excitement may actually be misspecification of the error structure in the model. As the self-excitement parameter captures either repeat burglaries or burglaries motivated by a previous successful action, concluding the existence of self-excitement may have policy implications if the model was used in practice. Contrary to this, previous research in Polvi et al. (1991) suggested that a burglarized home had an elevated risk of another burglary within six weeks, however the elevated risk, may in fact, be explained through unexplained spatial correlation as demonstrated in our analysis. Policy implications of concluding repeat victimization, or self-excitement, is present in an area are discussed in Pease et al. (1998).

While not intending to be a complete treatment of Burglary in Chicago, the above demonstration does show that the SPINGARCH process has practical use in extending the INGARCH process to sociological phenomena such as the modeling of crime and failure to account for spatial correlation may result in potentially misleading conclusions regarding the existence of self-excitement. R code for fitting both the INGARCH model and the SPINGARCH model given in (27) is available at https://github.com/nick3703/Chicago-Data.

7 Discussion

In this manuscript we formulated a statistical model that contains both latent structure spatial dependence and data model temporal dependence extendeding earlier work of Ferland et al. (2006), Fokianos et al. (2009), and Davis and Liu (2016). We also demonstrated how such models can arise from stochastic difference equations where the number of new arrivals into a process are no longer static but rather are themselves stochastic and how these assumptions are consistent with beliefs on how violence and crimes evolve over space and time.

The resulting SPINGARCH model is novel in its combination of both data model and latent model dependence and greatly extends the uses of the INGARCH(1,1) process. Though the SPINGARCH process unique combines common models from the spatial literature an the time series literature, there are a few other notable models that are similar. In Martínez-Beneito et al. (2008) the count of diseases was modeled on a space time lattice. This model assumed that the number of infected individuals was conditionally Poisson where the natural parameter was structured to be a Log INGARCH (1,0) combined with a latent process model. The latent process model was then conditionally specified similar to a spatial conditional auto-regressive CAR model and a temporal auto-regressive CAR model,

(31)
(32)
(33)

Here the log-relative risk at location and time , is a linear function of a latent Gaussian conditionally autoregressive term (CAR) in space, a latent Auto-regressive (AR) term in time, as well as a function of the previous log relative risk, .

In Mohler (2013), a discretized Cox-Hawkes model was presented that is an INGARCH(0,q) combined with a latent log-Gaussian process where the Gaussian process follows an Auto-Regressive (1) process,

(34)
(35)

The SPINGARCH model, the Martínez-Beneito et al. (2008) model, and Mohler (2013) the model are each justified through the assumed existence of two separate processes that impact the expectation, or the log-expectation. Mohler (2013) used a temporal AR(1) latent process that impacts the expectation as well as a ’self-exciting’ proces. Similarly, Martínez-Beneito et al. (2008) used a latent Spatio-temporal CAR process combined with a data driven process that impact the log-expectation of the Poisson.

The addition of covariates in a self-exciting spatial point process was used to model crime in Reinhart and Greenhouse (2018). There, the authors considered a more general marked self-exciting point process,

(36)

where was a Gaussian kernel and was the type of criminal activity. In that manuscript, the authors found that omitting a spatial covariate impacted conclusions on self-excitation. In our work, we found more generally incorrectly specifying the structure of the background rate also biases findings regarding self-excitation. As pointed out in Reinhart and Greenhouse (2018), if the basis of our inference is on the covariates, the inclusion of the spatially correlated or term may bias covariate terms.

Lastly, as evident in the example, the restriction to a spatial CAR process for the latent variable is not necessary in practice. Oftentimes a CAR specification is preferable as it is easier to model real-world phenomena conditionally. On the other hand, in Clark and Dixon (2018), both a Simultaneous Auto-Regressive (SAR) and a Vector Auto-Regressive (VAR) specification were used to model the latent variable. In the former case, the model properties derived above will still hold, however in the later case the latent state also contained temporal covariance. It is not immediately obvious that the ergodic properties of the model still exist if the latent process is allowed to propagate over time in this manner making derivations of the second order process, then, more difficult.

Appendix A Proof of Proposition 1

We will first make use of the result given in Athreya and Pantula (1986) that states for an AR(1) process, , given as

(37)

where are a sequence of random, i.i.d., variables, a sufficient condition for the existence of a unique stationary distribution is . This is extended to Vector AR(1) models in Zeevi and Glynn (2004).

Due to the temporal independence of , we note that the latent process, , for the SPINGARCH(1,0) process can be written as the VAR(1) process

(38)

where . Here, . Thus, as we can appeal to Proposition 2 of Zeevi and Glynn (2004) and conclude that admits a unique stationary distribution, and converges in distribution to as .

We next prove geometric ergodicity, and hence stationarity,for under a more general formulation

(39)

First we note that by recursion we can write

Intuitively this suggests that the impacts of the initial condition decay at an exponential rate and both the log-Gaussian and the Poisson errors further decay at a geometric rate. The general proofs of geometric ergodicity for INGARCH properties either follow Davis and Liu (2016) and rely on showing a geometric moment contraction condition, or follow Fokianos et al. (2009) and show a drift condition and associated small set condition. The model given in (39) cannot easily be shown to satisfy the geometric moment contraction condition as for , so therefore we will closely follow Fokianos et al. (2009) and show that a drift condition holds off of a compact set, , then show that is a small set, see e.g. Meyn and Tweedie (2009).

A precursor to geometric ergodicity is the notion of -irreducibility. -irreducbility, as defined in Meyn and Tweedie (2009), formally is that there exists a measure, , such that for all Borel sets, , such that , , where is the Markov chain beginning at and is the hitting time of the Markov chain. In other words, -irreducibility means that for any set that has positive measure, the Markov chain has a positive probability of eventually entering the set.

-irreducible further implies and is implied by the condition that for every set with , for some . To demonstrate this, consider the sequence, which occurs with positive probability due to the conditional Poisson density of