Bayesian Nonparametric Nonhomogeneous Poisson Process with Applications to USGS Earthquake Data

07/06/2019, by Junxian Geng et al.

Intensity estimation is a common problem in the statistical analysis of spatial point pattern data. This paper proposes a nonparametric Bayesian method for estimating the intensity of a spatial point process based on the mixture of finite mixtures (MFM) model. The MFM approach leads to a consistent estimate of the intensity of spatial point patterns across different areas while accounting for heterogeneity. An efficient Markov chain Monte Carlo (MCMC) algorithm is proposed for our method. Extensive simulation studies examine the empirical performance of the proposed method, and its usage is further illustrated with an analysis of earthquake data from the Earthquake Hazards Program of the United States Geological Survey (USGS).




1 Introduction

Earthquake analysis is a widely discussed topic in the field of seismology and dates back as early as 1894 (Omori, 1894). The existing literature (Schoenberg, 2003; Charpentier and Durand, 2015; Hu and Bradley, 2018; Nas et al., 2019; Yang et al., 2019) has discussed spatial patterns and important covariates of the occurrence and magnitude of earthquakes. Most earthquakes occur in seismic belts, narrow geographic zones on the Earth's surface. This spatial feature indicates potential heterogeneity of earthquake activity over space. Dasgupta and Raftery (1998) considered the problem of detecting features, such as minefields or seismic faults, to show the heterogeneity of earthquakes between sub-areas. Heterogeneity over space arises not only in earthquake activity but also in other point patterns, such as the locations of tree species. For data with heterogeneity among sub-areas, it is natural to treat the analysis as a clustering problem.

A spatial point process model assumes that the randomness is associated with the locations of the points, which makes it a natural model for earthquake data. Such models have been developed for analyzing spatial point pattern data (Moller and Waagepetersen, 2003; Diggle, 2013). A common problem in the statistical analysis of spatial point patterns is to investigate the intensity of the point pattern. Traditional parametric estimation approaches are discussed in Diggle (2013) and Moller and Waagepetersen (2003). Baddeley et al. (2012) described nonparametric (kernel and local likelihood) methods for estimating the effect of spatial covariates on the point process intensity; they assumed that the intensity is a function of the covariates and studied a nonparametric estimator of this function. In addition to frequentist approaches, the existing literature also discusses Bayesian approaches for spatial point processes. Leininger et al. (2017) proposed a fully Bayesian model for estimating the intensity of a spatial point process, considering model criticism and model selection both in-sample and out-of-sample. Shirota and Gelfand (2017) proposed approximate Bayesian computation (ABC) for the determinantal point process. While most existing work focuses on intensity estimation of spatial point patterns, less attention has been paid to detecting clustering structure along with the intensity estimation.

Motivated by the features of the earthquake data and the limitations of the existing methods previously discussed, this paper introduces a Bayesian nonparametric estimation of the nonhomogeneous Poisson process to capture the heterogeneity pattern in the data.

Bayesian inference provides a probabilistic framework for simultaneous inference of the number of clusters and the clustering configurations, although an unknown number of clusters poses computational burdens. In a fully Bayesian framework, a prior is assigned to the number of clusters, which must be updated at each iteration of an MCMC algorithm; this requires complicated search algorithms over variable-dimensional parameter spaces, such as the reversible jump MCMC algorithm (Green, 1995). Those algorithms are difficult to implement and automate, and are known to suffer from poor scalability and mixing.

Bayesian nonparametric approaches such as the Chinese restaurant process (CRP; Pitman, 1995) offer choices to allow uncertainty in the number of clusters. It has been empirically and theoretically observed that CRPs often have the tendency to create tiny extraneous clusters (Miller and Harrison, 2018). We instead use the mixture of finite mixtures (MFM) approach of Miller and Harrison (2018), which prunes the tiny extraneous clusters and, consequently, estimates the number of clusters consistently. The consistency of the MFM estimate of the number of clusters has been shown in Miller and Harrison (2018); Rousseau and Mengersen (2011); Geng et al. (2018). Moreover, the MFM model has a Pólya urn scheme similar to the CRP which is exploited to develop an efficient MCMC algorithm. In particular, we obtain an efficient Gibbs sampler by analytically marginalizing over the number of clusters and thus avoid complicated reversible jump MCMC algorithms or allocation samplers.

The main contribution of this work is that we introduce a nonparametric Bayesian approach based on MFM for simultaneous inference of the intensity of a spatial point pattern and the clustering information (number of clusters and clustering configurations) over the space. Furthermore, an efficient MCMC algorithm based on a Gibbs sampler is proposed for our method. Our approach avoids sampling from complicated reversible jump MCMC algorithms or allocation samplers. In addition, our proposed Bayesian approach reveals some interesting features of the earthquake data set.

The rest of the article is organized as follows. We start with a brief introduction of the data in Section 2. A review of the nonhomogeneous Poisson process (NHPP) model and the MFM approach is given in Sections 3.1 and 3.2, and our Bayesian nonparametric intensity estimation model for the nonhomogeneous Poisson process based on MFM is proposed in Section 3.3. We then present priors and posteriors and develop a Markov chain Monte Carlo (MCMC) sampling algorithm in Section 4. Simulation studies and comparisons with existing methods are provided in Section 5. In Section 6, the proposed method is employed to analyze the real USGS data set. We conclude the article with some discussion in Section 7. For ease of exposition, additional technical results are given in an appendix.

2 USGS Earthquake Data

As our real data illustration, we consider publicly available earthquake data from the Earthquake Hazards Program of the United States Geological Survey (USGS). The dataset used for this analysis contains worldwide earthquakes with magnitude above four from 10-01-2018 to 12-31-2018. This restriction reflects the belief that earthquakes with magnitude above four have some impact on daily human life.

The total number of earthquakes in our dataset is 7701. The map of the locations of the earthquakes we analyze is shown in the left panel of Figure 1. In order to analyze the spatial relationships properly, we transform the latitude and longitude of the earthquakes to a unit square; the resulting locations are shown in the right panel of Figure 1. From the left panel of Figure 1, we see that nearly 90% of the world had no earthquake occurrences during the study period. In the North America region, most earthquakes occurred along Alaska's central coast, extending north to Anchorage and Fairbanks, and along the coast from British Columbia to the Baja California Peninsula, where the Pacific plate rubs against the North American plate. South American earthquakes stretch the length of the continent's Pacific border. In Asia, most earthquakes occurred where the Australian plate wraps around the Indonesian archipelago, and also in Japan.

The observations above clearly indicate the heterogeneity of the earthquake activities over the space.

Figure 1: Map of Locations of the Earthquakes (Left: Original Map; Right: Locations in Unit Square)

3 Methodology

3.1 Spatial Poisson Process

A natural model for the earthquake data is a spatial point process model, which assumes that the randomness is associated with the locations of the points. Let S = {s_1, …, s_N} be the set of locations of points observed in a bounded region B ⊂ R², which is a realization of a spatial point process; this is called a spatial point pattern. The process N(·) is the counting process associated with the spatial point process, where N(A) counts the number of points falling in an area A ⊆ B. There are many parametric models for such counting processes, including Poisson processes, Gibbs processes, and Cox processes (Diggle, 2013). In this work, we focus on spatial Poisson processes. For a Poisson process over B with intensity function λ(s), N(A) ∼ Poisson(λ(A)), where λ(A) = ∫_A λ(s) ds. In addition, if two areas A₁ and A₂ are disjoint, then N(A₁) and N(A₂) are independent. Based on the properties of the Poisson process, we obtain E[N(A)] = λ(A). When λ(s) ≡ λ, we have constant intensity over the space B, and in this special case the process reduces to a homogeneous Poisson process (HPP). In more general cases, λ(s) can be spatially varying, which leads to a nonhomogeneous Poisson process (NHPP). For the NHPP, the likelihood on B is given by

L(λ(s), s ∈ B ∣ S) = ∏_{i=1}^{N(B)} λ(s_i) · exp( −∫_B λ(s) ds ),   (1)

where λ(s) is the intensity function at location s.

3.2 Nonparametric Bayesian methods in Spatial Poisson Process


A commonly used approach to simplify the problem induced by a non-homogeneous intensity is to partition the spatial area into n disjoint sub-areas and assume constant intensity over each sub-area (Teng et al., 2017). The number n is usually referred to as the pixel resolution or partition number over the space. Let A₁, …, A_n be a partition of B, i.e., disjoint subsets such that ∪_{j=1}^n A_j = B. For each region A_j, we assume a constant intensity λ_j over A_j. The likelihood in (1) is then written as

L(λ₁, …, λ_n ∣ S) = ∏_{j=1}^n λ_j^{N(A_j)} exp(−λ_j |A_j|) ∝ ∏_{j=1}^n Poisson(N(A_j) ∣ λ_j |A_j|),   (2)

where Poisson(· ∣ μ) is the probability mass function of the Poisson distribution with mean μ and |A_j| is the area of A_j. In later sections, we use N_j to denote N(A_j).
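The piecewise-constant likelihood (2) reduces to a product of independent Poisson probabilities, one per sub-area, so its logarithm is a simple sum. A minimal sketch (the function names are ours, not from the paper):

```python
import math

def poisson_logpmf(n, mu):
    """Log pmf of a Poisson(mu) distribution at count n."""
    return n * math.log(mu) - mu - math.lgamma(n + 1)

def nhpp_loglik(counts, areas, intensities):
    """Log-likelihood (2): independent Poisson counts per sub-area,
    N(A_j) ~ Poisson(lambda_j * |A_j|)."""
    return sum(poisson_logpmf(n, lam * a)
               for n, a, lam in zip(counts, areas, intensities))
```

Working in log space with `math.lgamma` keeps the evaluation stable even for the large counts that occur at high intensities.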

A latent clustering structure provides the ability to accommodate heterogeneity in the intensity values across the sub-areas. Let z = (z₁, …, z_n) denote a clustering of the n sub-areas into k clusters, where z_j ∈ {1, …, k} denotes the cluster assignment of the j-th sub-area. There are two problems to solve: the number of clusters and the cluster assignment for each sub-area.

Under the frequentist framework, a two-stage procedure can be used: first estimate the number of clusters, then estimate the cluster assignments given that number. Such two-stage procedures may ignore the uncertainty in the first-stage estimate of the number of clusters, and are prone to increased erroneous cluster assignments in the second stage.

In contrast, Bayesian models offer a natural solution for simultaneously estimating the number of clusters and the cluster assignments. The Chinese restaurant process (CRP; Pitman, 1995; Neal, 2000) allows for uncertainty in the number of clusters by assigning a prior distribution on the partition. In the CRP, z₁, …, z_n are defined through the following conditional distribution (i.e., a Pólya urn scheme, Blackwell et al., 1973):

P(z_{i+1} = c ∣ z₁, …, z_i) ∝ |c| if c is an existing cluster, and ∝ α if c is a new cluster.   (3)

Here |c| refers to the size of the cluster labeled c, and α is the concentration parameter of the underlying Dirichlet process. At time 1, the trivial partition {{1}} is obtained with probability 1. At time i + 1, the (i + 1)-th element is either i) added to one of the existing blocks c of the partition C_i, where each block is chosen with probability |c|/(i + α), or ii) added to the partition C_i as a new singleton block, with probability α/(i + α). Here C_i denotes a partition of the set {1, …, i}. Let t = |C_n| denote the number of blocks in the partition C_n. Under (3), one can obtain the probability of a partition C_n in terms of its block sizes as

P(C_n) = (α^t / α^{(n)}) ∏_{c ∈ C_n} (|c| − 1)!,   where α^{(n)} = α(α + 1) ⋯ (α + n − 1).   (4)
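The sequential urn scheme (3) can be simulated directly, which is also a useful way to see why small clusters keep appearing. A short sketch (function name and seeding are our own choices):

```python
import random

def sample_crp(n, alpha, rng=None):
    """Sequentially sample a CRP partition of n items via the Polya urn (3):
    item i+1 joins an existing cluster c w.p. |c|/(i + alpha),
    or opens a new cluster w.p. alpha/(i + alpha)."""
    rng = rng or random.Random(0)
    z, sizes = [], []                   # sizes[c] is the current size of cluster c
    for _ in range(n):
        weights = sizes + [alpha]       # last slot = new cluster
        c = rng.choices(range(len(weights)), weights=weights)[0]
        if c == len(sizes):
            sizes.append(1)             # open a new cluster
        else:
            sizes[c] += 1
        z.append(c)
    return z
```

Because the new-cluster weight stays fixed at α while the process runs, singleton clusters continue to be created at every stage, which is exactly the behavior the MFM modification below is designed to dampen.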
It is clear from (4) that the CRP assigns relatively large probabilities to partitions containing clusters of small size, which produces extraneous clusters in the posterior and leads to inconsistent estimation of the number of clusters even as the sample size goes to infinity. Miller and Harrison (2018) proposed a modification of the CRP, called the mixture of finite mixtures (MFM) model, to circumvent this issue:

k ∼ p(·),  (π₁, …, π_k) ∣ k ∼ Dir(γ, …, γ),  z_i ∣ k, π ∼ ∑_{h=1}^k π_h δ_h,  i = 1, …, n,   (5)

where p(·) is a proper probability mass function on {1, 2, …}, and δ_h is a point-mass at h. Miller and Harrison (2018) showed that the joint distribution of z₁, …, z_n under (5) admits a Pólya urn scheme akin to the CRP:

P(z_{i+1} = c ∣ z₁, …, z_i) ∝ |c| + γ if c is an existing cluster, and ∝ (V_n(t + 1)/V_n(t)) γ if c is a new cluster,   (6)

where V_n(t) is a coefficient of the partition distribution that needs to be precomputed,

V_n(t) = ∑_{k=1}^{∞} ( k_(t) / (γk)^{(n)} ) p(k),

where k_(t) = k(k − 1) ⋯ (k − t + 1), and (γk)^{(n)} = γk(γk + 1) ⋯ (γk + n − 1). (By convention, x_(0) = 1 and x^{(0)} = 1.) Compared to the CRP, the introduction of new clusters is slowed down by the factor V_n(t + 1)/V_n(t), thereby pruning the tiny extraneous clusters.
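The coefficients V_n(t) can be precomputed once by truncating the infinite series, working in log space to avoid overflow in the rising factorial. A sketch, assuming (as the paper does later) a truncated Poisson prior on k, here taken with mean 1:

```python
import math

def v_coef(n, t, gamma=1.0, kmax=200):
    """Precompute V_n(t) = sum_{k>=1} [k_(t) / (gamma*k)^(n)] p(k),
    truncating the infinite series at kmax. Here p(k) is assumed to be a
    Poisson(1) pmf truncated to k >= 1."""
    log_norm = math.log(1.0 - math.exp(-1.0))   # P(K >= 1) under Poisson(1)
    total = 0.0
    for k in range(max(t, 1), kmax + 1):        # k_(t) = 0 for k < t
        log_falling = sum(math.log(k - i) for i in range(t))         # k(k-1)...(k-t+1)
        log_rising = sum(math.log(gamma * k + i) for i in range(n))  # gk(gk+1)...(gk+n-1)
        log_pk = -1.0 - math.lgamma(k + 1) - log_norm                # log p(k)
        total += math.exp(log_falling - log_rising + log_pk)
    return total
```

As a sanity check, for n = t = 1 and γ = 1 the series collapses to ∑_k p(k) = 1, and V_n(t) decreases in t, so the new-cluster factor V_n(t + 1)/V_n(t) in (6) is below one.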

An alternative way to understand the natural pruning of extraneous clusters is through the probability distribution induced on the block sizes of a partition C_n with t = |C_n| blocks under the MFM. In contrast to (4), the probability of a partition in terms of its cluster sizes under the MFM is

P(C_n) = V_n(t) ∏_{c ∈ C_n} γ^{(|c|)},   where γ^{(m)} = γ(γ + 1) ⋯ (γ + m − 1).   (7)

From (4) and (7), it is easy to see that the MFM assigns comparatively smaller probabilities to highly imbalanced cluster sizes. The parameter γ controls the relative sizes of the clusters: small γ favors lower-entropy π's, while large γ favors higher-entropy π's.

3.3 MFM for Spatial Poisson Process

Adapting the MFM to the NHPP setting, the proposed model and prior can be expressed hierarchically as

N(A_j) ∣ z, λ ∼ Poisson(λ_{z_j} |A_j|),  j = 1, …, n,
λ_r ∣ k ∼ Gamma(a, b),  independently for r = 1, …, k,
z_j ∣ k, π ∼ ∑_{h=1}^k π_h δ_h,  j = 1, …, n,
(π₁, …, π_k) ∣ k ∼ Dir(γ, …, γ),
k ∼ p(·),   (8)

where n is the number of areas in the sample space, k is the number of clusters, N(A_j) is the number of points in area A_j, and Gamma(a, b) is a conjugate gamma prior on the cluster intensities. A default choice of p(·) is a Poisson distribution truncated to be positive (Miller and Harrison, 2018), which is assumed throughout the rest of the paper. We refer to the hierarchical model above as MFM-NHPP.

4 Computation

4.1 The MCMC Sampling Schemes

Our goal is to sample from the posterior distribution of the unknown parameters z = (z₁, …, z_n) and λ = (λ₁, …, λ_k). The sampler is presented in Algorithm 1, which efficiently cycles through the full conditional distributions of z_j for j = 1, …, n and of λ. The details of the full conditional distributions are in Appendix A. The main trick in the MFM approach (Miller and Harrison, 2018) for clustering is to analytically marginalize over the distribution of k and exploit the Pólya urn scheme to develop an efficient Gibbs sampler. The marginalization over k avoids complicated reversible jump MCMC algorithms or allocation samplers.

1: procedure c-MFM-NHPP
2:     Initialize z and λ.
3:     for each iter = 1 to M do
4:         Update λ conditional on z in closed form: for each cluster r = 1, …, t, draw λ_r ∼ Gamma( a + ∑_{j: z_j = r} N(A_j), b + ∑_{j: z_j = r} |A_j| ), where t is the number of clusters formed by the current z.
5:         Update z conditional on λ: for each j in (1, …, n), a closed-form expression for P(z_j = c ∣ z_{−j}, λ, ·) is available, proportional to (|c| + γ) f(N(A_j) ∣ λ_c) for an existing cluster c and to (V_n(t + 1)/V_n(t)) γ m(N(A_j)) for a new cluster, where z_{−j} denotes the partition obtained by removing z_j and m(·) is the prior predictive distribution of the count.
6:     end for
7: end procedure
Algorithm 1 Collapsed sampler for MFM-NHPP
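The label-update step of the collapsed sampler can be sketched compactly by integrating the cluster intensities out analytically under the Gamma(a, b) prior, so that each cluster's weight involves only a negative-binomial marginal. This is a minimal sketch under our assumed default hyperparameters (a = b = γ = 1, truncated-Poisson(1) prior on k), not the authors' implementation:

```python
import math
import random

def log_nb_marg(n, area, a, b):
    """Log marginal of a Poisson count with the intensity integrated out:
    log int Poisson(n | lam * area) Gamma(lam | a, b) dlam (negative binomial)."""
    return (a * math.log(b) + math.lgamma(a + n) - math.lgamma(a)
            - math.lgamma(n + 1) + n * math.log(area)
            - (a + n) * math.log(b + area))

def v_ratio(n, t, gamma, kmax=150):
    """V_n(t+1) / V_n(t) under a Poisson(1) prior on k truncated to k >= 1;
    the truncation normalizer cancels in the ratio."""
    def v(t):
        tot = 0.0
        for k in range(max(t, 1), kmax + 1):
            log_falling = sum(math.log(k - i) for i in range(t))
            log_rising = sum(math.log(gamma * k + i) for i in range(n))
            tot += math.exp(log_falling - log_rising - 1.0 - math.lgamma(k + 1))
        return tot
    return v(t + 1) / v(t)

def gibbs_z(counts, areas, z, a=1.0, b=1.0, gamma=1.0, rng=None):
    """One collapsed-Gibbs sweep over the labels z (step 5 of Algorithm 1),
    with the cluster intensities integrated out under the Gamma(a, b) prior."""
    rng = rng or random.Random(1)
    n = len(counts)
    for j in range(n):
        # sufficient statistics (size, count sum, area sum) of clusters without j
        stats = {}
        for i in range(n):
            if i != j:
                sz, cs, ar = stats.get(z[i], (0, 0, 0.0))
                stats[z[i]] = (sz + 1, cs + counts[i], ar + areas[i])
        t = len(stats)
        opts, logw = [], []
        for c, (sz, cs, ar) in sorted(stats.items()):
            # existing cluster: weight (|c| + gamma) times posterior predictive
            opts.append(c)
            logw.append(math.log(sz + gamma)
                        + log_nb_marg(counts[j], areas[j], a + cs, b + ar))
        # new cluster: gamma * V_n(t+1)/V_n(t) times prior predictive
        opts.append(max(stats, default=-1) + 1)
        logw.append(math.log(gamma * v_ratio(n, t, gamma))
                    + log_nb_marg(counts[j], areas[j], a, b))
        m = max(logw)
        w = [math.exp(x - m) for x in logw]
        z[j] = opts[rng.choices(range(len(opts)), weights=w)[0]]
    return z
```

On strongly separated counts a single sweep already splits the high- and low-count areas into different clusters, illustrating how the marginal weights drive the allocation.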

4.2 Inference of MCMC results

The estimated parameters, including the cluster assignments z and the intensities λ, are determined for each replicate from the best post-burn-in iteration, selected using Dahl's method (Dahl, 2006).

Dahl (2006) proposed a least-squares model-based clustering for estimating the clustering of observations using draws from the posterior clustering distribution. In this method, we form a membership matrix for each iteration, B^(1), …, B^(M), where M is the number of posterior samples obtained after the burn-in iterations. The membership matrix for iteration l is defined as

B^(l) = ( B^(l)(i, j) )_{i,j},  with B^(l)(i, j) = 1(z_i^(l) = z_j^(l)),

where B^(l)(i, i) = 1 for all i, and B^(l)(i, j) = 1 means observations i and j are in the same cluster at that iteration. We then calculate the least-squares distance to the Euclidean mean for each MCMC iteration and choose the best of these iterations. The procedure can be described as follows:

  • Calculate the Euclidean mean of all membership matrices, B̄ = (1/M) ∑_{l=1}^M B^(l).

  • Find the iteration l that has the least-squares distance to B̄:

    l* = argmin_{l ∈ {1, …, M}} ∑_{i=1}^n ∑_{j=1}^n ( B^(l)(i, j) − B̄(i, j) )².
The least-squares clustering has the advantage that it uses information from all the clusterings via the pairwise probability matrix and is intuitively appealing because it selects the “average” clustering instead of forming a clustering via an external, ad hoc clustering algorithm.
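The two steps above can be sketched directly; the function name is ours and the partitions are represented as label lists:

```python
def dahl_select(partitions):
    """Dahl's (2006) least-squares selection: return the index of the draw
    whose membership matrix is closest to the mean co-clustering matrix."""
    M, n = len(partitions), len(partitions[0])
    # B_bar(i, j): posterior co-clustering probability of observations i and j
    bbar = [[sum(1.0 for z in partitions if z[i] == z[j]) / M
             for j in range(n)] for i in range(n)]

    def dist(z):
        return sum((float(z[i] == z[j]) - bbar[i][j]) ** 2
                   for i in range(n) for j in range(n))

    return min(range(M), key=lambda l: dist(partitions[l]))
```

For example, with draws `[[0, 0, 1], [0, 0, 1], [0, 1, 1]]` the first draw is selected, since the partition {1, 2}, {3} appears in two of the three samples and therefore matches the mean co-clustering matrix best.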

4.3 Model Assessment

In this section, we discuss two model assessment criteria for spatial point process models. First, we introduce an in-sample criterion to assess the intensity fit of a point process model. Let D₁, …, D_m be a partition of B, i.e., disjoint subsets such that ∪_{i=1}^m D_i = B. The mean absolute error (MAE) is defined as

MAE = (1/m) ∑_{i=1}^m | λ̂(D_i) − N(D_i) |,   (11)

where λ̂(D_i) is the estimated expected number of points in region D_i and N(D_i) is the observed number of points in region D_i. Under this assessment framework, the model with the smaller MAE value has the better fit.
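The MAE in (11) is a one-line computation once fitted expected counts and observed counts per region are available (the function name is ours):

```python
def mae(est_counts, obs_counts):
    """Mean absolute error (11) between the fitted expected counts
    lambda_hat(D_i) and the observed counts N(D_i) over m regions."""
    m = len(obs_counts)
    return sum(abs(e - o) for e, o in zip(est_counts, obs_counts)) / m
```

For example, fitted counts (1.0, 2.0) against observed counts (0, 4) give an MAE of (1 + 2)/2 = 1.5.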

The aim of the second criterion, the logarithm of the pseudo-marginal likelihood (LPML; Gelfand and Dey, 1994), is to evaluate the region resolution in our Bayesian nonparametric estimation. The LPML is defined as

LPML = ∑_{j=1}^n log(CPO_j),   (12)

where CPO_j is the conditional predictive ordinate (CPO) for the j-th sub-area. Based on leave-one-out cross-validation, the CPO estimates the probability of observing N_j in the future after having already observed N_{−j}. The CPO for the j-th sub-area is defined as

CPO_j = f(N_j ∣ N_{−j}) = ∫ f(N_j ∣ λ) π(λ ∣ N_{−j}) dλ,   (13)

where N_{−j} is shorthand for (N₁, …, N_{j−1}, N_{j+1}, …, N_n),

π(λ ∣ N_{−j}) = ( ∏_{i ≠ j} f(N_i ∣ λ) ) π(λ) / c(N_{−j}),   (14)

and c(N_{−j}) is the normalizing constant. The CPO_j in (13) can be expressed as

CPO_j = ( ∫ (1 / f(N_j ∣ λ)) π(λ ∣ N) dλ )^{−1}.   (15)

Based on Hu et al. (2019), a natural Monte Carlo estimate of the LPML is given by

LPML̂ = ∑_{j=1}^n log(CPÔ_j),  with CPÔ_j = ( (1/T) ∑_{t=1}^T 1 / f(N_j ∣ λ^{(t)}) )^{−1},   (16)

where T is the number of posterior samples and λ^{(t)} is the t-th posterior sample. In real data analysis, we do not know the true resolution of the spatial domain. Based on the LPML in (16), we can evaluate the performance of different resolutions. The model with the larger LPML value is favored.
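The harmonic-mean form of the Monte Carlo estimate (16) can be sketched as follows, computing each CPO in log space for stability. The function name and the representation of posterior draws (per-draw label vectors plus per-draw cluster intensities, with unit cell areas assumed) are our own conventions:

```python
import math

def lpml(counts, areas, z_draws, lam_draws):
    """Monte Carlo LPML (16): each CPO_j is the harmonic mean, over posterior
    draws, of the Poisson likelihood f(N_j | lambda^{(t)})."""
    def logf(n, mu):
        return n * math.log(mu) - mu - math.lgamma(n + 1)
    T = len(z_draws)
    total = 0.0
    for j, (n, a) in enumerate(zip(counts, areas)):
        # log of (1/T) * sum_t 1 / f(N_j | draw t), computed stably
        logs = [-logf(n, lam_draws[t][z_draws[t][j]] * a) for t in range(T)]
        m = max(logs)
        log_inv_cpo = m + math.log(sum(math.exp(x - m) for x in logs) / T)
        total -= log_inv_cpo   # add log CPO_j
    return total
```

With a single posterior draw, the estimate reduces to the log-likelihood of the data under that draw, which makes a convenient sanity check.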

5 Simulation

5.1 Simulation Setup

We use simulation studies to illustrate the performance of the proposed MFM-NHPP approach from multiple perspectives. The data generation process is described below and is followed for the rest of this section.
Step 1: Fix the number of areas n and the true number of clusters k.
Step 2: Generate the true clustering configuration z = (z₁, …, z_n) with z_j ∈ {1, …, k}.
Step 3: Construct the intensity matrix; each cell in the matrix has an intensity value determined by its cluster. The intensity values vary across scenarios.
Step 4: Generate the number of points in each area independently as N(A_j) ∼ Poisson(λ_{z_j}) for j = 1, …, n.
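The four steps above can be sketched as follows. This is an illustrative simplification: we assign cells to clusters uniformly at random, whereas the paper's scenarios use fixed spatial patterns, and we assume unit cell areas (the function name is ours):

```python
import math
import random

def simulate_counts(n_areas, intensities, rng=None):
    """Steps 1-4 in miniature: assign each unit-area cell to a cluster
    uniformly at random (an assumed choice; the paper uses fixed spatial
    patterns), then draw N(A_j) ~ Poisson(lambda_{z_j})."""
    rng = rng or random.Random(42)
    z = [rng.randrange(len(intensities)) for _ in range(n_areas)]

    def rpois(mu):
        # Knuth's multiplication method; adequate for moderate intensities
        L, k, p = math.exp(-mu), 0, 1.0
        while True:
            p *= rng.random()
            if p <= L:
                return k
            k += 1

    counts = [rpois(intensities[zj]) for zj in z]
    return z, counts
```

The returned pair (true labels, observed counts) is exactly the input format needed by the clustering and assessment functions sketched elsewhere in this section.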

In the simulation study, two different scenarios are considered. In the first scenario, we choose three different clusters with intensities 0.2, 10, and 20. The intensity image and one simulated dataset are shown in Figure 2.

Figure 2: Scenario 1 intensity image (left) and simulated data (right)

In the second scenario, we choose six different clusters with intensities 0.2, 5, 20, 40, 80, and 200. The intensity image and one simulated dataset are shown in Figure 3.

Figure 3: Scenario 2 intensity image (left) and simulated data (right)

The first quantity of interest from the posterior is the estimate of the number of clusters k. The number of clusters is marginalized out in our collapsed Gibbs sampler, hence we do not directly obtain samples from the posterior distribution of k. However, k can still be estimated from the posterior distribution of t, the number of unique values (occupied components) in z. We obtain posterior samples and compute posterior summary measures from the post-burn-in samples. Inference on the number of clusters and the clustering configurations is obtained using the modal clustering method of Dahl (2006).

The second measure used in our performance evaluation is the Rand index (Rand, 1971), which measures the accuracy of clustering. The Rand index is defined as

RI(V₁, V₂) = (a + b) / (a + b + c + d) = (a + b) / C(n, 2),

where V₁ and V₂ are two partitions of {1, …, n}, and a, b, c, and d respectively denote the number of pairs of elements that are (a) in the same set in V₁ and the same set in V₂, (b) in different sets in V₁ and different sets in V₂, (c) in the same set in V₁ but in different sets in V₂, and (d) in different sets in V₁ and the same set in V₂. The RI ranges from 0 to 1, with a higher value indicating better agreement between the two partitions. In particular, RI = 1 indicates that V₁ and V₂ are identical (modulo labeling of the nodes).
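The pair-counting definition above translates directly into code: a pair contributes to the numerator exactly when the two partitions agree on whether its elements share a cluster (the function name is ours):

```python
from itertools import combinations

def rand_index(z1, z2):
    """Rand index: (a + b) / C(n, 2), the fraction of pairs on which the
    two partitions agree (same-cluster vs. different-cluster)."""
    pairs = list(combinations(range(len(z1)), 2))
    agree = sum((z1[i] == z1[j]) == (z2[i] == z2[j]) for i, j in pairs)
    return agree / len(pairs)
```

Note that label names are irrelevant: [0, 0, 1] and [1, 1, 0] describe the same partition and score 1.0.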

Another group of posterior quantities is the estimated intensity value for each cluster. We report two types of estimates: one based on the posterior sample at the iteration chosen by the modal clustering method of Dahl (2006), and the other based on the posterior mean after burn-in.

Without loss of generality, in all the simulation examples considered below, we employ Algorithm 1 with fixed hyperparameter choices to fit the MFM-NHPP model; we refer to this as the MFM-NHPP algorithm. A truncated Poisson prior is assumed on k. The initial number of clusters is set to a fixed value, and we randomly allocate the initial cluster configurations in all the examples. The initial values for λ are drawn from the prior distribution. We experimented with various other choices and found no evidence of sensitivity to the initialization.

5.2 Convergence Diagnostics

We present the average value of the Rand index over the first 5000 MCMC iterations from 100 randomly chosen starting configurations for the MFM-NHPP algorithm in Figure 4.

Figure 4: Average Rand Index for scenario 1 (left) and scenario 2 (right)

It can be readily seen that the Rand index converges rapidly within the MCMC iterations with reasonable variation, indicating rapid mixing and convergence of the chain. We also notice that in the three-cluster case the Rand index converges to around 0.73, while it converges to around 0.66 in the six-cluster case; this is consistent with the observations in Figure 5, where the estimate of the number of clusters is more accurate in the three-cluster case.

5.3 Estimation Performance

We now evaluate the performance of MFM-NHPP in terms of estimating the number of clusters, the accuracy of clustering (Rand index), and the estimation of the intensity value for each cluster. The two scenarios discussed in the previous section are explored. In both scenarios, 100 independent datasets are generated using the steps outlined at the beginning of the section. For each dataset, the MFM-NHPP algorithm is run for 5000 MCMC iterations, leaving out a burn-in of 2000. We report the proportion of times the true number of clusters is recovered among the replicates, as well as the average Rand index across the replicates.

Figure 5: Histograms of estimated number of clusters across 100 replicates for scenario 1 (left) and scenario 2 (right)

The summaries of the estimated number of clusters from the 100 replicates are provided in Figure 5. The left panel shows that with three true clusters, our method recovers the true number of clusters in most of the replicates. The right panel shows that even with a larger number of clusters, six, the proposed method still recovers the true number of clusters in a majority of the replicates.

Scenario 1 (average RI = 0.730)
  True intensity      Bias        SD         MSE
  0.2                 0.0114      0.0407     0.0018
  10                  0.7578      0.9460     1.4602
  20                 -1.2908      1.0683     2.7960

Scenario 2 (average RI = 0.662)
  True intensity      Bias        SD         MSE
  0.2                 0.0620      0.0506     0.0064
  5                  -0.1296      0.3764     0.1571
  20                  1.0372      2.5809     7.6703
  40                 -2.7355      2.6416    14.3911
  80                 -2.4298      8.6088    79.2740
  200               -32.2507     20.5086  1456.5080

Table 1: Summaries using Dahl's method

Scenario 1
  True intensity      Bias        SD         MSE
  0.2                 0.0174      0.0277     0.0011
  10                  1.2567      0.6760     2.0316
  20                 -1.8428      0.8402     4.0948

Scenario 2
  True intensity      Bias        SD         MSE
  0.2                 0.0986      0.0330     0.0108
  5                  -0.2462      0.3398     0.1749
  20                  1.0081      2.4109     6.7704
  40                 -2.7069      2.2358    12.2762
  80                 -2.3562      8.1082    70.6377
  200               -33.6250     19.8135  1519.2920

Table 2: Summaries using the Posterior Mean

The accuracy of clustering (Rand index) and the estimated intensity values for each cluster using Dahl's method are reported in Table 1. From these results, it can be seen that the intensity values are recovered very well in the three-cluster scenario. In the six-cluster scenario, the intensity estimates are accurate for most clusters except for the cluster with true intensity 200. The average Rand index for the two scenarios is around 0.7. From Table 2, we see that the intensity estimates from the two summary methods are consistent in general.

5.4 Comparison to Competitors

In order to compare our method with others, we use the MAE to measure the performance of the different methods. Our method is compared with 7 benchmark methods implemented in spatstat (Baddeley et al., 2005): a Poisson process with a linear trend in the coordinates; a Poisson process with a polynomial of order 3 in the coordinates; a Poisson process with a harmonic polynomial of order 2 in the coordinates; Poisson processes with 3-degree-of-freedom and 4-degree-of-freedom B-splines in the coordinates; a Strauss process with a linear trend in the coordinates; and nonparametric kernel estimation of the Poisson process intensity. Boxplots of the MAE of our method and the 7 competitors in the two simulation scenarios are shown in Figure 6. The results indicate that our proposed methods (summarized by Dahl's method of Section 4.2 and by the posterior mean) clearly outperform the competitors, with better overall intensity estimation than the other seven methods in both scenarios.

Figure 6: MAE Comparison for scenario 1 (left) and scenario 2 (right)

6 Analysis of USGS Data

In this section, we present a detailed analysis of the USGS data based on our proposed method and explore the heterogeneity of earthquake activity over space. Based on the model in (8), we divide the spatial domain at three different pixel resolutions. We employ Algorithm 1, with the same hyperparameter settings as in the simulation studies, to fit the MFM-NHPP model. A truncated Poisson prior is assumed on k. A total of 15,000 MCMC samples are saved after 5,000 burn-in iterations. The MAE values of our methods and the 7 benchmark methods mentioned in Section 5.4 are reported in Table 3.

Resolution      (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)
Resolution 1    3.86    5.32    27.9    29.7    28.5    29.7    29.8    22.1    29.3
Resolution 2    0.593   0.832   4.85    5.05    4.92    5.05    5.06    4.10    5.04
Resolution 3    0.182   0.265   1.28    1.30    1.29    1.31    1.31    1.14    1.31

Table 3: MAE Comparison for Real Data ((1): MFM-NHPP Posterior Mean; (2): MFM-NHPP Dahl's Method; (3)-(9): benchmark competitors); rows correspond to the three pixel resolutions considered.

From the results in Table 3, we see that our proposed methods have consistently better intensity estimation than the other 7 methods. Furthermore, we compare the LPML values based on (16). The LPML values at the three resolutions are 68970, 75116, and 79148, respectively. The resolution with the largest LPML, 79148, therefore has the best estimation performance. At this resolution, the estimated number of clusters based on Dahl's method is 8. The estimated intensities of the clusters are 0.015, 1.590, 1.614, 6.560, 16.677, 38.574, 125.863, and 374.011, and the numbers of areas in each cluster are 9083, 96, 524, 196, 66, 23, 10, and 2, respectively. The estimated intensity plots based on Dahl's method and on the posterior mean are shown in Figure 7.

Figure 7: Intensity Estimation given by Dahl's Method (left) and the Posterior Mean (right)

From the intensity plots and the clustering results, we see that there are two areas with very high earthquake intensity. Nearly 90% of earthquakes occur in just 5% of the area of the world, while about 90% of the world's regions have almost no earthquake occurrences. The approximate locations of the two regions belonging to the highest-intensity cluster are near Alaska's central coast and the Puerto Rico Trench, respectively. These results are consistent with seismic zone analyses (McCann, 1985; Kelleher, 1970). The two locations are shown in Figure 8.

Figure 8: Locations with the highest intensities.

7 Discussion

In this paper, we develop a nonparametric Bayesian intensity estimation method for the nonhomogeneous Poisson process based on the mixture of finite mixtures model. This statistical framework was motivated by the heterogeneity of earthquake activity over space.

Our simulation results indicate that the proposed method can recover the heterogeneity pattern of the intensity over space and obtains better intensity estimates than other intensity estimation methods for Poisson processes. As illustrated by the analysis of the USGS data, our nonparametric intensity estimator performs better than other models under Poisson assumptions while revealing the heterogeneity pattern of earthquake occurrences.

In addition, three topics beyond the scope of this paper are worth further investigation. First, a spatially dependent structure could be added to the intensities of the small areas. Second, spatial covariates are not taken into consideration here; adding spatially dependent covariates is desirable for potential improvement in intensity estimation. Third, building a nonparametric Bayesian model beyond the Poisson assumption is also left to future research.


Dr. Hu’s research was supported by Dean’s office of College of Liberal Arts and Sciences at University of Connecticut.

Appendix A Full Conditional Distributions

The full conditional distributions in Markov chain Monte Carlo (MCMC) sampling algorithm 1 in Section 4 are given as follows.

For each λ_r in λ, the full conditional distribution is

π(λ_r ∣ z, N) ∝ π(λ_r) ∏_{j: z_j = r} f(N(A_j) ∣ λ_r) ∝ λ_r^{a + ∑_{j: z_j = r} N(A_j) − 1} exp( −λ_r ( b + ∑_{j: z_j = r} |A_j| ) ).

This implies that λ_r ∣ z, N ∼ Gamma( a + ∑_{j: z_j = r} N(A_j), b + ∑_{j: z_j = r} |A_j| ). For each z_j in z, the full conditional distribution is

P(z_j = c ∣ z_{−j}, λ, N) ∝ (|c| + γ) f(N(A_j) ∣ λ_c) for an existing cluster c, and ∝ (V_n(t + 1)/V_n(t)) γ m(N(A_j)) for a new cluster,

where m(N(A_j)) = ∫ f(N(A_j) ∣ λ) Gamma(λ ∣ a, b) dλ is the prior predictive distribution of the count.


  • Baddeley et al. (2012) Baddeley, A., Chang, Y.-M., Song, Y. and Turner, R. (2012) Nonparametric estimation of the dependence of a spatial point process on spatial covariates. Statistics and its interface, 5, 221–236.
  • Baddeley et al. (2005) Baddeley, A., Turner, R. et al. (2005) spatstat: an R package for analyzing spatial point patterns. Journal of Statistical Software, 12, 1–42.
  • Blackwell et al. (1973) Blackwell, D., MacQueen, J. B. et al. (1973) Ferguson distributions via Pólya urn schemes. The annals of statistics, 1, 353–355.
  • Charpentier and Durand (2015) Charpentier, A. and Durand, M. (2015) Modeling earthquake dynamics. Journal of Seismology, 19, 721–739.
  • Dahl (2006) Dahl, D. B. (2006) Model-based clustering for expression data via a Dirichlet process mixture model. Bayesian Inference for Gene Expression and Proteomics.
  • Dasgupta and Raftery (1998) Dasgupta, A. and Raftery, A. E. (1998) Detecting features in spatial point processes with clutter via model-based clustering. Journal of the American statistical Association, 93, 294–302.
  • Diggle (2013) Diggle, P. J. (2013) Statistical Analysis of Spatial and Spatio-temporal Point Patterns. CRC Press.
  • Gelfand and Dey (1994) Gelfand, A. E. and Dey, D. K. (1994) Bayesian model choice: asymptotics and exact calculations. Journal of the Royal Statistical Society. Series B (Methodological), 501–514.
  • Geng et al. (2018) Geng, J., Bhattacharya, A. and Pati, D. (2018) Probabilistic community detection with unknown number of communities. Journal of the American Statistical Association, 1–32.
  • Green (1995) Green, P. J. (1995) Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82, 711–732.
  • Hu and Bradley (2018) Hu, G. and Bradley, J. (2018) A Bayesian spatial–temporal model with latent multivariate log-gamma random effects with application to earthquake magnitudes. Stat, 7, e179.
  • Hu et al. (2019) Hu, G., Huffer, F. and Chen, M.-H. (2019) New development of Bayesian variable selection criteria for spatial point process with applications. Tech. Rep. 18-05, University of Connecticut, Department of Statistics.
  • Kelleher (1970) Kelleher, J. A. (1970) Space-time seismicity of the Alaska-Aleutian seismic zone. Journal of Geophysical Research, 75, 5745–5756.
  • Leininger et al. (2017) Leininger, T. J., Gelfand, A. E. et al. (2017) Bayesian inference and model assessment for spatial point patterns using posterior predictive samples. Bayesian Analysis, 12, 1–30.
  • McCann (1985) McCann, W. R. (1985) On the earthquake hazards of Puerto Rico and the Virgin Islands. Bulletin of the Seismological Society of America, 75, 251–262.
  • Miller and Harrison (2018) Miller, J. W. and Harrison, M. T. (2018) Mixture models with a prior on the number of components. Journal of the American Statistical Association, 113, 340–356.
  • Moller and Waagepetersen (2003) Moller, J. and Waagepetersen, R. P. (2003) Statistical Inference and Simulation for Spatial Point Processes. CRC Press.
  • Nas et al. (2019) Nas, M., Jalilian, A. and Bayrak, Y. (2019) Spatiotemporal comparison of declustered catalogs of earthquakes in Turkey. Pure and Applied Geophysics.
  • Neal (2000) Neal, R. M. (2000) Markov chain sampling methods for Dirichlet process mixture models. Journal of computational and graphical statistics, 9, 249–265.
  • Omori (1894) Omori, F. (1894) On the after-shocks of earthquakes, vol. 7. The University.
  • Pitman (1995) Pitman, J. (1995) Exchangeable and partially exchangeable random partitions. Probability theory and related fields, 102, 145–158.
  • Rand (1971) Rand, W. M. (1971) Objective criteria for the evaluation of clustering methods. Journal of the American Statistical association, 66, 846–850.
  • Rousseau and Mengersen (2011) Rousseau, J. and Mengersen, K. (2011) Asymptotic behaviour of the posterior distribution in overfitted mixture models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73, 689–710.
  • Schoenberg (2003) Schoenberg, F. P. (2003) Multidimensional residual analysis of point process models for earthquake occurrences. Journal of the American Statistical Association, 98, 789–795.
  • Shirota and Gelfand (2017) Shirota, S. and Gelfand, A. E. (2017) Approximate Bayesian computation and model assessment for repulsive spatial point processes. Journal of Computational and Graphical Statistics, 26, 646–657.
  • Teng et al. (2017) Teng, M., Nathoo, F. and Johnson, T. D. (2017) Bayesian computation for log-Gaussian Cox processes: a comparative analysis of methods. Journal of Statistical Computation and Simulation, 87, 2227–2252.
  • Yang et al. (2019) Yang, H.-C., Hu, G. and Chen, M.-H. (2019) Bayesian variable selection for pareto regression models with latent multivariate log gamma process with applications to earthquake magnitudes. Geosciences, 9, 169.