A Generalization of Ripley's K Function for the Detection of Spatial Clustering in Areal Data

by Stella Self, et al.

Spatial clustering detection has a variety of applications in diverse fields, including identifying infectious disease outbreaks, assessing land use patterns, pinpointing crime hotspots, and locating clusters of neurons in brain imaging. While spatial clustering analysis is commonly performed on point process data, applications to areal data are frequently of interest. For example, researchers might wish to know whether census tracts with a case of a rare medical condition or an outbreak of an infectious disease tend to cluster together spatially. Since few spatial clustering methods are designed for areal data, researchers often reduce the areal data to point process data (e.g., using the centroid of each areal unit) and apply methods designed for point process data, such as Ripley's K function or the average nearest neighbor method. However, since these methods were not designed for areal data, a number of issues can arise. For example, we show that they can result in a loss of power and/or a significantly inflated type I error rate. To address these issues, we propose a generalization of Ripley's K function designed specifically to detect spatial clustering in areal data. We compare its performance to that of the traditional Ripley's K function, the average nearest neighbor method, and the spatial scan statistic in an extensive simulation study. We then evaluate the real-world performance of the method by using it to detect spatial clustering in land parcels containing conservation easements and in US counties with high pediatric overweight/obesity rates.
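
To make the centroid-based workaround concrete, the following is a minimal sketch of the traditional Ripley's K estimate applied to areal-unit centroids. It is not the generalization proposed in the paper, which the abstract does not specify. The function name `ripley_k`, the centroid coordinates, and the study-region area are all hypothetical, and no edge correction is applied.

```python
import math

def ripley_k(points, radius, area):
    """Naive Ripley's K estimate at a single radius (no edge correction).

    K(r) = area / (n * (n - 1)) * #{ordered pairs (i, j), i != j, d_ij <= r}
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    # Count ordered pairs of distinct points lying within `radius` of each other.
    close_pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= radius:
                close_pairs += 1
    return area * close_pairs / (n * (n - 1))

# Hypothetical centroids of areal units that contain a case (assumed data).
centroids = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
radius = 1.5
k_hat = ripley_k(centroids, radius=radius, area=100.0)

# Under complete spatial randomness, K(r) is pi * r^2; larger estimates
# suggest clustering at that scale.
csr_value = math.pi * radius ** 2
print(k_hat, csr_value)
```

Comparing `k_hat` with `csr_value` treats the centroids as a homogeneous point process, which ignores the size, shape, and arrangement of the underlying areal units. This is the kind of mismatch that the abstract says can cost power or inflate the type I error rate.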


