Flexible Basis Representations for Modeling High-Dimensional Hierarchical Spatial Data
Nonstationary and non-Gaussian spatial data are prevalent across many fields (e.g., counts of animal species, disease incidences in susceptible regions, and remotely sensed satellite imagery). Due to modern data collection methods, the size of these datasets has grown considerably. Spatial generalized linear mixed models (SGLMMs) are a flexible class of models used to model nonstationary and non-Gaussian datasets. Despite their utility, SGLMMs can be computationally prohibitive for even moderately large datasets. To circumvent this issue, past studies have embedded nested radial basis functions into the SGLMM. However, two crucial specifications (knot locations and bandwidths), which directly affect model performance, are generally fixed prior to model-fitting. We propose a novel algorithm to model large nonstationary and non-Gaussian spatial datasets using adaptive radial basis functions. Our approach: (1) partitions the spatial domain into subregions; (2) selects a carefully curated set of basis knot locations within each partition; and (3) models the latent spatial surface using partition-varying and data-driven (adaptive) basis functions. Through an extensive simulation study, we show that our approach provides more accurate predictions than a competing method while preserving computational efficiency. We also demonstrate our approach on two environmental datasets that feature incidences of a parasitic plant species and counts of bird species in the United States. Our method generalizes to other hierarchical spatial models, and we provide ready-to-use code written in nimble.
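To make the three steps concrete, the following is a minimal Python sketch of a partition-varying radial basis matrix. It is not the authors' algorithm or their nimble code: the regular grid partition, the random choice of knots from the observed locations, the Gaussian kernel, and the median-distance bandwidth rule are all illustrative assumptions, and the names gaussian_rbf and partitioned_basis are hypothetical.

```python
import numpy as np

def gaussian_rbf(locs, knots, bandwidth):
    """Evaluate Gaussian radial basis functions at locs (n x 2) for knots (k x 2)."""
    # Pairwise Euclidean distances between locations and knots.
    d = np.linalg.norm(locs[:, None, :] - knots[None, :, :], axis=2)
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def partitioned_basis(locs, n_side=2, knots_per_part=4, seed=0):
    """Build a basis matrix whose columns are local to grid partitions.

    Assumes locations lie in the unit square. All tuning choices here
    are placeholders, not the paper's data-driven selection procedure.
    """
    rng = np.random.default_rng(seed)
    n = locs.shape[0]
    # Step 1 (assumed): partition the unit square into an n_side x n_side grid.
    cell = np.minimum((locs * n_side).astype(int), n_side - 1)
    part = cell[:, 0] * n_side + cell[:, 1]
    blocks = []
    for p in range(n_side * n_side):
        idx = np.flatnonzero(part == p)
        if idx.size == 0:
            continue  # no observations in this partition
        # Step 2 (assumed): pick knots at random among the partition's locations.
        k = min(knots_per_part, idx.size)
        knots = locs[rng.choice(idx, size=k, replace=False)]
        # Step 3 (assumed): partition-specific bandwidth from the median
        # nonzero location-to-knot distance within the partition.
        d = np.linalg.norm(locs[idx][:, None, :] - knots[None, :, :], axis=2)
        bw = np.median(d[d > 0]) if np.any(d > 0) else 1.0
        # Basis functions are zero outside their own partition.
        block = np.zeros((n, k))
        block[idx] = gaussian_rbf(locs[idx], knots, bw)
        blocks.append(block)
    return np.hstack(blocks), part

# Example usage on simulated locations in the unit square.
locs = np.random.default_rng(1).uniform(size=(200, 2))
B, part = partitioned_basis(locs)
```

In an SGLMM, a matrix such as B would enter the linear predictor multiplied by a vector of random basis coefficients, so the cost of fitting scales with the number of basis columns rather than the number of locations. An adaptive method would replace the fixed knot and bandwidth rules above with choices informed by the data.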