Scalable model selection for spatial additive mixed modeling: application to crime analysis

08/08/2020
by Daisuke Murakami, et al.

Rapid growth in spatial open datasets has created strong demand for regression approaches that accommodate both spatial and non-spatial effects in big data. Model selection is particularly important for stably estimating flexible regression models. However, conventional methods can be slow for large samples. Hence, we develop a fast and practical model-selection approach for spatial regression models, focusing on the selection of coefficient types: constant, spatially varying, and non-spatially varying coefficients. A pre-processing approach, which replaces data matrices with small inner products through dimension reduction, dramatically accelerates model selection. Numerical experiments show that our approach selects the true model accurately and computationally efficiently, highlighting the importance of model selection in the spatial regression context. We then apply the approach to open data to investigate local factors affecting crime in Japan. The results suggest that our approach is useful not only for extracting effective crime factors but also for predicting crime events. This scalable model selection will be key to appropriately specifying flexible and large-scale spatial regression models in the era of big data. The developed model-selection approach is implemented in the R package spmoran.
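The idea of replacing data matrices with small inner products can be illustrated with a minimal sketch. This is not the authors' algorithm and not the spmoran API; it is the plain least-squares analogue, under the assumption that candidate models differ only in which columns of a design matrix they use. The function names (`precompute`, `bic_from_products`, `select`) and the simulated data are hypothetical. The point it shows is that once X'X, X'y, and y'y are computed in a single pass over the N rows, every candidate model can be scored without touching the full data again.

```python
import itertools
import numpy as np


def precompute(X, y):
    """Single pass over the N rows; every returned object is O(p^2) or smaller."""
    return X.T @ X, X.T @ y, float(y @ y), X.shape[0]


def bic_from_products(xtx, xty, yty, n, cols):
    """BIC of the least-squares fit on columns `cols`, using inner products only."""
    idx = list(cols)
    A = xtx[np.ix_(idx, idx)]          # sub-block of X'X for this candidate
    b = xty[idx]                       # matching entries of X'y
    beta = np.linalg.solve(A, b)
    rss = yty - b @ beta               # RSS = y'y - b' A^{-1} b
    return n * np.log(rss / n) + len(idx) * np.log(n)


def select(xtx, xty, yty, n, always=(0,)):
    """Exhaustive search over column subsets; cost per candidate is independent of N."""
    p = xtx.shape[0]
    optional = [j for j in range(p) if j not in always]
    best = None
    for k in range(len(optional) + 1):
        for extra in itertools.combinations(optional, k):
            cols = tuple(always) + extra
            score = bic_from_products(xtx, xty, yty, n, cols)
            if best is None or score < best[0]:
                best = (score, cols)
    return best


# Simulated data (an assumption for illustration): columns 1 and 3 carry signal.
rng = np.random.default_rng(0)
n = 20000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])
y = 1.0 + 2.0 * X[:, 1] - 1.5 * X[:, 3] + rng.normal(size=n)

xtx, xty, yty, n_obs = precompute(X, y)
best_score, best_cols = select(xtx, xty, yty, n_obs)
print("selected columns:", best_cols)
```

In this sketch the search loop works only with 5 x 5 matrices, so its cost does not grow with the sample size; only `precompute` scales with N. The paper's setting additionally involves dimension reduction and mixed-effects terms, which this example does not attempt to reproduce.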

research
06/04/2020

Model selection criteria for regression models with splines and the automatic localization of knots

In this paper we propose a model selection approach to fit a regression ...
research
02/16/2022

Automated surface feature selection using SALSA2D: An illustration using Elephant Mortality data in Etosha National Park

This analysis is motivated by the MIKE dataset in Etosha National Park (...
research
08/25/2023

GeoExplainer: A Visual Analytics Framework for Spatial Modeling Contextualization and Report Generation

Geographic regression models of various descriptions are often applied t...
research
05/25/2022

Factorized Structured Regression for Large-Scale Varying Coefficient Models

Recommender Systems (RS) pervade many aspects of our everyday digital li...
research
05/02/2023

On the selection of optimal subdata for big data regression based on leverage scores

Regression can be really difficult in case of big datasets, since we hav...
research
05/20/2020

Balancing spatial and non-spatial variation in varying coefficient modeling: a remedy for spurious correlation

This study discusses the importance of balancing spatial and non-spatial...
research
06/22/2022

Optimally Weighted Ensembles of Regression Models: Exact Weight Optimization and Applications

Automated model selection is often proposed to users to choose which mac...
