Graph-Based Spatial Segmentation of Health-Related Areal Data
Smoothing is often used to improve the readability and interpretability of noisy areal data. However, in many instances the underlying quantity is discontinuous, and specific methods are needed to estimate the piecewise-constant spatial process. A well-known approach in this setting is to segment the signal using the adjacency graph, as the graph-based fused lasso does, but this method does not scale well to large graphs. This article introduces a new method for piecewise-constant spatial estimation that (i) is fast to compute on large graphs and (ii) yields sparser models than the fused lasso (for the same amount of regularization), giving estimates that are easier to interpret. We illustrate our method on simulated data and apply it to real data on overweight prevalence in the Netherlands. Healthy and unhealthy zones are identified that cannot be explained by demographic or socio-economic characteristics. We find that our method is capable of identifying such zones and can assist policy makers with their health-improving strategies. The implementation of our method in R is publicly available at github.com/goepp/graphseg.
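As a rough illustration of what segmentation on an adjacency graph means, the Python sketch below evaluates a common form of the graph fused lasso objective (squared error plus a penalty on differences across adjacent areas) and groups areas into zones of equal estimated value. It is not the method introduced in the article, nor the graphseg API: the function names `fused_lasso_objective` and `zones`, the toy path graph, and all numbers are assumptions made up for the example.

```python
from collections import defaultdict


def fused_lasso_objective(y, beta, edges, lam):
    """Squared-error fit plus lam * sum of |beta_i - beta_j| over graph edges."""
    fit = 0.5 * sum((yi - bi) ** 2 for yi, bi in zip(y, beta))
    penalty = sum(abs(beta[i] - beta[j]) for i, j in edges)
    return fit + lam * penalty


def zones(beta, edges, tol=1e-9):
    """Connected groups of adjacent areas whose estimates are equal (within tol)."""
    parent = list(range(len(beta)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in edges:
        # Merge two adjacent areas only when their estimates coincide.
        if abs(beta[i] - beta[j]) <= tol:
            parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(beta)):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Hypothetical toy map: five areas on a path graph 0-1-2-3-4.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
y = [1.1, 0.9, 1.0, 3.2, 2.8]      # noisy observations (made up)
beta = [1.0, 1.0, 1.0, 3.0, 3.0]   # a piecewise-constant candidate estimate

print(zones(beta, edges))
print(fused_lasso_objective(y, beta, edges, lam=0.5))
```

In this toy case the candidate estimate splits the map into two zones, and only the single edge crossing the zone boundary contributes to the penalty, which is why fewer zones give a model that is easier to read.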