Creating Compact Regions of Social Determinants of Health

09/23/2022
by Barrett Lattimer, et al.

Regionalization is the process of partitioning a dataset into contiguous regions that are internally homogeneous and heterogeneous with respect to one another. Many algorithms exist for performing regionalization; however, running them on large real-world data sets has only become computationally feasible in recent years. Very few studies have compared regionalization methods, and those that have lack analysis of memory use, scalability, geographic metrics, and large-scale real-world applications. This study compares state-of-the-art regionalization methods, namely Agglomerative Clustering, SKATER, REDCAP, AZP, and Max-P-Regions, using real-world social determinants of health (SDOH) data. The scale of this data, up to 1 million data points in this study, not only allows the algorithms to be compared across different data sets but also provides a stress test for each individual algorithm, most of which have never previously been run at such scale. We use several new geographic metrics to compare the algorithms and also perform a comparative memory analysis. The prevailing regionalization method is then compared with unconstrained K-Means clustering on its ability to separate real health data in Virginia and Washington, DC.
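The contiguity requirement in the definition above is what separates regionalization from unconstrained clustering such as K-Means. The following is a minimal sketch of that one property, not any of the algorithms compared in the study: it checks whether a labeling of grid cells forms contiguous regions. The grid layout, the rook (edge-sharing) adjacency rule, and the names `rook_neighbors` and `is_contiguous` are all assumptions made for illustration.

```python
from collections import deque


def rook_neighbors(cell, n_rows, n_cols):
    """Yield the edge-sharing neighbors of a grid cell (assumed adjacency rule)."""
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < n_rows and 0 <= nc < n_cols:
            yield (nr, nc)


def is_contiguous(labels):
    """Return True if every region label forms one connected component.

    `labels` is a list of rows, each a list of region ids (hypothetical input format).
    """
    n_rows, n_cols = len(labels), len(labels[0])
    seen_cells = set()
    filled_regions = set()
    for r in range(n_rows):
        for c in range(n_cols):
            if (r, c) in seen_cells:
                continue
            region = labels[r][c]
            # An unseen cell whose region was already flood-filled means
            # the region is split into at least two disconnected pieces.
            if region in filled_regions:
                return False
            filled_regions.add(region)
            # Breadth-first flood fill over same-label neighbors.
            queue = deque([(r, c)])
            seen_cells.add((r, c))
            while queue:
                cell = queue.popleft()
                for nb in rook_neighbors(cell, n_rows, n_cols):
                    if nb not in seen_cells and labels[nb[0]][nb[1]] == region:
                        seen_cells.add(nb)
                        queue.append(nb)
    return True


# A labeling where each region is a single connected block of cells.
regionalized = [
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 1],
]

# A labeling where region 0 is scattered, as unconstrained clustering may produce.
unconstrained = [
    [0, 1, 0],
    [1, 1, 1],
    [2, 2, 0],
]

print(is_contiguous(regionalized))
print(is_contiguous(unconstrained))
```

A regionalization method must always produce labelings that pass a check of this kind, whereas K-Means, which clusters on attribute values alone, offers no such guarantee.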


