Benchmarking Bayesian Improved Surname Geocoding Against Machine Learning Methods

06/26/2022
by Ari Decter-Frain, et al.

Bayesian Improved Surname Geocoding (BISG) is the most popular method for proxying race/ethnicity in voter registration files that do not record it. This paper benchmarks BISG against a range of previously untested machine learning alternatives, using voter files with self-reported race/ethnicity from California, Florida, North Carolina, and Georgia. The analysis yields three key findings. First, when given identical inputs, BISG and machine learning perform similarly at estimating aggregate racial/ethnic composition. Second, machine learning outperforms BISG at individual classification of race/ethnicity. Third, the performance of all methods varies substantially across states. These results suggest that pre-trained machine learning models are preferable to BISG for individual classification. Furthermore, the mixed results at the precinct level and across states underscore the need for researchers to empirically validate their chosen race/ethnicity proxy in their populations of interest.
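
The abstract does not spell out the BISG calculation. As the method is commonly described, it applies Bayes' rule to a surname-based prior P(race | surname) and a geography-based likelihood P(geography | race), assuming surname and geography are independent given race. The sketch below illustrates that update and the two tasks the abstract distinguishes (individual classification and aggregate composition). The category labels, the function name `bisg_posterior`, and all probabilities are hypothetical and made up for illustration; they are not taken from the paper or from Census data.

```python
import numpy as np

# Illustrative category labels (an assumption; the abstract does not list them).
RACES = ["white", "black", "hispanic", "asian", "other"]


def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Bayes update: P(race | surname, geo) is proportional to
    P(race | surname) * P(geo | race).

    Assumes surname and geography are conditionally independent given race.
    Inputs are arrays of shape (n_voters, n_races).
    """
    prior = np.asarray(p_race_given_surname, dtype=float)
    likelihood = np.asarray(p_geo_given_race, dtype=float)
    unnormalized = prior * likelihood
    # Normalize each voter's row so the probabilities sum to 1.
    return unnormalized / unnormalized.sum(axis=-1, keepdims=True)


# Hypothetical inputs for two voters (made-up numbers, not real data).
surname_probs = np.array([
    [0.70, 0.20, 0.05, 0.03, 0.02],
    [0.05, 0.02, 0.90, 0.02, 0.01],
])
geo_likelihoods = np.array([
    [0.001, 0.004, 0.001, 0.001, 0.001],
    [0.002, 0.001, 0.003, 0.001, 0.001],
])

posterior = bisg_posterior(surname_probs, geo_likelihoods)

# Individual classification: assign each voter the most probable category.
individual_labels = [RACES[i] for i in posterior.argmax(axis=1)]

# Aggregate composition: average the posterior probabilities across voters.
aggregate_composition = posterior.mean(axis=0)

print(individual_labels)
print(dict(zip(RACES, aggregate_composition.round(3))))
```

Averaging posteriors and taking the per-voter maximum are different uses of the same probabilities, which is one reason a method can do well at aggregate estimation while doing less well at individual classification.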

research
04/18/2023

BISG: When inferring race or ethnicity, does it matter that people often live near their relatives?

Bayesian Improved Surname Geocoding (BISG) is a ubiquitous tool for pred...
research
07/17/2023

Can We Trust Race Prediction?

In the absence of sensitive race and ethnicity data, researchers, regula...
research
05/12/2022

Addressing Census data problems in race imputation via fully Bayesian Improved Surname Geocoding and name supplements

Prediction of an individual's race and ethnicity plays an important role...
research
08/26/2022

Race and ethnicity data for first, middle, and last names

We provide the largest compiled publicly available dictionaries of first...
research
05/17/2022

IIsy: Practical In-Network Classification

The rat race between user-generated data and data-processing systems is ...
research
05/12/2022

Characterizing patterns in police stops by race in Minneapolis from 2016-2021

The murder of George Floyd centered Minneapolis, Minnesota, in conversat...
research
03/05/2023

Estimating Racial Disparities When Race is Not Observed

The estimation of racial disparities in health care, financial services,...
