Correcting Sociodemographic Selection Biases for Accurate Population Prediction from Social Media

11/10/2019
by Salvatore Giorgi, et al.

Social media is increasingly used for large-scale population predictions, such as estimating community health statistics. However, social media users are typically not a representative sample of the intended population, a problem known as "selection bias". Across five tasks for predicting US county population health statistics from Twitter, we explore standard restratification techniques: bias mitigation approaches that reweight people-specific variables according to how under-sampled their socio-demographic groups are. We found that standard restratification provided no improvement and often degraded population prediction accuracy. The core reason appeared to be that the estimates of each population's socio-demographics were both shrunken and sparse. We therefore develop and evaluate three methods to address these problems: predictive redistribution to account for shrinking, and adaptive binning and informed smoothing to handle sparse socio-demographic estimates. We show that each of our methods can significantly improve over the standard restratification approaches. Combining approaches, we find substantial improvements over non-restratified models as well, yielding a 35.4 life satisfaction, and an 10.0
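To make the reweighting idea concrete, the following is a minimal sketch of generic post-stratification for a single county, not the authors' implementation. It assumes each user has already been assigned to one socio-demographic bin and that the county's true population share for each bin is known (for example, from census data). The function names, bin labels, and numbers are hypothetical and chosen only for illustration.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Weight per group = population share / sample share.

    Groups with no sampled users get no weight; this is the sparsity
    problem the abstract refers to (assumption: such bins are simply skipped).
    """
    counts = Counter(sample_groups)
    n = len(sample_groups)
    weights = {}
    for group, pop_share in population_shares.items():
        sample_share = counts.get(group, 0) / n
        if sample_share > 0:
            weights[group] = pop_share / sample_share
    return weights


def restratified_mean(values, sample_groups, population_shares):
    """Weighted mean of a user-level variable using the weights above."""
    weights = poststratification_weights(sample_groups, population_shares)
    numerator = 0.0
    denominator = 0.0
    for value, group in zip(values, sample_groups):
        w = weights.get(group, 0.0)
        numerator += w * value
        denominator += w
    return numerator / denominator


# Hypothetical county: younger users are over-sampled relative to the population.
groups = ["young", "young", "young", "old"]
feature = [1.0, 1.0, 1.0, 5.0]
shares = {"young": 0.5, "old": 0.5}
estimate = restratified_mean(feature, groups, shares)
```

In this toy case the unweighted mean is 2.0, while the restratified estimate moves toward the under-sampled group. A bin with no sampled users contributes nothing here, which illustrates why sparse socio-demographic estimates can make standard restratification unreliable.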

