Model Mis-specification and Algorithmic Bias

05/31/2021

by Runshan Fu, et al.

Machine learning algorithms are increasingly used to inform critical decisions. There is growing concern about bias: algorithms may produce uneven outcomes for individuals in different demographic groups. In this work, we measure bias as the difference between mean prediction errors across groups. We show that even with unbiased input data, when a model is mis-specified: (1) population-level mean prediction error can still be negligible, while group-level mean prediction errors can be large; (2) such errors are not equal across groups; and (3) the difference between errors, i.e., bias, can take the worst-case realization. That is, when there are two groups of the same size, the mean prediction errors for the two groups have the same magnitude but opposite signs. In closed form, we show that such errors and bias are functions of the first and second moments of the joint distribution of features (for linear and probit regressions). We also conduct numerical experiments to show similar results in more general settings. Our work provides a first step toward decoupling the impact of different causes of bias.
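The mechanism described in the abstract can be reproduced in a small simulation. The sketch below is an illustration, not the paper's own experiment: the group feature distributions, the quadratic ground truth, and all parameter values are assumptions chosen so that a mis-specified linear fit yields a near-zero population-level mean error but sizable, opposite-signed group-level mean errors for two equal-size groups.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000  # observations per group (equal group sizes)

# Two groups whose feature distributions differ (hypothetical setup):
# same quadratic ground truth for both, so the input data are "unbiased".
x_a = rng.normal(0.0, 1.0, n)   # group A: mean 0, sd 1
x_b = rng.normal(2.0, 0.5, n)   # group B: mean 2, sd 0.5
x = np.concatenate([x_a, x_b])
y = x**2 + rng.normal(0.0, 0.1, 2 * n)  # true relationship is nonlinear

# Mis-specified model: fit a straight line y ~ b0 + b1*x by least squares.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
err = y - X @ beta  # prediction errors (residuals)

# OLS residuals sum to zero, so the population-level mean error is ~0,
# yet each group's mean error is nonzero; with equal group sizes the two
# group means have equal magnitude and opposite signs.
print("population mean error:", err.mean())        # ~0
print("group A mean error:  ", err[:n].mean())     # negative
print("group B mean error:  ", err[n:].mean())     # positive, same magnitude
```

Because the linear model absorbs the between-group difference into its slope and intercept, the curvature it cannot represent shows up as systematic over-prediction for one group and under-prediction for the other, exactly the worst-case pattern (3) in the abstract.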


