Identifying and Correcting Label Bias in Machine Learning

01/15/2019
by Heinrich Jiang, et al.

Datasets often contain biases which unfairly disadvantage certain groups, and classifiers trained on such datasets can inherit these biases. In this paper, we provide a mathematical formulation of how this bias can arise. We do so by assuming the existence of underlying, unknown, and unbiased labels which are overwritten by an agent who intends to provide accurate labels but may have biases against certain groups. Despite the fact that we only observe the biased labels, we are able to show that the bias may nevertheless be corrected by re-weighting the data points without changing the labels. We show, with theoretical guarantees, that training on the re-weighted dataset corresponds to training on the unobserved but unbiased labels, thus leading to an unbiased machine learning classifier. Our procedure is fast and robust and can be used with virtually any learning algorithm. We evaluate on a number of standard machine learning fairness datasets and a variety of fairness notions, finding that our method outperforms standard approaches in achieving fair classification.
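
The core idea of correcting bias by re-weighting examples while leaving the observed labels untouched can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy data, the single binary group attribute, the sigmoid-shaped weight formula, the demographic-parity-style gap, and the names `reweight` and `weighted_positive_rate`. A real pipeline would retrain a classifier on the weighted data each round and measure the gap on its predictions; to stay self-contained, this sketch measures the gap directly on the weighted labels.

```python
import math

def reweight(labels, in_group, lam):
    """Return one weight per example; labels are read, never modified."""
    s = 1.0 / (1.0 + math.exp(-lam))  # sigmoid of the group's multiplier
    weights = []
    for y, g in zip(labels, in_group):
        if not g:
            weights.append(0.5)        # no multiplier: neutral weight
        elif y == 1:
            weights.append(s)          # lam > 0 up-weights group positives
        else:
            weights.append(1.0 - s)    # lam > 0 down-weights group negatives
    return weights

def weighted_positive_rate(labels, weights, mask):
    """Weighted fraction of positive labels among the masked examples."""
    num = sum(w * y for y, w, m in zip(labels, weights, mask) if m)
    den = sum(w for w, m in zip(weights, mask) if m)
    return num / den

# Hypothetical toy data: the first four examples belong to the group
# that the (assumed) biased labeler under-rates.
labels = [1, 0, 0, 0, 1, 1, 1, 0]
in_group = [True] * 4 + [False] * 4
others = [not g for g in in_group]
original_labels = list(labels)

# Adjust the multiplier until the weighted positive rates match.
lam = 0.0
for _ in range(200):
    weights = reweight(labels, in_group, lam)
    gap = (weighted_positive_rate(labels, weights, others)
           - weighted_positive_rate(labels, weights, in_group))
    lam += gap  # step size 1.0, an arbitrary choice for this toy case

weights = reweight(labels, in_group, lam)
final_gap = (weighted_positive_rate(labels, weights, others)
             - weighted_positive_rate(labels, weights, in_group))
print("multiplier:", lam, "remaining gap:", final_gap)
```

On this toy data the multiplier settles near ln 9, where the group's weighted positive rate equals the 0.75 rate of the other examples. The resulting `weights` list could then be passed to any learner that accepts per-example weights, which is what makes this style of correction independent of the learning algorithm.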

research
06/07/2018

Residual Unfairness in Fair Machine Learning from Prejudiced Data

Recent work in fairness in machine learning has proposed adjusting for f...
research
09/18/2020

Group Fairness by Probabilistic Modeling with Latent Fair Decisions

Machine learning systems are increasingly being used to make impactful d...
research
06/19/2023

Correcting Underrepresentation and Intersectional Bias for Fair Classification

We consider the problem of learning from data corrupted by underrepresen...
research
02/13/2023

Provable Detection of Propagating Sampling Bias in Prediction Models

With an increased focus on incorporating fairness in machine learning mo...
research
05/31/2023

Signal Is Harder To Learn Than Bias: Debiasing with Focal Loss

Spurious correlations are everywhere. While humans often do not perceive...
research
09/09/2022

Fast and Accurate Importance Weighting for Correcting Sample Bias

Bias in datasets can be very detrimental for appropriate statistical est...
research
02/16/2022

Fairness constraint in Structural Econometrics and Application to fair estimation using Instrumental Variables

A supervised machine learning algorithm determines a model from a learni...
