Correcting Underrepresentation and Intersectional Bias for Fair Classification

We consider the problem of learning from data corrupted by underrepresentation bias, where positive examples are filtered from the data at different, unknown rates for a fixed number of sensitive groups. We show that with a small amount of unbiased data, we can efficiently estimate the group-wise drop-out parameters, even in settings where intersectional group membership makes learning each intersectional rate computationally infeasible. Using these estimated group-wise drop-out rates, we construct a re-weighting scheme that allows us to approximate the loss of any hypothesis on the true distribution, even if we only observe the empirical error on a biased sample. Finally, we present an algorithm encapsulating this learning and re-weighting process, and we provide strong PAC-style guarantees that, with high probability, our estimate of the risk of the hypothesis over the true distribution will be arbitrarily close to the true risk.
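To make the estimate-then-re-weight idea concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes disjoint (non-intersectional) groups, that negatives are never filtered, and that positives in group g survive independently with probability beta_g (so the drop-out rate is 1 - beta_g). Under those assumptions, filtering scales the positive-to-negative odds in each group by beta_g, so the ratio of biased odds to unbiased odds estimates beta_g. The function names `estimate_retention` and `reweighted_risk` are hypothetical, and the toy counts are made up for illustration only.

```python
from collections import Counter


def estimate_retention(biased, unbiased):
    """Estimate beta_g, the assumed retention rate of positives in group g.

    Both arguments are lists of (group, label) pairs with label in {0, 1}.
    Assumption: only positives are filtered, so the positive/negative odds
    in the biased sample equal beta_g times the odds in the unbiased sample.
    """
    def odds(sample):
        pos, neg = Counter(), Counter()
        for group, label in sample:
            (pos if label == 1 else neg)[group] += 1
        return {g: pos[g] / neg[g] for g in neg}

    biased_odds, unbiased_odds = odds(biased), odds(unbiased)
    return {
        g: biased_odds[g] / unbiased_odds[g]
        for g in biased_odds
        if unbiased_odds.get(g, 0) > 0
    }


def reweighted_risk(biased, predictions, beta):
    """Weighted 0-1 error on the biased sample.

    Each surviving positive in group g gets weight 1 / beta_g and each
    negative gets weight 1; the weighted average approximates the error
    on the unfiltered distribution under the assumptions above.
    """
    num = den = 0.0
    for (group, label), pred in zip(biased, predictions):
        if label == 1 and beta[group] <= 0:
            raise ValueError(f"no usable retention estimate for group {group!r}")
        weight = 1.0 / beta[group] if label == 1 else 1.0
        num += weight * (pred != label)
        den += weight
    return num / den


# Toy data (illustrative only): each group is half positive in truth.
unbiased = [("a", 1)] * 50 + [("a", 0)] * 50 + [("b", 1)] * 50 + [("b", 0)] * 50
# Biased sample: group "a" keeps 1/2 of its positives, group "b" keeps 1/5.
biased = [("a", 1)] * 25 + [("a", 0)] * 50 + [("b", 1)] * 10 + [("b", 0)] * 50

beta = estimate_retention(biased, unbiased)
always_negative = [0] * len(biased)  # a hypothesis that always predicts 0
naive = sum(p != y for (_, y), p in zip(biased, always_negative)) / len(biased)
corrected = reweighted_risk(biased, always_negative, beta)
```

In this toy setup the always-negative hypothesis has true error 0.5. The naive error on the biased sample understates it, while the re-weighted error recovers it, because the inverse-retention weights restore the mass of the filtered positives. With real data the unbiased sample is small, so the estimated rates, and therefore the corrected risk, are only approximate.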


