Fairness via Representation Neutralization

06/23/2021
by Mengnan Du, et al.

Existing bias mitigation methods for DNN models primarily work on learning debiased encoders. This process not only requires many instance-level annotations for sensitive attributes, but it also does not guarantee that all fairness-sensitive information has been removed from the encoder. To address these limitations, we explore the following research question: can we reduce the discrimination of DNN models by debiasing only the classification head, even with biased representations as inputs? To this end, we propose a new mitigation technique, Representation Neutralization for Fairness (RNF), which achieves fairness by debiasing only the task-specific classification head of DNN models. Specifically, we leverage samples with the same ground-truth label but different sensitive attributes and use their neutralized representations to train the classification head of the DNN model. The key idea of RNF is to discourage the classification head from capturing spurious correlations between fairness-sensitive information in the encoder representations and specific class labels. To address low-resource settings with no access to sensitive-attribute annotations, we leverage a bias-amplified model to generate proxy annotations for sensitive attributes. Experimental results on several benchmark datasets demonstrate that our RNF framework effectively reduces the discrimination of DNN models with minimal degradation in task-specific performance.
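The core neutralization step can be illustrated with a minimal sketch: each sample's encoder representation is averaged with that of a sample sharing its label but carrying a different sensitive-attribute value, and the classification head is then trained on these blended vectors. This is an assumption-laden illustration, not the authors' implementation; the function name `neutralize_representations` and the equal-weight averaging of a single counterpart are choices made here for clarity.

```python
import numpy as np

def neutralize_representations(reps, labels, sensitive, rng=None):
    """Average each representation with one from a randomly chosen sample
    that has the same label but a different sensitive-attribute value.

    Samples with no such counterpart are left unchanged. This sketch
    assumes fixed (pre-computed) encoder representations; in practice the
    neutralized vectors would be fed to the classification head during
    its training.
    """
    rng = np.random.default_rng(rng)
    reps = np.asarray(reps, dtype=float)
    labels = np.asarray(labels)
    sensitive = np.asarray(sensitive)
    out = reps.copy()
    for i in range(len(reps)):
        # Candidates: same ground-truth label, different sensitive attribute.
        candidates = np.flatnonzero((labels == labels[i]) & (sensitive != sensitive[i]))
        if candidates.size:
            j = rng.choice(candidates)
            out[i] = 0.5 * (reps[i] + reps[j])  # equal-weight neutralization
    return out
```

Because the neutralized vector carries sensitive-attribute information from both groups, the head cannot use that information to separate the classes, which is the spurious correlation RNF aims to break.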


