Controlling Bias Exposure for Fair Interpretable Predictions

10/14/2022
by Zexue He, et al.

Recent work on reducing bias in NLP models usually focuses on protecting or isolating information related to a sensitive attribute (like gender or race). However, when sensitive information is semantically entangled with the task information of the input, e.g., when gender information is predictive of a profession, a fair trade-off between task performance and bias mitigation is difficult to achieve. Existing approaches perform this trade-off by eliminating bias information from the latent space, offering no control over how much bias actually needs to be removed. We argue that a favorable debiasing method should use sensitive information 'fairly' rather than blindly eliminating it (Caliskan et al., 2017; Sun et al., 2019). In this work, we provide a novel debiasing algorithm that adjusts the predictive model's belief to (1) ignore sensitive information if it is not useful for the task, and (2) use sensitive information minimally, as necessary for the prediction, while also incurring a penalty. Experimental results on two text classification tasks (influenced by gender) and an open-ended generation task (influenced by race) indicate that our model achieves a desirable trade-off between debiasing and task performance, while also producing debiased rationales as evidence.
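The abstract describes the mechanism only at a high level. As a rough illustration, below is a minimal sketch (not the authors' released code) of one standard way such a trade-off can be implemented: a classifier that pools evidence through a soft rationale mask, plus an adversarial probe whose ability to recover the sensitive attribute from that rationale is penalized. All names here (`RationaleClassifier`, `bias_probe`, `lambda_bias`) and the architecture are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class RationaleClassifier(nn.Module):
    """Illustrative model: selects a soft rationale over tokens, then
    classifies using only the selected evidence."""

    def __init__(self, vocab_size, hidden, num_labels):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.selector = nn.Linear(hidden, 1)       # scores each token for the rationale
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, tokens):
        h = self.embed(tokens)                     # (batch, seq, hidden)
        mask = torch.sigmoid(self.selector(h))     # soft rationale weights in [0, 1]
        pooled = (mask * h).mean(dim=1)            # pool only the selected evidence
        return self.classifier(pooled), mask, pooled

# Hypothetical probe that tries to predict the sensitive attribute
# (e.g., gender) from the pooled rationale representation.
bias_probe = nn.Linear(64, 2)

model = RationaleClassifier(vocab_size=10_000, hidden=64, num_labels=5)
loss_fn = nn.CrossEntropyLoss()
lambda_bias = 0.5                                  # strength of the exposure penalty

tokens = torch.randint(0, 10_000, (8, 32))         # toy batch of token ids
labels = torch.randint(0, 5, (8,))                 # task labels
sensitive = torch.randint(0, 2, (8,))              # sensitive-attribute labels

logits, mask, pooled = model(tokens)
task_loss = loss_fn(logits, labels)

# Bias-exposure penalty: the better the probe recovers the sensitive
# attribute from the rationale, the larger the penalty, so the model
# only "spends" sensitive information when it pays off in task accuracy.
probe_loss = loss_fn(bias_probe(pooled), sensitive)
total_loss = task_loss - lambda_bias * probe_loss  # adversarial-style objective
total_loss.backward()
```

In practice the probe and the main model would be updated in alternating steps (or via a gradient-reversal layer) so the probe remains a faithful measure of exposure; `lambda_bias` then controls how heavily using sensitive information is penalized, matching the abstract's idea of using it minimally, as necessary.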

