Language Models Get a Gender Makeover: Mitigating Gender Bias with Few-Shot Data Interventions

06/07/2023, by Himanshu Thakur, et al.

Societal biases present in pre-trained large language models are a critical issue: these models have been shown to propagate biases in countless downstream applications, rendering them unfair towards specific groups of people. Since large-scale retraining of such models from scratch is expensive in both time and compute, a variety of approaches have previously been proposed to de-bias a pre-trained model. While the majority of current state-of-the-art debiasing methods focus on changes to the training regime, in this paper we propose data intervention strategies as a simple yet powerful technique for reducing gender bias in pre-trained models. Specifically, we show empirically that fine-tuning a pre-trained model on as few as 10 de-biased (intervened) training examples significantly reduces its tendency to favor any gender. Because the method needs only a handful of training examples, this few-shot debiasing approach is highly feasible and practical. Through extensive experimentation, we show that the technique outperforms competitive state-of-the-art baselines with minimal loss in language modeling ability.
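The "data intervention" the abstract describes amounts to producing a small set of gender-swapped (counterfactual) training examples and fine-tuning on them. A minimal sketch of the swapping step is shown below; the word-pair list and function name are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of a gender-swap data intervention (counterfactual data
# augmentation). The word-pair list and function name are illustrative
# assumptions, not the paper's exact implementation.
GENDER_PAIRS = {
    "he": "she", "she": "he",
    "him": "her",                 # pronoun ambiguity (her -> him vs. his)
    "his": "her", "her": "his",   # is ignored in this simple sketch
    "man": "woman", "woman": "man",
    "men": "women", "women": "men",
    "boy": "girl", "girl": "boy",
    "father": "mother", "mother": "father",
}

def swap_gendered_terms(sentence: str) -> str:
    """Return a counterfactual copy of `sentence` with gendered terms swapped."""
    swapped = []
    for token in sentence.split():
        core = token.strip(".,;:!?")           # set punctuation aside for lookup
        repl = GENDER_PAIRS.get(core.lower())
        if repl is None:
            swapped.append(token)
            continue
        if core and core[0].isupper():         # preserve capitalization
            repl = repl.capitalize()
        swapped.append(token.replace(core, repl))
    return " ".join(swapped)

print(swap_gendered_terms("He gave his book to the woman."))
# -> She gave her book to the man.
```

A handful of examples produced this way could then be passed to any standard fine-tuning loop; per the abstract, around 10 such intervened examples were enough to measurably reduce gender preference.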


Related research

- Gender-tuning: Empowering Fine-tuning for Debiasing Pre-trained Language Models (07/20/2023)
- Improving Gender Fairness of Pre-Trained Language Models without Catastrophic Forgetting (10/11/2021)
- Debiasing Gender Bias in Information Retrieval Models (08/02/2022)
- Adversarial Examples Generation for Reducing Implicit Gender Bias in Pre-trained Models (10/03/2021)
- Gender Biases and Where to Find Them: Exploring Gender Bias in Pre-Trained Transformer-based Language Models Using Movement Pruning (07/06/2022)
- Blacks is to Anger as Whites is to Joy? Understanding Latent Affective Bias in Large Pre-trained Neural Language Models (01/21/2023)
- Improving Pre-trained Language Models' Generalization (07/19/2023)
