Improving Gender Fairness of Pre-Trained Language Models without Catastrophic Forgetting

10/11/2021
by Zahra Fatemi, et al.

Although pre-trained language models such as BERT achieve state-of-the-art performance on many language understanding tasks, they have been shown to inherit strong gender bias from their training data. Existing studies that address the gender bias of pre-trained models typically collect or construct gender-neutral data on their own and conduct a second phase of pre-training on the released pre-trained model with such data. However, given the limited size of the gender-neutral data and its potential distributional mismatch with the original pre-training data, catastrophic forgetting can occur during second-phase pre-training, and forgetting the original training data may degrade the model's downstream performance by a large margin. In this work, we first show empirically that even when the gender-neutral data for second-phase pre-training is drawn from the original training data, catastrophic forgetting still occurs if the gender-neutral data is smaller than the original training data. We then propose a new method, GEnder Equality Prompt (GEEP), to improve the gender fairness of pre-trained models without forgetting. GEEP freezes all pre-trained parameters, which alleviates forgetting of the original training data to the greatest extent, and learns new embeddings of profession names as gender equality prompts conditioned on the frozen model. Empirical results show that GEEP not only achieves state-of-the-art gender-debiasing performance on applications such as pronoun prediction and coreference resolution, but also matches the original pre-trained model on general downstream tasks such as GLUE, with little forgetting.
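To make the mechanism concrete, below is a minimal PyTorch sketch of the idea described above: every pre-trained parameter is frozen and only new embeddings for profession-name tokens are trained. The profession list, initialization, and the (omitted) debiasing objective are illustrative assumptions for exposition, not the authors' released implementation.

```python
import torch
import torch.nn as nn
from transformers import BertForMaskedLM, BertTokenizer

model = BertForMaskedLM.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Freeze every pre-trained parameter so the original knowledge is untouched.
for p in model.parameters():
    p.requires_grad = False

# Hypothetical profession names whose embeddings are re-learned as prompts.
professions = ["nurse", "engineer", "doctor", "teacher"]
prof_ids = tokenizer.convert_tokens_to_ids(professions)

# New trainable prompt embeddings, one per profession,
# initialized from the frozen word embeddings.
word_emb = model.get_input_embeddings()
prompt_emb = nn.Parameter(word_emb.weight[prof_ids].clone())
optimizer = torch.optim.Adam([prompt_emb], lr=1e-3)

def embed_with_prompts(input_ids):
    """Look up frozen embeddings, substituting the trainable prompt
    embedding wherever a profession token appears."""
    embeds = word_emb(input_ids)
    for k, pid in enumerate(prof_ids):
        mask = (input_ids == pid).unsqueeze(-1)
        embeds = torch.where(mask, prompt_emb[k].expand_as(embeds), embeds)
    return embeds

# One illustrative forward pass for pronoun prediction.
batch = tokenizer("[MASK] is a nurse.", return_tensors="pt")
inputs_embeds = embed_with_prompts(batch["input_ids"])
logits = model(inputs_embeds=inputs_embeds,
               attention_mask=batch["attention_mask"]).logits

mask_pos = (batch["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
he_id, she_id = tokenizer.convert_tokens_to_ids(["he", "she"])
probs = logits[0, mask_pos].softmax(-1)
# A debiasing loss (not shown) would push probs[he_id] and probs[she_id]
# toward each other, updating only prompt_emb via `optimizer`.
print(float(probs[he_id]), float(probs[she_id]))
```

Because only the prompt embeddings receive gradients, the pre-trained weights, and hence performance on general downstream tasks, are left intact, which is the source of GEEP's resistance to catastrophic forgetting.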
