Sustainable Modular Debiasing of Language Models

09/08/2021
by Anne Lauscher, et al.

Unfair stereotypical biases (e.g., gender, racial, or religious biases) encoded in modern pretrained language models (PLMs) have negative ethical implications for the widespread adoption of state-of-the-art language technology. To remedy this, a wide range of debiasing techniques have recently been introduced to remove such stereotypical biases from PLMs. Existing debiasing methods, however, directly modify all of the PLM's parameters, which – besides being computationally expensive – comes with the inherent risk of (catastrophic) forgetting of useful language knowledge acquired in pretraining. In this work, we propose a more sustainable, modular debiasing approach based on dedicated debiasing adapters, dubbed ADELE. Concretely, we (1) inject adapter modules into the original PLM layers and (2) update only the adapters (i.e., we keep the original PLM parameters frozen) via language modeling training on a counterfactually augmented corpus. We showcase ADELE in gender debiasing of BERT: our extensive evaluation, encompassing three intrinsic and two extrinsic bias measures, renders ADELE very effective in bias mitigation. We further show that – due to its modular nature – ADELE, coupled with task adapters, retains fairness even after large-scale downstream training. Finally, by means of multilingual BERT, we successfully transfer ADELE to six target languages.
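For illustration, here is a minimal sketch of the two-step recipe described in the abstract. The bottleneck adapter below is a generic Houlsby/Pfeiffer-style module, and the two library calls assume the AdapterHub adapter-transformers fork of Hugging Face Transformers; the adapter name "debias" and the reduction factor are illustrative assumptions, not the paper's exact configuration:

```python
# Sketch of ADELE-style debiasing adapters (illustrative, not the paper's code).
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Bottleneck adapter: down-projection, non-linearity, up-projection,
    with a residual connection around the whole block."""
    def __init__(self, hidden_size: int, reduction_factor: int = 16):
        super().__init__()
        bottleneck = hidden_size // reduction_factor
        self.down = nn.Linear(hidden_size, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, hidden_states):
        # The residual keeps the frozen PLM's representation intact when the
        # adapter's contribution is small.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# With the AdapterHub fork of Transformers, injecting and isolating the
# adapter takes two calls (the name "debias" is our choice):
from transformers import BertForMaskedLM

model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.add_adapter("debias")    # (1) inject adapter modules into the PLM layers
model.train_adapter("debias")  # (2) freeze the PLM; only "debias" is trainable
```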
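The counterfactually augmented corpus for step (2) can be produced with two-sided counterfactual data augmentation (CDA): every gendered term is swapped for its counterpart, and both the original and the swapped sentence are kept. A heavily simplified sketch follows; the word-pair list is a small illustrative sample, and ambiguous pronoun forms (him/his vs. her), casing, and named entities need handling that is omitted here:

```python
# Two-sided CDA, heavily simplified (ambiguous pronouns like "her", which can
# map to either "him" or "his", require POS-aware handling omitted here).
GENDER_PAIRS = [
    ("he", "she"), ("himself", "herself"),
    ("man", "woman"), ("men", "women"),
    ("boy", "girl"), ("father", "mother"), ("son", "daughter"),
]

# Bidirectional lookup: he -> she and she -> he, etc.
SWAP = {}
for a, b in GENDER_PAIRS:
    SWAP[a] = b
    SWAP[b] = a

def counterfactual(sentence: str) -> str:
    """Return the sentence with every listed gendered term swapped."""
    return " ".join(SWAP.get(tok.lower(), tok) for tok in sentence.split())

def augment(corpus: list[str]) -> list[str]:
    """Two-sided CDA: keep each original sentence and its counterfactual."""
    return [s for sent in corpus for s in (sent, counterfactual(sent))]

print(augment(["the man said he would come"]))
# ['the man said he would come', 'the woman said she would come']
```

Language modeling training on the resulting corpus then updates only the adapter parameters, so the debiasing signal never overwrites the frozen PLM weights.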


Related research

06/06/2023 · An Empirical Analysis of Parameter-Efficient Methods for Debiasing Pre-Trained Language Models
The increasingly large size of modern pretrained language models not onl...

10/11/2021 · Improving Gender Fairness of Pre-Trained Language Models without Catastrophic Forgetting
Although pre-trained language models, such as BERT, achieve state-of-art...

06/23/2022 · Towards WinoQueer: Developing a Benchmark for Anti-Queer Bias in Large Language Models
This paper presents exploratory work on whether and to what extent biase...

07/21/2022 · The Birth of Bias: A case study on the evolution of gender bias in an English language model
Detecting and mitigating harmful biases in modern language models are wi...

02/13/2023 · Parameter-efficient Modularised Bias Mitigation via AdapterFusion
Large pre-trained language models contain societal biases and carry alon...

06/27/2023 · Gender Bias in BERT – Measuring and Analysing Biases through Sentiment Rating in a Realistic Downstream Classification Task
Pretrained language models are publicly available and constantly finetun...

05/24/2020 · Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
Following the major success of neural language models (LMs) such as BERT...
