DeepAI AI Chat
Log In Sign Up

FairDistillation: Mitigating Stereotyping in Language Models

by   Pieter Delobelle, et al.

Large pre-trained language models are successfully being used in a variety of tasks, across many languages. With this ever-increasing usage, the risk of harmful side effects also rises, for example by reproducing and reinforcing stereotypes. However, detecting and mitigating these harms is difficult to do in general and becomes computationally expensive when tackling multiple languages or when considering different biases. To address this, we present FairDistillation: a cross-lingual method based on knowledge distillation to construct smaller language models while controlling for specific biases. We found that our distillation method does not negatively affect the downstream performance on most tasks and successfully mitigates stereotyping and representational harms. We demonstrate that FairDistillation can create fairer language models at a considerably lower cost than alternative approaches.


page 1

page 2

page 3

page 4


Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models

While the prevalence of large pre-trained language models has led to sig...

Do Multi-Lingual Pre-trained Language Models Reveal Consistent Token Attributions in Different Languages?

During the past several years, a surge of multi-lingual Pre-trained Lang...

Understanding Cross-Lingual Syntactic Transfer in Multilingual Recurrent Neural Networks

It is now established that modern neural language models can be successf...

Discovering Language-neutral Sub-networks in Multilingual Language Models

Multilingual pre-trained language models perform remarkably well on cros...

Metaphors in Pre-Trained Language Models: Probing and Generalization Across Datasets and Languages

Human languages are full of metaphorical expressions. Metaphors help peo...

Scalable Syntax-Aware Language Models Using Knowledge Distillation

Prior work has shown that, on small amounts of training data, syntactic ...