Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models

04/15/2021
by Karolina Stańczak, et al.

While the prevalence of large pre-trained language models has led to significant improvements in the performance of NLP systems, recent research has demonstrated that these models inherit societal biases extant in natural language. In this paper, we explore a simple method to probe pre-trained language models for gender bias, which we use to conduct a multilingual study of gender bias towards politicians. We construct a dataset of 250k politicians from most countries in the world and quantify adjective and verb usage around those politicians' names as a function of their gender. We conduct our study in 7 languages across 6 different language modeling architectures. Our results demonstrate that stance towards politicians in pre-trained language models is highly dependent on the language used. Finally, contrary to previous findings, our study suggests that larger language models do not tend to be significantly more gender-biased than smaller ones.

09/08/2022

Efficient Gender Debiasing of Pre-trained Indic Language Models

The gender bias present in the data on which language models are pre-tra...
04/18/2021

Worst of Both Worlds: Biases Compound in Pre-trained Vision-and-Language Models

Numerous works have analyzed biases in vision and pre-trained language m...
07/10/2022

FairDistillation: Mitigating Stereotyping in Language Models

Large pre-trained language models are successfully being used in a varie...
03/26/2022

Metaphors in Pre-Trained Language Models: Probing and Generalization Across Datasets and Languages

Human languages are full of metaphorical expressions. Metaphors help peo...
08/08/2022

Debiased Large Language Models Still Associate Muslims with Uniquely Violent Acts

Recent work demonstrates a bias in the GPT-3 model towards generating vi...
10/16/2021

An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-Trained Language Models

Recent work has shown that pre-trained language models capture social bi...