Don't Forget About Pronouns: Removing Gender Bias in Language Models Without Losing Factual Gender Information

by Tomasz Limisiewicz, et al.

The representations in large language models contain multiple types of gender information. We focus on two types of such signals in English texts: factual gender information, which is a grammatical or semantic property, and gender bias, which is the correlation between a word and a specific gender. We can disentangle the model's embeddings and identify components encoding both types of information with probing. We aim to diminish the stereotypical bias in the representations while preserving the factual gender signal. Our filtering method shows that it is possible to decrease the bias of gender-neutral profession names without significant deterioration of language modeling capabilities. The findings can be applied to language generation to mitigate reliance on stereotypes while preserving gender agreement in coreferences.
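The idea of filtering one component out of the embeddings while leaving another intact can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the bias and factual gender signals each lie along a single known direction and that the two directions are orthogonal, and the names `remove_direction`, `bias_dir`, and `factual_dir` are hypothetical. In practice such directions would have to be estimated, for example with trained probes.

```python
import numpy as np


def remove_direction(embeddings, direction):
    """Project each row of `embeddings` onto the subspace orthogonal to `direction`."""
    unit = direction / np.linalg.norm(direction)
    # Subtract every row's component along the unit direction.
    return embeddings - np.outer(embeddings @ unit, unit)


rng = np.random.default_rng(0)
dim = 8

# Assumption: two orthogonal directions standing in for probe-derived components.
bias_dir = np.zeros(dim)
bias_dir[0] = 1.0
factual_dir = np.zeros(dim)
factual_dir[1] = 1.0

# Random vectors standing in for contextual embeddings of profession words.
embeddings = rng.normal(size=(5, dim))
filtered = remove_direction(embeddings, bias_dir)

bias_after = filtered @ bias_dir            # component along the bias direction
factual_before = embeddings @ factual_dir   # factual component before filtering
factual_after = filtered @ factual_dir      # factual component after filtering
```

Because the two directions are orthogonal in this toy setup, the projection removes the bias component and leaves the factual component unchanged. If the directions overlapped, removing one would also attenuate the other.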



Efficient Gender Debiasing of Pre-trained Indic Language Models

The gender bias present in the data on which language models are pre-tra...

Gendered Language in Resumes and its Implications for Algorithmic Bias in Hiring

Despite growing concerns around gender bias in NLP models used in algori...

Attenuating Bias in Word Vectors

Word vector representations are well developed tools for various NLP and...

On the Unintended Social Bias of Training Language Generation Models with Data from Local Media

There are concerns that neural language models may preserve some of the ...

Multi-Dimensional Gender Bias Classification

Machine learning models are trained to find patterns in data. NLP models...

Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting

We present a large-scale study of gender bias in occupation classificati...

Intersectional Bias in Causal Language Models

To examine whether intersectional bias can be observed in language gener...