Quantifying and Reducing Stereotypes in Word Embeddings

06/20/2016
by   Tolga Bolukbasi, et al.

Machine learning algorithms are optimized to model statistical properties of their training data. If the input data reflect the stereotypes and biases of the broader society, then the output of the learning algorithm captures these stereotypes as well. In this paper, we initiate the study of gender stereotypes in word embeddings, a popular framework for representing text data. As embeddings become increasingly common, applications built on them can inadvertently amplify unwanted stereotypes. We show across multiple datasets that the embeddings contain significant gender stereotypes, especially with regard to professions. We create a novel gender analogy task and combine it with crowdsourcing to systematically quantify the gender bias in a given embedding. We develop an efficient algorithm that reduces gender stereotypes using just a handful of training examples while preserving the useful geometric properties of the embedding, and we evaluate it on several metrics. While we focus on male/female stereotypes, our framework may be applicable to other types of embedding bias.
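The core idea behind measuring and reducing a gender component in an embedding can be illustrated with a minimal sketch: score a word by its signed projection onto a "he minus she" direction, and debias by removing that component. The toy vectors and helper names below are hypothetical illustrations, not the paper's actual data or algorithm, which uses trained embeddings and a more careful procedure.

```python
import numpy as np

# Toy 4-d vectors standing in for a trained embedding (illustrative
# values only; real experiments would use an embedding such as word2vec).
emb = {
    "he":       np.array([ 1.0, 0.2, 0.0, 0.1]),
    "she":      np.array([-1.0, 0.2, 0.0, 0.1]),
    "nurse":    np.array([-0.6, 0.5, 0.3, 0.0]),
    "engineer": np.array([ 0.7, 0.4, 0.2, 0.1]),
}

# Gender direction: normalized difference of a definitional word pair.
g = emb["he"] - emb["she"]
g /= np.linalg.norm(g)

def bias(v, g):
    """Signed projection of a normalized word vector onto the gender
    direction: positive leans 'he', negative leans 'she'."""
    return float(v @ g) / np.linalg.norm(v)

def neutralize(v, g):
    """Remove the gender component from a vector: v - (v . g) g."""
    return v - (v @ g) * g

print(bias(emb["nurse"], g))                 # < 0 in this toy data
print(bias(emb["engineer"], g))              # > 0
print(bias(neutralize(emb["nurse"], g), g))  # ~ 0 after neutralizing
```

Because neutralization only subtracts the component along one direction, distances within the orthogonal subspace are untouched, which is one intuition for how debiasing can preserve useful geometric structure.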


