Neutralizing Gender Bias in Word Embedding with Latent Disentanglement and Counterfactual Generation

04/07/2020
by Seungjae Shin, et al.

Recent research demonstrates that word embeddings trained on human-generated corpora exhibit strong gender biases in the embedding space, and these biases can produce prejudiced results in downstream tasks, e.g., sentiment analysis. Whereas previous debiasing models project word embeddings onto a linear subspace, we introduce a latent disentanglement model with a siamese auto-encoder structure and a gradient reversal layer. Our siamese auto-encoder uses gender word pairs to disentangle the semantic and gender information of a given word, and the associated gradient reversal layer provides a negative gradient that separates the semantics from the gender. We then introduce a counterfactual generation model that modifies the gender information of words, so that the original and modified embeddings can be geometrically aligned to produce a gender-neutralized word embedding without loss of semantic information. Experimental results quantitatively and qualitatively indicate that the proposed method better debiases word embeddings and minimizes the loss of semantic information in NLP downstream tasks.
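To make the disentanglement idea concrete, here is a minimal illustrative sketch in PyTorch of a gradient reversal layer and a siamese auto-encoder that splits a word embedding into semantic and gender latents. The class names, dimensions, and loss weighting are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (assumed PyTorch implementation, not the paper's code).
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates (and scales) the gradient on the
    backward pass, so the semantic latent is pushed to be uninformative about gender."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)


class DisentanglingAutoEncoder(nn.Module):
    """Encodes an embedding into a semantic part and a gender part and
    reconstructs the embedding from their concatenation."""
    def __init__(self, emb_dim=300, sem_dim=200, gen_dim=100):
        super().__init__()
        self.enc_sem = nn.Sequential(nn.Linear(emb_dim, sem_dim), nn.ReLU(),
                                     nn.Linear(sem_dim, sem_dim))
        self.enc_gen = nn.Sequential(nn.Linear(emb_dim, gen_dim), nn.ReLU(),
                                     nn.Linear(gen_dim, gen_dim))
        self.decoder = nn.Linear(sem_dim + gen_dim, emb_dim)
        # Adversarial head: predicts gender from the semantic latent; the
        # reversed gradient drives gender information out of that latent.
        self.gender_clf = nn.Linear(sem_dim, 2)

    def forward(self, emb, lambd=1.0):
        z_sem = self.enc_sem(emb)
        z_gen = self.enc_gen(emb)
        recon = self.decoder(torch.cat([z_sem, z_gen], dim=-1))
        adv_logits = self.gender_clf(grad_reverse(z_sem, lambd))
        return recon, z_sem, z_gen, adv_logits


# Siamese usage on a gender word pair such as ("he", "she"): both embeddings
# pass through the same encoder; reconstruction preserves semantics, the pair
# loss ties the two semantic latents, and the adversarial loss (through the
# gradient reversal) removes gender from the semantic latent.
model = DisentanglingAutoEncoder()
he_emb, she_emb = torch.randn(1, 300), torch.randn(1, 300)  # placeholder vectors
recon_he, z_sem_he, _, adv_he = model(he_emb)
recon_she, z_sem_she, _, adv_she = model(she_emb)
mse, ce = nn.MSELoss(), nn.CrossEntropyLoss()
loss = (mse(recon_he, he_emb) + mse(recon_she, she_emb)   # reconstruction
        + mse(z_sem_he, z_sem_she)                        # shared semantics of the pair
        + ce(adv_he, torch.tensor([0]))                   # adversarial gender loss
        + ce(adv_she, torch.tensor([1])))
loss.backward()
```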

Related research

11/25/2019 · A Causal Inference Method for Reducing Gender Bias in Word Embedding Relations
Word embedding has become essential for natural language processing as i...

09/10/2020 · Investigating Gender Bias in BERT
Contextual language models (CLMs) have pushed the NLP benchmarks to a ne...

12/09/2021 · Word Embeddings via Causal Inference: Gender Bias Reducing and Semantic Information Preserving
With widening deployments of natural language processing (NLP) in daily ...

08/29/2018 · Learning Gender-Neutral Word Embeddings
Word embedding models have become a fundamental component in a wide rang...

05/03/2020 · Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation
Word embeddings derived from human-generated corpora inherit strong gend...

06/20/2016 · Quantifying and Reducing Stereotypes in Word Embeddings
Machine learning algorithms are optimized to model statistical propertie...

09/13/2019 · A General Framework for Implicit and Explicit Debiasing of Distributional Word Vector Spaces
Distributional word vectors have recently been shown to encode many of t...
