Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation

05/03/2020
by Tianlu Wang, et al.
Word embeddings derived from human-generated corpora inherit strong gender bias, which can be further amplified by downstream models. Some commonly adopted debiasing approaches, including the seminal Hard Debias algorithm, apply post-processing procedures that project pre-trained word embeddings into a subspace orthogonal to an inferred gender subspace. We discover that semantic-agnostic corpus regularities captured by the word embeddings, such as word frequency, negatively impact the performance of these algorithms. We propose a simple but effective technique, Double Hard Debias, which purifies the word embeddings against such corpus regularities before inferring and removing the gender subspace. Experiments on three bias mitigation benchmarks show that our approach preserves the distributional semantics of the pre-trained word embeddings while reducing gender bias to a significantly larger degree than prior approaches.
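
The two-stage idea in the abstract can be illustrated with a small NumPy sketch. This is a simplified reading, not the authors' implementation: the function names, the `freq_component_index` parameter, and the choice to infer the gender direction from the top singular vector of pair differences are all assumptions made for illustration. In particular, the abstract does not say how the frequency-related direction is chosen, so the sketch simply removes one dominant principal component.

```python
import numpy as np


def remove_direction(vectors, direction):
    """Project each row of `vectors` onto the subspace orthogonal to `direction`."""
    unit = direction / np.linalg.norm(direction)
    return vectors - np.outer(vectors @ unit, unit)


def top_principal_components(vectors, k):
    """Return the top-k principal directions (as rows) of the mean-centered vectors."""
    centered = vectors - vectors.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]


def double_projection_sketch(embeddings, pairs, freq_component_index=0):
    """Hypothetical two-stage projection.

    embeddings: (n_words, dim) array of pre-trained vectors.
    pairs: list of (row_a, row_b) index pairs, e.g. rows for "he" and "she".
    freq_component_index: ASSUMED knob selecting which principal component
        stands in for the frequency-related direction.
    """
    # Stage 1 (assumption): remove one dominant principal component that is
    # taken to encode a corpus regularity such as word frequency.
    components = top_principal_components(embeddings, freq_component_index + 1)
    purified = remove_direction(embeddings, components[freq_component_index])

    # Stage 2: infer a single gender direction from the pair differences,
    # then project it out, in the spirit of Hard Debias.
    diffs = np.array([purified[a] - purified[b] for a, b in pairs])
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    gender_direction = vt[0]
    return remove_direction(purified, gender_direction), gender_direction


# Toy usage with random vectors; real use would load pre-trained embeddings.
rng = np.random.default_rng(0)
toy_embeddings = rng.normal(size=(50, 8))
toy_pairs = [(0, 1), (2, 3)]  # hypothetical indices of definitional word pairs
debiased, gender_direction = double_projection_sketch(toy_embeddings, toy_pairs)
```

After the second projection, every row of `debiased` has a numerically zero component along `gender_direction`, which is the property the orthogonal-projection step is meant to guarantee.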

research
04/14/2021

[RE] Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation

Despite widespread use in natural language processing (NLP) tasks, word ...
research
09/20/2020

Exploring the Linear Subspace Hypothesis in Gender Bias Mitigation

Bolukbasi et al. (2016) presents one of the first gender bias mitigation...
research
11/20/2022

Conceptor-Aided Debiasing of Contextualized Embeddings

Pre-trained language models reflect the inherent social biases of their ...
research
08/02/2022

Gender bias in (non)-contextual clinical word embeddings for stereotypical medical categories

Clinical word embeddings are extensively used in various Bio-NLP problem...
research
09/10/2020

Investigating Gender Bias in BERT

Contextual language models (CLMs) have pushed the NLP benchmarks to a ne...
research
04/07/2020

Neutralizing Gender Bias in Word Embedding with Latent Disentanglement and Counterfactual Generation

Recent researches demonstrate that word embeddings, trained on the human...
research
06/20/2020

MDR Cluster-Debias: A Nonlinear WordEmbedding Debiasing Pipeline

Existing methods for debiasing word embeddings often do so only superfic...
