Measuring Gender Bias in Word Embeddings of Gendered Languages Requires Disentangling Grammatical Gender Signals

06/03/2022
by Shiva Omrani Sabbaghi, et al.

Does the grammatical gender of a language interfere when measuring the semantic gender information captured by its word embeddings? A number of anomalous gender bias measurements in the embeddings of gendered languages suggest this possibility. We demonstrate that word embeddings learn the association between a noun and its grammatical gender in grammatically gendered languages, which can skew social gender bias measurements. Consequently, word embedding post-processing methods are introduced to quantify, disentangle, and evaluate grammatical gender signals. The evaluation is performed on five gendered languages from the Germanic, Romance, and Slavic branches of the Indo-European language family. Our method reduces the strength of grammatical gender signals, which is measured in terms of effect size (Cohen's d), by a significant average of d = 1.3 for French, German, and Italian, and d = 0.56 for Polish and Spanish. Once grammatical gender is disentangled, the association between over 90% of 10,000 inanimate nouns and their assigned grammatical gender weakens, and cross-lingual bias results from the Word Embedding Association Test (WEAT) become more congruent with country-level implicit bias measurements. The results further suggest that disentangling grammatical gender signals from word embeddings may lead to improvement in semantic machine learning tasks.
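
The abstract reports signal strength as a Cohen's d effect size and refers to WEAT without restating how it is computed. As a rough illustration only, the sketch below follows the commonly used WEAT effect-size definition: the difference between the mean associations of two target word sets, divided by the standard deviation of the associations over both sets. The function names, the toy 2-dimensional vectors, and the choice of the sample standard deviation (`ddof=1`) are assumptions made for this example. This is not the authors' code, and it does not implement their disentangling method.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def association(w, A, B):
    """s(w, A, B): mean similarity of w to attribute set A minus that to B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])


def weat_effect_size(X, Y, A, B):
    """Cohen's-d-style effect size for target sets X, Y and attribute sets A, B.

    Assumption: the denominator is the sample standard deviation (ddof=1)
    of the associations over the union of X and Y.
    """
    s_X = np.array([association(x, A, B) for x in X])
    s_Y = np.array([association(y, A, B) for y in Y])
    pooled = np.concatenate([s_X, s_Y])
    return float((s_X.mean() - s_Y.mean()) / pooled.std(ddof=1))


# Toy vectors (hypothetical, not real embeddings): X leans toward A, Y toward B.
A = np.array([[1.0, 0.0]])
B = np.array([[0.0, 1.0]])
X = np.array([[1.0, 0.1], [1.0, 0.2]])
Y = np.array([[0.1, 1.0], [0.2, 1.0]])

d = weat_effect_size(X, Y, A, B)
print(f"effect size d = {d:.3f}")
```

With real embeddings, X and Y would hold the vectors of the target words (for example, nouns grouped by grammatical gender) and A and B the vectors of the attribute words. A larger absolute d indicates a stronger association.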

