A Source-Criticism Debiasing Method for GloVe Embeddings

06/25/2021
by Hope McGovern, et al.

It is well-documented that word embeddings trained on large public corpora consistently exhibit known human social biases. Although many methods for debiasing exist, almost all fixate on completely eliminating biased information from the embeddings and often diminish training set size in the process. In this paper, we present a simple yet effective method for debiasing GloVe word embeddings (Pennington et al., 2014) which works by incorporating explicit information about training set bias rather than removing biased data outright. Our method runs quickly and efficiently with the help of a fast bias gradient approximation method from Brunet et al. (2019). As our approach is akin to the notion of 'source criticism' in the humanities, we term our method Source-Critical GloVe (SC-GloVe). We show that SC-GloVe reduces the effect size on Word Embedding Association Test (WEAT) sets without sacrificing training data or TOP-1 performance.
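
For readers unfamiliar with the evaluation metric named above, the following is a minimal sketch of how a WEAT effect size is commonly computed, following the standard definition of the test (difference of mean target-attribute associations, divided by the standard deviation over all target words). It is not the authors' SC-GloVe code. The function names, the toy 2-dimensional vectors, and the choice of sample standard deviation (`ddof=1`) are assumptions made for illustration only.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    """s(w, A, B): mean similarity of w to attribute set A minus that to B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """Effect size for target sets X, Y and attribute sets A, B.

    Assumption: the denominator uses the sample standard deviation
    (ddof=1); some implementations use the population form instead.
    """
    s_X = np.array([association(x, A, B) for x in X])
    s_Y = np.array([association(y, A, B) for y in Y])
    pooled = np.concatenate([s_X, s_Y])
    return float((s_X.mean() - s_Y.mean()) / pooled.std(ddof=1))

# Hypothetical toy embeddings, not taken from any trained GloVe model.
A = [np.array([1.0, 0.0]), np.array([0.9, 0.1])]   # attribute set A
B = [np.array([0.0, 1.0]), np.array([0.1, 0.9])]   # attribute set B
X = [np.array([0.8, 0.2]), np.array([0.7, 0.3])]   # targets leaning toward A
Y = [np.array([0.2, 0.8]), np.array([0.3, 0.7])]   # targets leaning toward B

effect = weat_effect_size(X, Y, A, B)
print(f"WEAT effect size: {effect:.3f}")
```

Under this definition, an effect size closer to zero indicates a weaker measured association between the target and attribute sets, which is the direction a debiasing method aims for.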


