Impact of Gender Debiased Word Embeddings in Language Modeling

05/03/2021
by Christine Basta, et al.

Gender, race and social biases have recently been identified as clear examples of unfairness in applications of Natural Language Processing. A key path towards fairness is to understand, analyse and interpret our data and algorithms. Recent studies have shown that the human-generated data used in training is an apparent source of bias. In addition, current algorithms have been shown to amplify biases present in the data. To further address these concerns, in this paper we study how a state-of-the-art recurrent neural language model behaves when trained on data that under-represents females, using pre-trained standard and debiased word embeddings. Results show that language models trained on unbalanced data inherit higher bias when using pre-trained embeddings than when using embeddings trained within the task. Moreover, results show that, on the same data, language models inherit lower bias when using debiased pre-trained embeddings than when using standard pre-trained embeddings.


