Measuring Bias in Contextualized Word Representations

06/18/2019
by Keita Kurita, et al.

Contextual word embeddings such as BERT have achieved state-of-the-art performance in numerous NLP tasks. Because they are optimized to capture the statistical properties of their training data, they also tend to pick up and amplify social stereotypes present in that data. In this study, we (1) propose a template-based method to quantify bias in BERT; (2) show that this method captures social biases more consistently than the traditional cosine-based method; and (3) conduct a case study evaluating gender bias in a downstream task, Gender Pronoun Resolution. Although our case study focuses on gender bias, the proposed technique generalizes to other biases, such as racial and religious biases, including in multiclass settings.
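
The abstract does not spell out how either score is computed, so the following is only a minimal sketch under stated assumptions. It assumes the template-based score compares the probability a masked language model assigns to a target word in a filled template (for example "[MASK] is a programmer") against a prior obtained with the attribute masked as well, and that bias is the difference of this log ratio between two target words. The function names and all numbers are hypothetical illustrations, not values or code from the study. A plain cosine similarity is included for contrast with the cosine-based approach.

```python
import math


def log_prob_ratio(p_target: float, p_prior: float) -> float:
    """Log of how much the filled template raises a word's probability over its prior.

    Both inputs are assumed to come from a masked language model; this sketch
    takes them as plain numbers so it runs without any model.
    """
    return math.log(p_target / p_prior)


def template_bias(p_tgt_a: float, p_prior_a: float,
                  p_tgt_b: float, p_prior_b: float) -> float:
    """Difference of log ratios for two target words (e.g. "he" vs. "she").

    Positive values mean the template favors word A more than word B,
    relative to their priors. This definition is an assumption of the sketch.
    """
    return log_prob_ratio(p_tgt_a, p_prior_a) - log_prob_ratio(p_tgt_b, p_prior_b)


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


# Made-up probabilities, purely illustrative (not results from the study).
bias = template_bias(p_tgt_a=0.4, p_prior_a=0.2, p_tgt_b=0.1, p_prior_b=0.2)
print(f"template-based bias score: {bias:.3f}")
print(f"cosine of identical vectors: {cosine([1.0, 0.0], [1.0, 0.0]):.1f}")
```

In practice the probabilities would be read from the model's masked-token predictions for each template, while the cosine variant would compare embedding vectors directly.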


