
What do Bias Measures Measure?

by Sunipa Dev, et al.

Natural Language Processing (NLP) models propagate social biases about protected attributes such as gender, race, and nationality. To create interventions and mitigate these biases and associated harms, it is vital to be able to detect and measure such biases. While many existing works propose bias evaluation methodologies for different tasks, there remains a need to cohesively understand what biases and normative harms each of these measures captures and how different measures compare. To address this gap, this work presents a comprehensive survey of existing bias measures in NLP as a function of the associated NLP tasks, metrics, datasets, and social biases and corresponding harms. This survey also organizes metrics into different categories to present advantages and disadvantages. Finally, we propose a documentation standard for bias measures to aid their development, categorization, and appropriate usage.




A Survey on Gender Bias in Natural Language Processing

Language can be used as a means of reproducing and enforcing harmful ste...

Evaluation Evaluation a Monte Carlo study

Over the last decade there has been increasing concern about the biases ...

Evaluating Debiasing Techniques for Intersectional Biases

Bias is pervasive in NLP models, motivating the development of automatic...

A Survey of Race, Racism, and Anti-Racism in NLP

Despite inextricable ties between race and language, little work has con...

Measuring Model Biases in the Absence of Ground Truth

Recent advances in computer vision have led to the development of image ...

Undesirable biases in NLP: Averting a crisis of measurement

As Natural Language Processing (NLP) technology rapidly develops and spr...

Evaluating Metrics for Bias in Word Embeddings

Over the last years, word and sentence embeddings have established as te...