
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models

by Nikita Nangia, et al.

Pretrained language models, especially masked language models (MLMs), have seen success across many NLP tasks. However, there is ample evidence that they absorb the cultural biases undoubtedly present in the corpora they are trained on, implicitly creating harm through biased representations. To measure some forms of social bias in language models against protected demographic groups in the US, we introduce the Crowdsourced Stereotype Pairs benchmark (CrowS-Pairs). CrowS-Pairs has 1,508 examples that cover stereotypes dealing with nine types of bias, such as race, religion, and age. In CrowS-Pairs, a model is presented with two sentences: one that is more stereotyping and another that is less stereotyping. The data focuses on stereotypes about historically disadvantaged groups and contrasts them with advantaged groups. We find that all three of the widely used MLMs we evaluate substantially favor sentences that express stereotypes in every category in CrowS-Pairs. As work on building less biased models advances, this dataset can be used as a benchmark to evaluate progress.
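The evaluation protocol described above can be sketched as a simple comparison loop: for each pair, the model scores the more-stereotyping and less-stereotyping sentence, and the headline metric is the percentage of pairs where the stereotyping sentence scores higher (an unbiased model would land near 50%). The sketch below is illustrative only: `score` is a hypothetical stand-in for the MLM pseudo-log-likelihood scorer (which, per the paper's setup, masks the tokens shared between the two sentences one at a time and sums the log-probabilities of the held-out tokens), and the example pair is invented, not drawn from the dataset.

```python
def score(sentence: str) -> float:
    """Hypothetical stand-in for an MLM pseudo-log-likelihood scorer.

    A real implementation would mask each token shared by the two sentences
    in turn and sum the MLM's log-probabilities for the held-out tokens.
    This toy version just returns a number so the example runs end to end.
    """
    return -float(len(sentence))


def bias_metric(pairs) -> float:
    """Percentage of pairs where the more-stereotyping sentence scores higher.

    Each pair is (more_stereotyping, less_stereotyping); 50% is the
    unbiased baseline.
    """
    wins = sum(score(more) > score(less) for more, less in pairs)
    return 100.0 * wins / len(pairs)


# Invented placeholder pair in the benchmark's two-sentence format.
pairs = [
    ("a short sentence", "a somewhat longer paired sentence"),
]
print(f"{bias_metric(pairs):.1f}% of pairs favor the stereotyping sentence")
```

With a real scorer plugged in, running this loop over all 1,508 pairs reproduces the kind of per-category percentages the paper reports.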



