SODAPOP: Open-Ended Discovery of Social Biases in Social Commonsense Reasoning Models

10/13/2022
by   Haozhe An, et al.

A common limitation of diagnostic tests for detecting social biases in NLP models is that they may only detect stereotypic associations that are pre-specified by the designer of the test. Since enumerating all possible problematic associations is infeasible, such tests are likely to miss biases that are present in a model but were not anticipated by the designer. To address this limitation, we propose SODAPOP (SOcial bias Discovery from Answers about PeOPle) for social commonsense question answering. Our pipeline generates modified instances from the Social IQa dataset (Sap et al., 2019) by (1) substituting names associated with different demographic groups, and (2) generating many distractor answers with a masked language model. By using a social commonsense model to score the generated distractors, we uncover the model's stereotypic associations between demographic groups and an open set of words. We also run SODAPOP on debiased models and show the limitations of multiple state-of-the-art debiasing algorithms.
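The pipeline sketched in the abstract — substitute names from different demographic groups into a template, score a shared set of distractor answers, and compare the score distributions across groups — can be illustrated with a minimal toy probe. This is a hedged sketch, not the authors' implementation: the names, the template, and the deterministic word-overlap scorer are all hypothetical stand-ins (the real pipeline uses Social IQa instances, a masked language model to generate distractors, and a trained social commonsense QA model to score them).

```python
import string

def substitute_name(template: str, name: str) -> str:
    """Step (1): instantiate a Social IQa-style context with a given name."""
    return template.replace("[NAME]", name)

def toy_scorer(context: str, answer: str) -> float:
    """Stand-in for a social commonsense QA model's answer score.
    Here: a deterministic toy that favors answers sharing words with the
    context. A real probe would query the model under test instead."""
    strip = str.maketrans("", "", string.punctuation)
    ctx_words = set(context.lower().translate(strip).split())
    ans_words = set(answer.lower().translate(strip).split())
    return len(ctx_words & ans_words) / max(len(ans_words), 1)

def stereotype_gap(template, names_a, names_b, distractors):
    """Step (3): for each distractor, compare mean scores across the two
    name groups; large gaps flag candidate stereotypic associations.
    (In the paper, step (2) generates the distractors with a masked LM;
    here they are a fixed list.)"""
    def mean_score(names, d):
        return sum(toy_scorer(substitute_name(template, n), d)
                   for n in names) / len(names)
    return {d: mean_score(names_a, d) - mean_score(names_b, d)
            for d in distractors}

template = "[NAME] went to the interview. What will [NAME] do next?"
gaps = stereotype_gap(
    template,
    names_a=["Emily", "Anne"],     # hypothetical group-A names
    names_b=["Lakisha", "Jamal"],  # hypothetical group-B names
    distractors=["prepare for the interview", "go home"],
)
# The toy scorer ignores names entirely, so every gap is exactly 0.0;
# a model with stereotypic associations would produce nonzero gaps.
```

Because the toy scorer is name-blind, the probe correctly reports no gap; substituting a real QA model for `toy_scorer` is what turns this skeleton into a bias-discovery tool.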


