Hi, my name is Martha: Using names to measure and mitigate bias in generative dialogue models

09/07/2021
by Eric Michael Smith, et al.

All AI models are susceptible to learning biases from the data that they are trained on. For generative dialogue models, training on real human conversations that contain unbalanced gender and race/ethnicity references can produce models that display learned biases, which we define here broadly as any measurable differences in the distributions of words or semantic content of conversations based on demographic groups. We measure the strength of such biases by producing artificial conversations between two copies of a dialogue model, conditioning one conversational partner to state a name commonly associated with a certain gender and/or race/ethnicity. We find that larger-capacity models tend to exhibit more gender bias and greater stereotyping of occupations by gender. We show that several methods of tuning these dialogue models, specifically name scrambling, controlled generation, and unlikelihood training, are effective in reducing bias in conversation, including on a downstream conversational task. Name scrambling is also effective at lowering differences in token usage across conversations in which partners have names associated with different genders or races/ethnicities.
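To make the measurement setup concrete, here is a minimal sketch of name-conditioned self-chat followed by a comparison of token distributions across name groups. It assumes a BlenderBot-style model loaded through Hugging Face transformers; the checkpoint, the flat joining of turns into one context string, the tiny name lists, and the Jensen-Shannon statistic are all illustrative assumptions rather than the paper's released code.

```python
import math
from collections import Counter

from transformers import BlenderbotForConditionalGeneration, BlenderbotTokenizer

# Any generative dialogue model works here; this checkpoint is an assumption.
MODEL = "facebook/blenderbot-400M-distill"
tokenizer = BlenderbotTokenizer.from_pretrained(MODEL)
model = BlenderbotForConditionalGeneration.from_pretrained(MODEL)

# Illustrative name lists; the paper uses names statistically associated
# with particular genders and races/ethnicities.
NAME_GROUPS = {"group_a": ["Martha", "Emily"], "group_b": ["James", "Robert"]}

def reply(history):
    """Generate one turn, conditioned on the conversation so far."""
    context = "  ".join(history)  # simplified flat context, truncated to fit
    inputs = tokenizer(context, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=40)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0].strip()

def self_chat(name, turns=6):
    """Two copies of the model converse; one partner opens by stating a name."""
    history = [f"Hi, my name is {name}."]
    for _ in range(turns):
        history.append(reply(history))
    return history

def unigram_dist(conversations):
    """Empirical unigram distribution over all turns of all conversations."""
    counts = Counter(tok.lower()
                     for conv in conversations for turn in conv for tok in turn.split())
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}

def js_divergence(p, q):
    """Jensen-Shannon divergence between two unigram distributions."""
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in set(p) | set(q)}
    def kl(a):
        return sum(pa * math.log2(pa / m[t]) for t, pa in a.items() if pa > 0)
    return 0.5 * kl(p) + 0.5 * kl(q)

dists = {group: unigram_dist([self_chat(name) for name in names])
         for group, names in NAME_GROUPS.items()}
print("token-distribution divergence:", js_divergence(*dists.values()))
```

Of the mitigations the abstract lists, name scrambling is the simplest to sketch: names in the fine-tuning conversations are replaced with names drawn uniformly at random, severing the statistical link between a name and the content that follows it. The name pool and regex transform below are hypothetical stand-ins for the paper's fine-tuning pipeline.

```python
import random
import re

# Hypothetical name pool; the paper scrambles over its full set of
# gender- and race/ethnicity-associated names.
NAMES = ["Martha", "Emily", "James", "Robert"]
NAME_RE = re.compile(r"\b(" + "|".join(map(re.escape, NAMES)) + r")\b")

def scramble_names(utterance, rng=random):
    """Replace every recognized name with one drawn uniformly at random."""
    return NAME_RE.sub(lambda _: rng.choice(NAMES), utterance)

print(scramble_names("Hi, my name is Martha."))  # e.g. "Hi, my name is James."
```

A model fine-tuned on scrambled data can no longer associate any particular name with particular topics or styles, which matches the abstract's observation that name scrambling lowers token-usage differences across name groups.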

research
12/20/2022

Debiasing NLP Models Without Demographic Information

Models trained from real-world data tend to imitate and amplify social biases...
research
11/10/2019

Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation

Models often easily learn biases present in the training data, and their...
research
07/14/2023

Mitigating Bias in Conversations: A Hate Speech Classifier and Debiaser with Prompts

Discriminatory language and biases are often present in hate speech during...
research
06/07/2021

RedditBias: A Real-World Resource for Bias Evaluation and Debiasing of Conversational Language Models

Text representation models are prone to exhibit a range of societal biases...
research
06/04/2021

Towards Equal Gender Representation in the Annotations of Toxic Language Detection

Classifiers tend to propagate biases present in the data on which they are...
research
01/23/2023

The Reasonable Effectiveness of Diverse Evaluation Data

In this paper, we present findings from a semi-experimental exploration...
research
12/18/2020

Small Business Classification By Name: Addressing Gender and Geographic Origin Biases

Small business classification is a difficult and important task within many...
