Nichelle and Nancy: The Influence of Demographic Attributes and Tokenization Length on First Name Biases

05/26/2023
by Haozhe An, et al.
Through first name substitution experiments, prior research has shown that social commonsense reasoning models systematically exhibit social biases along the dimensions of race, ethnicity, and gender (An et al., 2023). Demographic attributes of first names, however, are strongly correlated with corpus frequency and tokenization length, which may influence model behavior independently of, or in addition to, demographic factors. In this paper, we conduct a new series of first name substitution experiments that measures the influence of each of these factors while controlling for the others. We find that the demographic attributes of a name (race, ethnicity, and gender) and its tokenization length both systematically affect the behavior of social commonsense reasoning models.
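To make the setup concrete, the following is a minimal sketch of a first name substitution step paired with a tokenization-length grouping. It is not the authors' code: the `[NAME]` placeholder, the toy vocabulary, and the helper names `toy_tokenize`, `substitute`, and `group_by_token_length` are all assumptions made for illustration, and a real experiment would use the tokenizer of the model under study.

```python
from collections import defaultdict

# Hypothetical subword vocabulary, chosen only so the example is self-contained.
# A real study would query the tokenizer of the model being evaluated.
TOY_VOCAB = {"nancy", "nich", "elle"}


def toy_tokenize(word, vocab=TOY_VOCAB):
    """Split a word into subword pieces by greedy longest-match lookup."""
    word = word.lower()
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate span until it matches a vocabulary entry.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:
            # No vocabulary entry matches: emit the single character as a piece.
            end = start + 1
        pieces.append(word[start:end])
        start = end
    return pieces


def substitute(template, name):
    """Fill the hypothetical [NAME] placeholder in a context template."""
    return template.replace("[NAME]", name)


def group_by_token_length(names):
    """Bucket names by the number of subword pieces they are split into."""
    groups = defaultdict(list)
    for name in names:
        groups[len(toy_tokenize(name))].append(name)
    return dict(groups)


if __name__ == "__main__":
    names = ["Nancy", "Nichelle"]
    for name in names:
        print(substitute("[NAME] went to the store.", name), toy_tokenize(name))
    print(group_by_token_length(names))
```

Grouping names by piece count in this way is one possible route to comparing names with different demographic attributes at the same tokenization length, or names with the same attributes at different lengths.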


