Evaluation of African American Language Bias in Natural Language Generation

05/23/2023
by Nicholas Deas, et al.
We evaluate how well LLMs understand African American Language (AAL) in comparison to their performance on White Mainstream English (WME), the encouraged "standard" form of English taught in American classrooms. We measure LLM performance using automatic metrics and human judgments for two tasks: a counterpart generation task, where a model generates AAL (or WME) given WME (or AAL), and a masked span prediction (MSP) task, where models predict a phrase that was removed from their input. Our contributions include: (1) evaluation of six pre-trained, large language models on the two language generation tasks; (2) a novel dataset of AAL text from multiple contexts (social media, hip-hop lyrics, focus groups, and linguistic interviews) with human-annotated counterparts in WME; and (3) documentation of model performance gaps that suggest bias and identification of trends in lack of understanding of AAL features.
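To make the masked span prediction setup concrete, here is a minimal sketch of how an MSP input might be built and a prediction checked. It is illustrative only: the whitespace tokenization, the `<mask>` placeholder, the helper names `mask_span` and `exact_match`, and exact-match scoring are all assumptions made for this example, not the paper's actual pipeline or metrics.

```python
def mask_span(tokens, start, length, mask_token="<mask>"):
    """Replace tokens[start:start+length] with a single mask token.

    Returns the masked token list and the removed span (the prediction target).
    The mask token string is a hypothetical placeholder.
    """
    if length <= 0 or start < 0 or start + length > len(tokens):
        raise ValueError("span must lie inside the token list")
    target = tokens[start:start + length]
    masked = tokens[:start] + [mask_token] + tokens[start + length:]
    return masked, target


def exact_match(predicted, target):
    """Case-insensitive exact match between a predicted span and the target.

    A stand-in scoring rule for illustration; real evaluations typically
    use softer automatic metrics and human judgments.
    """
    return [t.lower() for t in predicted] == [t.lower() for t in target]


# Naive whitespace tokenization of a neutral example sentence.
tokens = "the model predicts a phrase removed from its input".split()
masked, target = mask_span(tokens, start=3, length=2)
print(" ".join(masked))
print(target)
```

A model under evaluation would receive the masked text and return a candidate span, which is then compared against `target`. Comparing scores on AAL inputs with scores on their WME counterparts is what would expose a performance gap.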

