Racial Bias in Hate Speech and Abusive Language Detection Datasets

05/29/2019
by Thomas Davidson, et al.

Technologies for abusive language detection are being developed and applied with little consideration of their potential biases. We examine racial bias in five different sets of Twitter data annotated for hate speech and abusive language. We train classifiers on these datasets and compare their predictions on tweets written in African-American English with their predictions on tweets written in Standard American English. The results show evidence of systematic racial bias in all datasets: classifiers trained on them tend to predict that tweets written in African-American English are abusive at substantially higher rates. If these abusive language detection systems are used in the field, they will therefore have a disproportionate negative impact on African-American social media users. Consequently, these systems may discriminate against the groups who are often the targets of the abuse we are trying to detect.
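The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the keyword-based stand-in classifier, and the placeholder tweets are all hypothetical, and the sketch only shows the general idea of comparing how often a classifier flags tweets from two dialect groups.

```python
from typing import Callable, Sequence, Tuple


def flag_rate(predict: Callable[[str], bool], tweets: Sequence[str]) -> float:
    """Return the fraction of tweets that the classifier flags as abusive."""
    if not tweets:
        raise ValueError("need at least one tweet")
    return sum(1 for tweet in tweets if predict(tweet)) / len(tweets)


def rate_disparity(
    predict: Callable[[str], bool],
    group_a: Sequence[str],
    group_b: Sequence[str],
) -> Tuple[float, float, float]:
    """Return (rate for group A, rate for group B, ratio A/B).

    A ratio well above 1 means the classifier flags group A more often.
    """
    rate_a = flag_rate(predict, group_a)
    rate_b = flag_rate(predict, group_b)
    ratio = rate_a / rate_b if rate_b > 0 else float("inf")
    return rate_a, rate_b, ratio


def toy_predict(text: str) -> bool:
    """Hypothetical stand-in for a trained classifier (assumption, not from the paper)."""
    return "flagword" in text.lower()


# Placeholder data standing in for tweets from two dialect groups (assumption).
group_a_tweets = ["t1 flagword", "t2", "t3 flagword", "t4"]
group_b_tweets = ["t5", "t6", "t7 flagword", "t8"]

rate_a, rate_b, ratio = rate_disparity(toy_predict, group_a_tweets, group_b_tweets)
print(f"group A: {rate_a:.2f}, group B: {rate_b:.2f}, ratio: {ratio:.2f}")
```

In a real study, `toy_predict` would be replaced by a classifier trained on one of the annotated datasets, and the observed difference in rates would need a significance test before any conclusion is drawn.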

research
08/14/2020

Hate Speech Detection and Racial Bias Mitigation in Social Media based on BERT model

Disparate biases associated with datasets and trained classifiers in hat...
research
05/12/2020

Intersectional Bias in Hate Speech and Abusive Language Datasets

Algorithms are widely applied to detect hate speech and abusive language...
research
05/26/2020

Examining Racial Bias in an Online Abuse Corpus with Structural Topic Modeling

We use structural topic modeling to examine racial bias in data collecte...
research
10/28/2021

Hate Speech Classifiers Learn Human-Like Social Stereotypes

Social stereotypes negatively impact individuals' judgements about diffe...
research
10/07/2022

A Keyword Based Approach to Understanding the Overpenalization of Marginalized Groups by English Marginal Abuse Models on Twitter

Harmful content detection models tend to have higher false positive rate...
research
10/04/2016

A Computational Approach to Automatic Prediction of Drunk Texting

Alcohol abuse may lead to unsociable behavior such as crime, drunk drivi...
research
09/28/2021

Agreeing to Disagree: Annotating Offensive Language Datasets with Annotators' Disagreement

Since state-of-the-art approaches to offensive language detection rely o...
