Demoting Racial Bias in Hate Speech Detection

05/25/2020
by Mengzhou Xia, et al.

In current hate speech datasets, annotators' perceptions of toxicity correlate highly with signals of African American English (AAE). This bias in annotated training data, together with the tendency of machine learning models to amplify it, causes current hate speech classifiers to often mislabel AAE text as abusive, offensive, or hate speech, at a high false positive rate. In this paper, we use adversarial training to mitigate this bias, introducing a hate speech classifier that learns to detect toxic sentences while demoting confounds corresponding to AAE texts. Experimental results on a hate speech dataset and an AAE dataset suggest that our method is able to substantially reduce the false positive rate for AAE text while only minimally affecting the performance of hate speech classification.
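The false positive rate mentioned in the abstract can be made concrete with a small sketch. This is not the paper's evaluation code. It assumes each example carries a gold label (1 = toxic, 0 = non-toxic), a classifier prediction, and a dialect group tag. The function name, the group tags, and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def false_positive_rate_by_group(labels, predictions, groups):
    """Per-group false positive rate: FP / (FP + TN).

    Only examples whose gold label is non-toxic (0) count toward the rate.
    Groups with no non-toxic examples are omitted from the result.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for gold, predicted, group in zip(labels, predictions, groups):
        if gold == 0:
            negatives[group] += 1
            if predicted == 1:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items()}


# Hypothetical toy data, not taken from any dataset in the paper.
labels      = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
predictions = [1, 1, 0, 0, 1, 1, 0, 0, 0, 1]
groups      = ["aae"] * 5 + ["other"] * 5

rates = false_positive_rate_by_group(labels, predictions, groups)
print(rates)
```

A gap between the per-group rates on non-toxic text is the kind of disparity the abstract describes. Comparing these rates before and after a mitigation step is one simple way to check whether the gap has narrowed.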

research
01/15/2018

tau-FPL: Tolerance-Constrained Learning in Linear Time

Learning a classifier with control on the false-positive rate plays a cr...
research
10/07/2022

A Keyword Based Approach to Understanding the Overpenalization of Marginalized Groups by English Marginal Abuse Models on Twitter

Harmful content detection models tend to have higher false positive rate...
research
08/30/2021

An Enhanced Machine Learning Topic Classification Methodology for Cybersecurity

In this research, we use user defined labels from three internet text so...
research
11/02/2021

Towards Text-based Phishing Detection

This paper reports on an experiment into text-based phishing detection u...
research
05/06/2022

Necessity and Sufficiency for Explaining Text Classifiers: A Case Study in Hate Speech Detection

We present a novel feature attribution method for explaining text classi...
research
10/26/2021

Coherent False Seizure Prediction in Epilepsy, Coincidence or Providence?

Seizure forecasting using machine learning is possible, but the performa...