Automated Hate Speech Detection and the Problem of Offensive Language

03/11/2017
by Thomas Davidson, et al.

A key challenge for automatic hate-speech detection on social media is the separation of hate speech from other instances of offensive language. Lexical detection methods tend to have low precision because they classify all messages containing particular terms as hate speech, and previous work using supervised learning has failed to distinguish between the two categories. We use a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords. We then use crowd-sourcing to label a sample of these tweets into three categories: those containing hate speech, those containing only offensive language, and those containing neither. We train a multi-class classifier to distinguish between these categories. Close analysis of the predictions and the errors shows when we can reliably separate hate speech from other offensive language and when this differentiation is more difficult. We find that racist and homophobic tweets are more likely to be classified as hate speech, but that sexist tweets are generally classified as offensive. Tweets without explicit hate keywords are also more difficult to classify.
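To make the precision claim about lexical detection concrete, the following is a minimal sketch and not the authors' method. It assumes a hypothetical two-term placeholder lexicon (`LEXICON`) and a handful of made-up labeled messages (`SAMPLE`), none of which come from the paper. It flags every message containing a lexicon term and measures how many flagged messages are actually labeled as hate speech.

```python
from typing import List, Tuple

# Hypothetical placeholder lexicon (assumption); a real lexicon holds actual terms.
LEXICON = {"slur_a", "slur_b"}

# Made-up toy messages with gold labels from the three categories in the
# abstract: "hate", "offensive", "neither". This is NOT data from the paper.
SAMPLE: List[Tuple[str, str]] = [
    ("you are a slur_a and should leave", "hate"),
    ("lol my friend called me a slur_a again", "offensive"),
    ("that slur_b song is stuck in my head", "offensive"),
    ("nice weather today", "neither"),
    ("those people do not belong here", "hate"),  # no explicit keyword
]


def lexical_flag(text: str) -> bool:
    """Flag a message as hate speech if any token is in the lexicon."""
    return any(token in LEXICON for token in text.lower().split())


def precision_recall(sample: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Precision and recall of the lexical detector for the 'hate' class."""
    flagged = [(text, label) for text, label in sample if lexical_flag(text)]
    true_pos = sum(1 for _, label in flagged if label == "hate")
    total_hate = sum(1 for _, label in sample if label == "hate")
    precision = true_pos / len(flagged) if flagged else 0.0
    recall = true_pos / total_hate if total_hate else 0.0
    return precision, recall


precision, recall = precision_recall(SAMPLE)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

On this toy sample, the detector flags three messages but only one is labeled as hate speech, and it misses the hateful message that contains no lexicon term. Both effects mirror the difficulties the abstract describes, though the numbers here are purely illustrative.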
