All You Need is "Love": Evading Hate-speech Detection

08/28/2018
by Tommi Gröndahl, et al.

With the spread of social networks and their unfortunate use for hate speech, automatic detection of the latter has become a pressing problem. In this paper, we reproduce seven state-of-the-art hate speech detection models from prior work, and show that they perform well only when tested on the same type of data they were trained on. Based on these results, we argue that for successful hate speech detection, model architecture is less important than the type of data and labeling criteria. We further show that all proposed detection techniques are brittle against adversaries who can (automatically) insert typos, change word boundaries or add innocuous words to the original hate speech. A combination of these methods is also effective against Google Perspective, a cutting-edge solution from industry. Our experiments demonstrate that adversarial training does not completely mitigate the attacks, and using character-level features makes the models systematically more attack-resistant than using word-level features.
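The three perturbation types named in the abstract (typos, changed word boundaries, appended innocuous words) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the adjacent-character swap used as the "typo", the choice to remove all whitespace, and the default word "love" (borrowed from the paper's title) are all assumptions made for illustration.

```python
import random


def insert_typo(text, rng):
    """Swap two adjacent inner characters of one randomly chosen word.

    Assumption: a "typo" is modeled as a single transposition that keeps
    the first and last character of the word in place. Words shorter than
    four characters are left untouched.
    """
    words = text.split()
    candidates = [i for i, w in enumerate(words) if len(w) >= 4]
    if not candidates:
        return text
    i = rng.choice(candidates)
    w = words[i]
    j = rng.randrange(1, len(w) - 2)  # j and j + 1 are both inner positions
    words[i] = w[:j] + w[j + 1] + w[j] + w[j + 2:]
    return " ".join(words)


def remove_word_boundaries(text):
    """Change word boundaries by deleting all whitespace (one possible variant)."""
    return "".join(text.split())


def append_innocuous_word(text, word="love"):
    """Append a benign word; the default "love" is an illustrative choice."""
    return f"{text} {word}"


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the typo position is reproducible
    sample = "this is an example sentence"
    print(insert_typo(sample, rng))
    print(remove_word_boundaries(sample))
    print(append_innocuous_word(remove_word_boundaries(sample)))
```

The last call chains two perturbations, in the spirit of the combined attack the abstract mentions. The exact combination the authors evaluated is not given in this text.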

research
06/22/2021

Statistical Analysis of Perspective Scores on Hate Speech Detection

Hate speech detection has become a hot topic in recent years due to the ...
research
06/01/2021

Improving Automatic Hate Speech Detection with Multiword Expression Features

The task of automatically detecting hate speech in social media is gaini...
research
06/24/2021

Hate Speech Detection in Clubhouse

With the rise of voice chat rooms, a gigantic resource of data can be ex...
research
09/24/2022

Joint Speech Activity and Overlap Detection with Multi-Exit Architecture

Overlapped speech detection (OSD) is critical for speech applications in...
research
10/05/2019

Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation

This work addresses the challenge of hate speech detection in Internet m...
research
07/12/2019

Automated Word Stress Detection in Russian

In this study we address the problem of automated word stress detection ...
research
10/20/2014

Supervised mid-level features for word image representation

This paper addresses the problem of learning word image representations:...
