A study of text representations in Hate Speech Detection

02/08/2021
by Chrysoula Themeli, et al.

The pervasiveness of the Internet and social media has enabled the rapid and anonymous spread of Hate Speech content on microblogging platforms such as Twitter. Current EU and US legislation against hateful language, together with the large amount of data produced on these platforms, has made automatic tools a necessary component of the Hate Speech detection pipeline. In this study, we examine the performance of several diverse text representation techniques, paired with multiple classification algorithms, on the automatic Hate Speech detection and abusive language discrimination task. We perform an experimental evaluation on binary and multiclass datasets, paired with significance testing. Our results show that simple hate-keyword frequency features (BoW) work best, followed by pre-trained word embeddings (GloVe) and N-gram graphs (NGGs), a graph-based representation that produced efficient, very low-dimensional but rich features for this task. A combination of these representations paired with Logistic Regression or 3-layer neural network classifiers achieved the best detection performance in terms of micro and macro F-measure.
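To make the two central ideas of the abstract concrete, the following is a minimal sketch, not the authors' implementation, of a hate-keyword frequency (BoW) feature vector and of micro- and macro-averaged F-measure for single-label classification. The lexicon entries, the tokenization rule, and the function names `keyword_bow` and `f_measures` are assumptions made for illustration; the keyword list and preprocessing used in the study are not given here.

```python
import re
from collections import Counter

# Hypothetical placeholder lexicon; the study's actual keyword list is not given here.
KEYWORDS = ["slur_a", "slur_b", "insult_c"]


def keyword_bow(text, keywords=KEYWORDS):
    """Return one count per lexicon keyword (a keyword-frequency BoW vector)."""
    # Assumed tokenization: lowercase, then runs of letters, digits, "_" or "'".
    tokens = re.findall(r"[a-z0-9_']+", text.lower())
    counts = Counter(tokens)
    return [counts[k] for k in keywords]


def _f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f_measures(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0, 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    # Macro: average the per-class F1 scores, so every class weighs the same.
    macro = sum(_f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes, then compute a single F1.
    micro = _f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro


if __name__ == "__main__":
    print(keyword_bow("Slur_a and slur_a, plus insult_c!"))
    print(f_measures(["hate", "hate", "clean", "clean"],
                     ["hate", "clean", "clean", "clean"]))
```

Macro-averaging gives each class equal weight, so it is more sensitive than micro-averaging to errors on a rare class, which is why reporting both is informative on imbalanced data. In a fuller pipeline, vectors such as those returned by `keyword_bow` would be passed to a classifier such as Logistic Regression.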

research
03/14/2021

DeepHate: Hate Speech Detection via Multi-Faceted Text Representations

Online hate speech is an important issue that breaks the cohesiveness of...
research
06/29/2021

Hate speech detection using static BERT embeddings

With increasing popularity of social media platforms hate speech is emer...
research
10/25/2020

CRAB: Class Representation Attentive BERT for Hate Speech Identification in Social Media

In recent years, social media platforms have hosted an explosion of hate...
research
04/16/2019

An Empirical Evaluation of Text Representation Schemes on Multilingual Social Web to Filter the Textual Aggression

This paper attempts to study the effectiveness of text representation sch...
research
10/18/2021

Contextual Hate Speech Detection in Code Mixed Text using Transformer Based Approaches

In the recent past, social media platforms have helped people in connect...
research
03/14/2018

Challenges in Discriminating Profanity from Hate Speech

In this study we approach the problem of distinguishing general profanit...
research
06/01/2021

Improving Automatic Hate Speech Detection with Multiword Expression Features

The task of automatically detecting hate speech in social media is gaini...
