Predicting the Type and Target of Offensive Posts in Social Media

02/25/2019
by Marcos Zampieri, et al.

As offensive content has become pervasive in social media, there has been much research on identifying potentially offensive messages. Previous work in this area, however, did not consider the problem as a whole, but rather focused on detecting very specific types of offensive content, e.g., hate speech, cyberbullying, or cyber-aggression. In contrast, here we target several different kinds of offensive content. In particular, we propose to model the task hierarchically, identifying the type and the target of offensive messages in social media. We use the Offensive Language Identification Dataset (OLID), a new dataset with a fine-grained three-layer annotation scheme compiled specifically for this purpose. OLID, which we make publicly available, contains tweets annotated for offensive content. We discuss the main similarities and differences of this dataset compared to other datasets for hate speech identification, aggression detection, and similar tasks. We also evaluate the data with a number of classification methods for this task.
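To make the hierarchical framing concrete, the following is a minimal sketch of how a three-layer prediction could be cascaded: first whether a post is offensive, then whether the offense is targeted, then who the target is. The label codes (NOT/OFF, UNT/TIN, IND/GRP/OTH) follow the published OLID scheme and are not spelled out in the abstract above. The function `predict_hierarchical`, the `HierarchicalLabel` type, and the keyword-based toy classifiers are hypothetical placeholders for illustration, not the classification methods evaluated by the authors.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Label codes from the published OLID scheme (assumed; not listed in the abstract).
LEVEL_A = ("NOT", "OFF")         # offensive or not
LEVEL_B = ("UNT", "TIN")         # untargeted vs. targeted insult/threat
LEVEL_C = ("IND", "GRP", "OTH")  # individual, group, other


@dataclass(frozen=True)
class HierarchicalLabel:
    """Hypothetical container; lower levels stay None when not applicable."""
    offensive: str
    targeted: Optional[str] = None
    target: Optional[str] = None


Classifier = Callable[[str], str]


def predict_hierarchical(text: str, clf_a: Classifier,
                         clf_b: Classifier, clf_c: Classifier) -> HierarchicalLabel:
    """Run each level only if the level above it makes it applicable."""
    a = clf_a(text)
    if a != "OFF":
        return HierarchicalLabel(offensive=a)
    b = clf_b(text)
    if b != "TIN":
        return HierarchicalLabel(offensive=a, targeted=b)
    return HierarchicalLabel(offensive=a, targeted=b, target=clf_c(text))


# Toy keyword rules standing in for trained models (illustration only).
def toy_a(text: str) -> str:
    return "OFF" if "stupid" in text.lower().split() else "NOT"


def toy_b(text: str) -> str:
    words = text.lower().split()
    return "TIN" if ("you" in words or "they" in words) else "UNT"


def toy_c(text: str) -> str:
    words = text.lower().split()
    if "you" in words:
        return "IND"
    if "they" in words:
        return "GRP"
    return "OTH"


if __name__ == "__main__":
    for post in ["have a nice day", "this is stupid",
                 "you are stupid", "they are stupid"]:
        print(post, "->", predict_hierarchical(post, toy_a, toy_b, toy_c))
```

In this sketch each level is an independent callable, so any of the toy rules can be swapped for a trained model without changing the cascade logic.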
