Measuring the Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis

by Björn Ross et al.

Some users of social media are spreading racist, sexist, and otherwise hateful content. For the purpose of training a hate speech detection system, the reliability of the annotations is crucial, but there is no universally agreed-upon definition of hate speech. We collected potentially hateful messages and asked two groups of internet users to determine whether they were hate speech, to decide whether they should be banned, and to rate their degree of offensiveness. One of the groups was shown a definition prior to completing the survey. We aimed to assess whether hate speech can be annotated reliably, and the extent to which existing definitions are in accordance with subjective ratings. Our results indicate that showing users a definition caused them to partially align their own opinion with the definition but did not improve reliability, which was very low overall. We conclude that the presence of hate speech should perhaps not be considered a binary yes-or-no decision, and that raters need more detailed instructions for the annotation.
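The abstract does not say which statistic was used to quantify reliability. As an illustration only, the sketch below computes Krippendorff's alpha for nominal labels, one common measure of agreement among raters, which is assumed here and not taken from the text above. The function name `krippendorff_alpha_nominal` and the toy ratings are hypothetical and are not data from the study.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    `units` is a list of lists: one inner list per message, holding the
    labels given by the raters who annotated that message.
    (Hypothetical helper for illustration; not the authors' code.)
    """
    # Messages with fewer than two ratings contribute no rater pairs.
    units = [u for u in units if len(u) >= 2]

    # Coincidence matrix: each ordered pair of ratings within a message
    # is weighted by 1 / (number of ratings for that message - 1).
    coincidences = Counter()
    for ratings in units:
        m = len(ratings)
        for a, b in permutations(ratings, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per label and the overall number of pairable ratings.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    # Observed and expected disagreement (off-diagonal mass only).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)

    if expected == 0:
        return 1.0  # only one label was ever used, so no disagreement is possible
    return 1.0 - observed / expected


# Toy ratings (invented): two raters labelling four messages.
toy_ratings = [
    ["hate", "hate"],
    ["hate", "not_hate"],
    ["not_hate", "not_hate"],
    ["not_hate", "hate"],
]
alpha = krippendorff_alpha_nominal(toy_ratings)
```

A value of 1 indicates perfect agreement, values near 0 indicate agreement no better than chance, and negative values indicate systematic disagreement. Under this measure, "very low" reliability would correspond to values close to 0.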


