Annotating Antisemitic Online Content. Towards an Applicable Definition of Antisemitism

09/29/2019
by   Gunther Jikeli, et al.

Online antisemitism is hard to quantify. How can it be measured on rapidly growing and diversifying platforms? Is the number of antisemitic messages rising in proportion to other content, or is the share of antisemitic content increasing? How does such content travel, and what are the reactions to it? How widespread is online Jew-hatred beyond infamous websites, fora, and closed social media groups? At the root of many of these methodological questions is the challenge of finding a consistent way to identify diverse manifestations of antisemitism in large datasets. What is more, a clear definition is essential for building an annotated corpus that can serve as a gold standard for machine learning programs that detect antisemitic online content. We argue that antisemitic content has distinct features that are not adequately captured by generic annotation approaches, such as those for hate speech, abusive language, or toxic language. We discuss our experiences with annotating samples from our dataset, which draws on a ten percent random sample of public tweets from Twitter. We show that the widely used definition of antisemitism by the International Holocaust Remembrance Alliance can be applied successfully to online messages if inferences are spelled out in detail and if the focus is not on the intent of the disseminator but on the message in its context. However, annotators have to be highly trained and knowledgeable about current events to understand each tweet's underlying message within its context. The tentative results of annotating two of our small but randomly chosen samples suggest that more than ten percent of conversations on Twitter about Jews and Israel are antisemitic or probably antisemitic. They also show that, at least in conversations about Jews, an equally high number of tweets denounce antisemitism, although these conversations do not necessarily coincide.

research
04/28/2023

Antisemitic Messages? A Guide to High-Quality Annotation and a Labeled Dataset of Tweets

One of the major challenges in automatic hate speech detection is the la...
research
03/27/2021

Abuse is Contextual, What about NLP? The Role of Context in Abusive Language Annotation and Detection

The datasets most widely used for abusive language detection contain lis...
research
04/21/2020

That Message Went Viral?! Exploratory Analytics and Sentiment Analysis into the Propagation of Tweets

Information exchange and message diffusion have moved from traditional m...
research
03/13/2020

WAC: A Corpus of Wikipedia Conversations for Online Abuse Detection

With the spread of online social networks, it is more and more difficult...
research
02/20/2016

Burstiness Scale: a highly parsimonious model for characterizing random series of events

The problem to accurately and parsimoniously characterize random series ...
research
01/27/2017

Measuring the Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis

Some users of social media are spreading racist, sexist, and otherwise h...
research
05/20/2019

Abusive Language Detection in Online Conversations by Combining Content-and Graph-based Features

In recent years, online social networks have allowed worldwide users to ...
