Are Chess Discussions Racist? An Adversarial Hate Speech Data Set

11/20/2020
by Rupak Sarkar, et al.

On June 28, 2020, while presenting a chess podcast on Grandmaster Hikaru Nakamura, Antonio Radić had his YouTube channel blocked because it contained "harmful and dangerous" content. YouTube gave no more specific reason, and the channel was reinstated within 24 hours. However, Radić speculated that, given the current political situation, a reference to "black against white", albeit in the context of chess, earned him this temporary ban. In this paper, using a substantial corpus of 681,995 comments on 8,818 YouTube videos hosted by five highly popular chess-focused YouTube channels, we ask the following research question: how robust are off-the-shelf hate speech classifiers to out-of-domain adversarial examples? We release a data set of 1,000 annotated comments in which existing hate speech classifiers misclassified benign chess discussions as hate speech. We conclude with an intriguing analogy result on racial bias, with our findings pointing to the broader challenge of color polysemy.
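The failure mode described above can be illustrated with a toy sketch. This is not one of the classifiers evaluated in the paper: the word lists, the function names `naive_flag` and `false_positive_rate`, and the sample comments are all hypothetical and were invented for illustration. It only shows how a purely lexical rule can mistake routine chess vocabulary for hostile language about color.

```python
import re

# Hypothetical trigger lexicons (assumptions, not taken from the paper):
# color words and conflict words that are routine in chess commentary.
COLOR_TERMS = {"black", "white"}
CONFLICT_TERMS = {
    "attack", "attacks", "kill", "kills",
    "threat", "threatens", "destroy", "destroys",
}


def naive_flag(comment: str) -> bool:
    """Flag a comment that contains both a color term and a conflict term."""
    tokens = set(re.findall(r"[a-z]+", comment.lower()))
    return bool(tokens & COLOR_TERMS) and bool(tokens & CONFLICT_TERMS)


def false_positive_rate(benign_comments: list[str]) -> float:
    """Fraction of known-benign comments that the naive rule flags."""
    if not benign_comments:
        return 0.0
    flagged = sum(naive_flag(c) for c in benign_comments)
    return flagged / len(benign_comments)


# Invented benign chess comments, used only to exercise the rule.
CHESS_COMMENTS = [
    "White attacks the black king on the kingside.",
    "Black threatens mate in two.",
    "Great endgame technique from Nakamura!",
    "The bishop pair gives white a small edge.",
]

if __name__ == "__main__":
    for comment in CHESS_COMMENTS:
        print(naive_flag(comment), "|", comment)
    print("false positive rate:", false_positive_rate(CHESS_COMMENTS))
```

Every comment in this sample is benign, so each flag is a false positive. A real evaluation would swap `naive_flag` for the prediction function of an actual classifier and run it over annotated comments.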
