Vicarious Offense and Noise Audit of Offensive Speech Classifiers

This paper examines social web content moderation from two key perspectives: automated methods (machine moderators) and human evaluators (human moderators). We conduct a noise audit at an unprecedented scale using nine machine moderators, each trained on a well-known offensive speech data set and evaluated on a corpus sampled from 92 million YouTube comments discussing a multitude of issues relevant to US politics. We introduce a first-of-its-kind data set of vicarious offense. We ask annotators: (1) whether they find a given social media post offensive; and (2) how offensive annotators with different political beliefs would find the same content. Our experiments with machine moderators reveal that moderation outcomes vary wildly across different machine moderators. Our experiments with human moderators suggest that (1) political leanings considerably affect first-person offense perspective; (2) Republicans are the worst predictors of vicarious offense; (3) predicting vicarious offense for the Republicans is more challenging than predicting vicarious offense for the Independents and the Democrats; and (4) disagreement across political identity groups considerably increases when sensitive issues such as reproductive rights or gun control/rights are discussed. Both experiments suggest that offense is, indeed, highly subjective, and they raise important questions concerning content moderation practices.


