Is Your Toxicity My Toxicity? Exploring the Impact of Rater Identity on Toxicity Annotation

05/01/2022
by Nitesh Goyal, et al.

Machine learning models are commonly used to detect toxicity in online conversations. These models are trained on datasets annotated by human raters. We explore how raters' self-described identities impact how they annotate toxicity in online comments. We first define the concept of specialized rater pools: rater pools formed based on raters' self-described identities, rather than at random. For this study, we formed three such pools of raters from the U.S.: raters who identify as African American, raters who identify as LGBTQ, and raters who identify as neither. Each pool annotated the same set of comments, which contains many references to these identity groups. We found that rater identity is a statistically significant factor in how raters annotate toxicity in identity-related comments. Using preliminary content analysis, we examined the comments with the most disagreement between rater pools and found nuanced differences in their toxicity annotations. Next, we trained models on the annotations from each rater pool and compared the scores of these models on comments from several test sets. Finally, we discuss how recruiting raters who self-identify with the subjects of comments can yield more inclusive machine learning models and more nuanced ratings than those produced by randomly assigned raters.
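The abstract outlines a pipeline: test whether pool membership is associated with toxicity labels, then train and compare one model per pool. Below is a minimal sketch of that pipeline. The column names, toy data, chi-square test of independence, and TF-IDF plus logistic-regression classifier are all illustrative assumptions, not the paper's actual statistical method or model architecture.

```python
# Hypothetical sketch of the pipeline described in the abstract.
# The toy data, the chi-square test, and the TF-IDF + logistic
# regression classifier are stand-ins, not the paper's method.
import pandas as pd
from scipy.stats import chi2_contingency
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy annotations: each row is one rater's judgment of one comment.
annotations = pd.DataFrame({
    "comment":  ["c1", "c1", "c1", "c2", "c2", "c2"] * 10,
    "pool":     ["african_american", "lgbtq", "control"] * 20,
    "is_toxic": [1, 1, 0, 0, 0, 1] * 10,
})

# Step 1: test whether rater pool and toxicity label are independent.
table = pd.crosstab(annotations["pool"], annotations["is_toxic"])
chi2, p_value, dof, _ = chi2_contingency(table)
print(f"chi2={chi2:.2f}, p={p_value:.4f}")

# Step 2: train one toxicity classifier per specialized rater pool.
texts = {"c1": "first example comment", "c2": "second example comment"}
models = {}
for pool, group in annotations.groupby("pool"):
    docs = group["comment"].map(texts)
    vectorizer = TfidfVectorizer().fit(docs)
    clf = LogisticRegression().fit(vectorizer.transform(docs),
                                   group["is_toxic"])
    models[pool] = (vectorizer, clf)

# Step 3: compare the pools' models on a held-out comment.
held_out = ["a new comment that references an identity group"]
for pool, (vec, clf) in models.items():
    score = clf.predict_proba(vec.transform(held_out))[0, 1]
    print(f"{pool}: P(toxic) = {score:.2f}")
```

Because every pool annotates the same comments, the per-pool models differ only through the labels each pool assigned; comparing their scores on a shared test set is what isolates the effect of rater identity.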

Related Research

Empirical Analysis of Multi-Task Learning for Reducing Model Bias in Toxic Comment Detection (09/21/2019)
With the recent rise of toxicity in online conversations on social media...

Towards Equal Gender Representation in the Annotations of Toxic Language Detection (06/04/2021)
Classifiers tend to propagate biases present in the data on which they a...

Reading Between the Demographic Lines: Resolving Sources of Bias in Toxicity Classifiers (06/29/2020)
The censorship of toxic comments is often left to the judgment of imperf...

Predicting Different Types of Subtle Toxicity in Unhealthy Online Conversations (06/07/2021)
This paper investigates the use of machine learning models for the class...

Like trainer, like bot? Inheritance of bias in algorithmic content moderation (07/05/2017)
The internet has become a central medium through which "networked public...

Model Cards for Model Reporting (10/05/2018)
Trained machine learning models are increasingly used to perform high-im...

Machine Learning Suites for Online Toxicity Detection (10/03/2018)
To identify and classify toxic online commentary, the modern tools of da...
