Reading Between the Demographic Lines: Resolving Sources of Bias in Toxicity Classifiers

06/29/2020
by Elizabeth Reichert, et al.

The censorship of toxic comments is often left to the judgment of imperfect models. Perspective API, a creation of Google technology incubator Jigsaw, is perhaps the most widely used toxicity classifier in industry; the model is employed by several online communities including The New York Times to identify and filter out toxic comments with the goal of preserving online safety. Unfortunately, Google's model tends to unfairly assign higher toxicity scores to comments containing words referring to the identities of commonly targeted groups (e.g., "woman," "gay," etc.) because these identities are frequently referenced in a disrespectful manner in the training data. As a result, comments generated by marginalized groups referencing their identities are often mistakenly censored. It is important to be cognizant of this unintended bias and strive to mitigate its effects. To address this issue, we have constructed several toxicity classifiers with the intention of reducing unintended bias while maintaining strong classification performance.
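
One way to make the unintended bias described above concrete is to compare false-positive rates on non-toxic comments that mention an identity term against those that do not. The following is a minimal sketch under stated assumptions, not the authors' evaluation method: the term list, the 0.5 threshold, the example comments, and the `biased_toy_scorer` stand-in are all hypothetical, and nothing here calls Perspective API.

```python
from typing import Callable, Iterable, List, Set, Tuple

# Hypothetical identity terms, taken from the two examples in the abstract.
IDENTITY_TERMS: Set[str] = {"woman", "gay"}


def mentions_identity(text: str, terms: Set[str] = IDENTITY_TERMS) -> bool:
    """Return True if any whitespace-separated token matches an identity term."""
    tokens = {tok.strip('.,!?";:()').lower() for tok in text.split()}
    return not tokens.isdisjoint(terms)


def false_positive_rate(scored: List[Tuple[float, bool]], threshold: float) -> float:
    """Share of non-toxic comments whose score reaches the threshold."""
    non_toxic_scores = [score for score, is_toxic in scored if not is_toxic]
    if not non_toxic_scores:
        return 0.0
    flagged = sum(score >= threshold for score in non_toxic_scores)
    return flagged / len(non_toxic_scores)


def identity_fpr_gap(
    examples: Iterable[Tuple[str, bool]],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the source
) -> float:
    """FPR on identity-mentioning comments minus FPR on all other comments."""
    with_identity: List[Tuple[float, bool]] = []
    without_identity: List[Tuple[float, bool]] = []
    for text, is_toxic in examples:
        bucket = with_identity if mentions_identity(text) else without_identity
        bucket.append((score_fn(text), is_toxic))
    return false_positive_rate(with_identity, threshold) - false_positive_rate(
        without_identity, threshold
    )


def biased_toy_scorer(text: str) -> float:
    """Hypothetical stand-in for a biased classifier: it scores any comment
    that mentions an identity term as highly toxic, regardless of intent."""
    if mentions_identity(text) or "idiot" in text.lower():
        return 0.9
    return 0.1


# Made-up labelled comments: (text, is_toxic).
EXAMPLES = [
    ("I am a gay man and proud of it", False),
    ("As a woman, I agree with this article", False),
    ("Nice weather today", False),
    ("You are an idiot", True),
]

gap = identity_fpr_gap(EXAMPLES, biased_toy_scorer)
print(f"False-positive rate gap: {gap:.2f}")
```

A gap near zero would suggest that non-toxic comments are flagged at similar rates whether or not they mention an identity term; a large positive gap is the pattern the abstract describes. Swapping `biased_toy_scorer` for a real model's scoring function would require a labelled evaluation set, which this sketch does not supply.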

