An Empirical Investigation of Learning from Biased Toxicity Labels

10/04/2021
by Neel Nanda, et al.

Collecting annotations from human raters often results in a trade-off between the quantity of labels one wishes to gather and the quality of these labels. As such, it is often only possible to gather a small number of high-quality labels. In this paper, we study how different training strategies can leverage a small dataset of human-annotated labels and a large but noisy dataset of synthetically generated labels (which exhibit bias against identity groups) to predict the toxicity of online comments. We evaluate the accuracy and fairness properties of these approaches, and the trade-offs between the two. Although initial training on all of the data followed by fine-tuning on clean data produces models with the highest AUC, no single strategy performs best across all fairness metrics.
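The two-stage strategy mentioned in the abstract (train on all of the data, then fine-tune on the clean subset) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the logistic-regression model, the function names `train`, `pretrain_then_finetune`, and `auc`, and the epoch counts and learning rates are hypothetical and are not taken from the paper, which does not specify its models or hyperparameters in the abstract.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(weights, data, epochs, lr):
    """Run plain SGD on the logistic loss.

    `data` is a list of (features, label) pairs with labels in {0, 1}.
    Returns a new weight list; the input weights are not modified.
    """
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for i, xi in enumerate(x):
                w[i] -= lr * (p - y) * xi
    return w


def pretrain_then_finetune(noisy, clean, dim):
    """Hypothetical two-stage schedule (illustrative hyperparameters)."""
    # Stage 1: initial training on all data, noisy and clean together.
    w = train([0.0] * dim, noisy + clean, epochs=5, lr=0.1)
    # Stage 2: fine-tune on the small clean set only, with a smaller step.
    return train(w, clean, epochs=5, lr=0.05)


def auc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In this sketch each example is a feature list whose first entry can serve as a bias term. Comparing `auc` for weights from `pretrain_then_finetune` against weights trained on only one of the two datasets mirrors, in miniature, the kind of comparison the abstract describes; it says nothing about the fairness metrics, which would need per-group evaluation.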

04/28/2023

HQP: A Human-Annotated Dataset for Detecting Online Propaganda

Online propaganda poses a severe threat to the integrity of societies. H...
04/15/2022

Fairly Accurate: Learning Optimal Accuracy vs. Fairness Tradeoffs for Hate Speech Detection

Recent work has emphasized the importance of balancing competing objecti...
01/06/2017

Learning From Noisy Large-Scale Datasets With Minimal Supervision

We present an approach to effectively use millions of images with noisy ...
05/02/2023

On the Impact of Data Quality on Image Classification Fairness

With the proliferation of algorithmic decision-making, increased scrutin...
05/24/2022

Beyond Impossibility: Balancing Sufficiency, Separation and Accuracy

Among the various aspects of algorithmic fairness studied in recent year...
06/29/2021

INN: A Method Identifying Clean-annotated Samples via Consistency Effect in Deep Neural Networks

In many classification problems, collecting massive clean-annotated data...
06/07/2023

3D Human Keypoints Estimation From Point Clouds in the Wild Without Human Labels

Training a 3D human keypoint detector from point clouds in a supervised ...