Who Decides if AI is Fair? The Labels Problem in Algorithmic Auditing

11/16/2021
by Abhilash Mishra et al.

Labelled "ground truth" datasets are routinely used to evaluate and audit AI algorithms applied in high-stakes settings. However, no widely accepted benchmarks exist for the quality of the labels in these datasets. We provide empirical evidence that label quality can significantly distort the results of algorithmic audits in real-world settings. Using data annotators of the kind typically hired by AI firms in India, we show that the fidelity of the ground-truth labels can create spurious differences in the measured performance of automatic speech recognition (ASR) systems between urban and rural populations. After a rigorous, albeit expensive, label-cleaning process, these disparities between groups disappear. Our findings highlight how trade-offs between label quality and data annotation costs can complicate algorithmic audits in practice. They also emphasize the need to develop consensus-driven, widely accepted benchmarks for label quality.
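To make the mechanism concrete, here is a minimal, hypothetical simulation (not code or data from the paper): if an audit's reference transcripts are noisier for one group than another, the audit can report a word-error-rate (WER) gap even when the ASR system makes errors at the same true rate for both groups. The vocabulary, error rates, and helper functions (`wer`, `corrupt`) are illustrative assumptions.

```python
# Illustrative sketch (hypothetical, not from the paper): noisy reference labels
# concentrated in one group can create a spurious audit disparity.
import random

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via word-level edit distance."""
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[len(r)][len(h)] / max(len(r), 1)

def corrupt(sentence: str, p: float, rng: random.Random) -> str:
    """Randomly replace words with probability p, mimicking annotation errors."""
    return " ".join(w if rng.random() > p else "<err>" for w in sentence.split())

rng = random.Random(0)
vocab = ["the", "market", "opens", "early", "price", "of", "rice", "fell", "today"]
utterances = [" ".join(rng.choices(vocab, k=8)) for _ in range(2000)]

# Assume the ASR makes the *same* true error rate for both groups (5% of words).
asr_output = [corrupt(u, 0.05, rng) for u in utterances]

# But the audit's "ground truth" is noisier for rural speech (10%) than urban (2%).
urban_refs = [corrupt(u, 0.02, rng) for u in utterances]
rural_refs = [corrupt(u, 0.10, rng) for u in utterances]

urban_wer = sum(wer(r, h) for r, h in zip(urban_refs, asr_output)) / len(utterances)
rural_wer = sum(wer(r, h) for r, h in zip(rural_refs, asr_output)) / len(utterances)
print(f"measured urban WER: {urban_wer:.3f}, measured rural WER: {rural_wer:.3f}")
# The audit reports a group disparity even though the underlying ASR is equally accurate.
```

In this toy setup the measured rural WER comes out roughly twice the urban WER purely because of annotation noise, which mirrors the paper's point that label fidelity, rather than model behaviour, can drive apparent disparities between groups.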

Related research

Generating the Ground Truth: Synthetic Data for Label Noise Research (09/08/2023)
Most real-world classification tasks suffer from label noise to some ext...

Deep Learning for Earth Image Segmentation based on Imperfect Polyline Labels with Annotation Errors (10/02/2020)
In recent years, deep learning techniques (e.g., U-Net, DeepLab) have ac...

Datasets for Portuguese Legal Semantic Textual Similarity: Comparing weak supervision and an annotation process approaches (05/29/2023)
The Brazilian judiciary has a large workload, resulting in a long time t...

Beyond Hard Labels: Investigating data label distributions (07/13/2022)
High-quality data is a key aspect of modern machine learning. However, l...

Automatic Dataset Augmentation Using Virtual Human Simulation (05/01/2019)
Virtual Human Simulation has been widely used for different purposes, su...

The 'Problem' of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation (11/04/2022)
Human variation in labeling is often considered noise. Annotation projec...

The Dataset Nutrition Label: A Framework To Drive Higher Data Quality Standards (05/09/2018)
Artificial intelligence (AI) systems built on incomplete or biased data ...
