Bringing the People Back In: Contesting Benchmark Machine Learning Datasets

07/14/2020
by Emily Denton, et al.

In response to algorithmic unfairness embedded in sociotechnical systems, significant attention has been focused on the contents of machine learning datasets, which have been shown to be biased towards white, cisgender, male, and Western data subjects. In contrast, comparatively little attention has been paid to the histories, values, and norms embedded in such datasets. In this work, we outline a research program - a genealogy of machine learning data - for investigating how and why these datasets have been created, what and whose values influence the choices of data to collect, and the contextual and contingent conditions of their creation. We describe the ways in which benchmark datasets in machine learning operate as infrastructure and pose four research questions for these datasets. This interrogation forces us to "bring the people back in" by aiding us in understanding the labor embedded in dataset construction, thereby presenting new avenues of contestation for other researchers encountering the data.


