Bugs in the Data: How ImageNet Misrepresents Biodiversity

08/24/2022
by Alexandra Sasha Luccioni, et al.

ImageNet-1k is a dataset often used for benchmarking machine learning (ML) models and evaluating tasks such as image recognition and object detection. Wild animals make up 27% of ImageNet-1k but, unlike humans and objects, these data have not been closely scrutinized. In the current paper, we analyze the 13,450 images from 269 classes that represent wild animals in the ImageNet-1k validation set, with the participation of expert ecologists. We find that many of the classes are ill-defined or overlapping, and that 12% of the images are incorrectly labeled, with some classes having >90% of images incorrect. We also find that both the wildlife-related labels and images included in ImageNet-1k present significant geographical and cultural biases, as well as ambiguities such as artificial animals, multiple species in the same image, or the presence of humans. Our findings highlight serious issues with the extensive use of this dataset for evaluating ML systems, the use of such algorithms in wildlife-related tasks, and more broadly the ways in which ML datasets are commonly created and curated.


