Identifying Statistical Bias in Dataset Replication

05/19/2020
by Logan Engstrom, et al.

Dataset replication is a useful tool for assessing whether improvements in test accuracy on a specific benchmark correspond to improvements in models' ability to generalize reliably. In this work, we present unintuitive yet significant ways in which standard approaches to dataset replication introduce statistical bias, skewing the resulting observations. We study ImageNet-v2, a replication of the ImageNet dataset on which models exhibit a significant (11-14%) drop in accuracy, even after controlling for a standard human-in-the-loop measure of data quality. We show that after correcting for the identified statistical bias, only an estimated 3.6% ± 1.5% of the original 11.7% ± 1.0% accuracy drop remains unaccounted for. We conclude with concrete recommendations for recognizing and avoiding bias in dataset replication. Code for our study is publicly available at http://github.com/MadryLab/dataset-replication-analysis.
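To make the headline figures concrete, the following minimal sketch works through the arithmetic implied by the abstract. It assumes both quoted values are percentage points of accuracy and uses only the point estimates, ignoring the stated ± uncertainties; the variable and function names are illustrative and do not come from the authors' code.

```python
# Point estimates quoted in the abstract, read here as percentage points
# of accuracy (an assumption; the +/- uncertainties are ignored).
original_drop = 11.7   # accuracy drop from ImageNet to ImageNet-v2
remaining_drop = 3.6   # drop still unaccounted for after bias correction


def explained_share(original: float, remaining: float) -> float:
    """Fraction of the original drop accounted for by the correction."""
    return (original - remaining) / original


explained_points = original_drop - remaining_drop
share = explained_share(original_drop, remaining_drop)
print(f"explained: {explained_points:.1f} points ({share:.0%} of the drop)")
```

Under these assumptions, roughly two thirds of the observed drop is attributed to the identified statistical bias; a fuller treatment would also propagate the reported uncertainties.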

06/11/2021

Cross-replication Reliability – An Empirical Approach to Interpreting Inter-rater Reliability

We present a new approach to interpreting IRR that is empirical and cont...
03/20/2019

Statistical Methods for Replicability Assessment

Large-scale replication studies like the Reproducibility Project: Psycho...
01/15/2018

Conceptualizing and Evaluating Replication Across Domains of Behavioral Research

We discuss the authors' conceptualization of replication, in particular ...
09/16/2020

The assessment of replication success based on relative effect size

Replication studies are increasingly conducted to confirm original findi...
06/13/2018

Enabling End-To-End Machine Learning Replicability: A Case Study in Educational Data Mining

The use of machine learning techniques has expanded in education researc...
02/23/2022

When do GANs replicate? On the choice of dataset size

Do GANs replicate training images? Previous studies have shown that GANs...
08/06/2015

Replication and Generalization of PRECISE

This report describes an initial replication study of the PRECISE system...

Code Repositories

dataset-replication-analysis
