MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts

02/14/2022
by Weixin Liang, et al.

Understanding how machine learning models perform across diverse data distributions is critical for reliable applications. Motivated by this, there is a growing focus on curating benchmark datasets that capture distribution shifts. While valuable, existing benchmarks are limited: many contain only a small number of shifts, and they lack systematic annotation of what differs between shifts. To address this challenge, we present MetaShift, a collection of 12,868 sets of natural images across 410 classes. We construct MetaShift by leveraging the natural heterogeneity of Visual Genome and its annotations. The key idea is to cluster images using their metadata, which provides a context for each image (e.g., "cats with cars" or "cats in bathrooms") that represents a distinct data distribution. MetaShift has two important benefits: first, it contains orders of magnitude more natural data shifts than previously available; second, it provides explicit explanations of what is unique about each of its data sets, along with a distance score that measures the amount of distribution shift between any two of them. We demonstrate the utility of MetaShift by benchmarking several recent proposals for training models that are robust to data shifts. We find that simple empirical risk minimization performs best when shifts are moderate, and that no method has a systematic advantage under large shifts. We also show how MetaShift can help visualize conflicts between data subsets during model training.
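The abstract describes two mechanics: grouping images into subsets by their metadata context, and scoring the distribution shift between any two subsets. As an illustrative sketch only (the abstract does not specify MetaShift's actual distance metric), the snippet below groups toy feature vectors by a hypothetical context tag and scores shift as the Euclidean distance between subset means; the context names and 4-dimensional "features" are invented for demonstration.

```python
import numpy as np
from collections import defaultdict

def group_by_context(images):
    """Group image feature vectors by their metadata context tag
    (e.g. "cat(sofa)"), mimicking MetaShift's context-based subsets."""
    groups = defaultdict(list)
    for feats, context in images:
        groups[context].append(feats)
    return {c: np.stack(v) for c, v in groups.items()}

def subset_distance(a, b):
    """Illustrative shift score: Euclidean distance between the mean
    feature vectors of two subsets (not the paper's exact metric)."""
    return float(np.linalg.norm(a.mean(axis=0) - b.mean(axis=0)))

# Toy data: (feature_vector, context_tag) pairs with made-up features.
images = [
    (np.array([1.0, 0.0, 0.0, 0.0]), "cat(sofa)"),
    (np.array([0.9, 0.1, 0.0, 0.0]), "cat(sofa)"),
    (np.array([0.0, 1.0, 0.0, 0.0]), "cat(car)"),
    (np.array([0.1, 0.9, 0.0, 0.0]), "cat(car)"),
]
subsets = group_by_context(images)
d = subset_distance(subsets["cat(sofa)"], subsets["cat(car)"])
```

With real data, the feature vectors would come from a pretrained image encoder; the grouping step is the part that corresponds to MetaShift's metadata-driven construction.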

