BREEDS: Benchmarks for Subpopulation Shift

by Shibani Santurkar, et al.

We develop a methodology for assessing the robustness of models to subpopulation shift—specifically, their ability to generalize to novel data subpopulations that were not observed during training. Our approach leverages the class structure underlying existing datasets to control the data subpopulations that comprise the training and test distributions. This enables us to synthesize realistic distribution shifts whose sources can be precisely controlled and characterized, within existing large-scale datasets. Applying this methodology to the ImageNet dataset, we create a suite of subpopulation shift benchmarks of varying granularity. We then validate that the corresponding shifts are tractable by obtaining human baselines for them. Finally, we utilize these benchmarks to measure the sensitivity of standard model architectures as well as the effectiveness of off-the-shelf train-time robustness interventions. Code and data available at .
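The core idea of controlling which subpopulations appear in training versus testing can be illustrated with a minimal sketch. It assumes a toy two-level hierarchy in which each superclass (the label the model predicts) groups several subclasses (the subpopulations); the subclasses of each superclass are split into disjoint source (train) and target (test) sets. The function name `make_subpopulation_split`, the toy class names, and the random half-and-half split are illustrative assumptions, not the authors' implementation or the actual ImageNet hierarchy.

```python
import random


def make_subpopulation_split(hierarchy, seed=0):
    """Split each superclass's subclasses into disjoint source/target sets.

    hierarchy: dict mapping superclass name -> list of subclass names.
    Returns (source, target), two dicts with the same superclass keys.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    source, target = {}, {}
    for superclass, subclasses in hierarchy.items():
        shuffled = sorted(subclasses)  # sort first so the split is reproducible
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        source[superclass] = shuffled[:half]  # subpopulations seen in training
        target[superclass] = shuffled[half:]  # novel subpopulations at test time
    return source, target


# Toy hierarchy (assumed for illustration, not taken from ImageNet).
toy_hierarchy = {
    "dog": ["beagle", "poodle", "husky", "corgi"],
    "cat": ["tabby", "siamese", "persian", "sphynx"],
}

source, target = make_subpopulation_split(toy_hierarchy)
for superclass in toy_hierarchy:
    print(superclass, "train on:", source[superclass], "test on:", target[superclass])
```

A model trained on the source subclasses is then evaluated on the target subclasses using the same superclass labels, so any drop in accuracy can be attributed to the change in subpopulations rather than to a change in the labeling task.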








Code Repositories


A library for experimenting with, training and evaluating neural networks, with a focus on adversarial robustness.
