Change is Hard: A Closer Look at Subpopulation Shift

02/23/2023
by   Yuzhe Yang, et al.

Machine learning models often perform poorly on subgroups that are underrepresented in the training data. Yet little is understood about the variation in the mechanisms that cause subpopulation shifts, or about how algorithms generalize across such diverse shifts at scale. In this work, we provide a fine-grained analysis of subpopulation shift. We first propose a unified framework that dissects and explains common shifts in subgroups. We then establish a comprehensive benchmark of 20 state-of-the-art algorithms evaluated on 12 real-world datasets in the vision, language, and healthcare domains. With results obtained from training over 10,000 models, we reveal intriguing observations for future progress in this space. First, existing algorithms improve subgroup robustness only over certain types of shifts and not others. Moreover, while current algorithms rely on group-annotated validation data for model selection, we find that a simple selection criterion based on worst-class accuracy is surprisingly effective even without any group information. Finally, unlike existing works that solely aim to improve worst-group accuracy (WGA), we demonstrate a fundamental tradeoff between WGA and other important metrics, highlighting the need to choose testing metrics carefully. Code and data are available at: https://github.com/YyzHarry/SubpopBench.
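The two selection criteria contrasted in the abstract can be illustrated with a minimal sketch. It assumes that a group is a (class label, attribute) pair, which the abstract does not state; the helper names and the toy data are hypothetical and are not taken from the SubpopBench code. Worst-group accuracy needs attribute annotations, whereas worst-class accuracy needs only the class labels.

```python
from collections import defaultdict


def per_key_accuracy(preds, labels, keys):
    """Accuracy of preds against labels, computed separately per key value."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for p, y, k in zip(preds, labels, keys):
        total[k] += 1
        correct[k] += int(p == y)
    return {k: correct[k] / total[k] for k in total}


def worst_group_accuracy(preds, labels, attrs):
    # Assumption: a group is a (class label, attribute) pair,
    # so this metric requires attribute annotations.
    groups = list(zip(labels, attrs))
    return min(per_key_accuracy(preds, labels, groups).values())


def worst_class_accuracy(preds, labels):
    # Uses class labels only; no group or attribute information needed.
    return min(per_key_accuracy(preds, labels, labels).values())


# Hypothetical toy validation set: 8 examples, 2 classes, 1 binary attribute.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
attrs = [0, 0, 0, 1, 0, 1, 1, 1]
preds = [0, 0, 0, 1, 1, 1, 1, 0]

wga = worst_group_accuracy(preds, labels, attrs)
wca = worst_class_accuracy(preds, labels)
print(f"worst-group accuracy: {wga:.2f}")
print(f"worst-class accuracy: {wca:.2f}")
```

On this toy data the two metrics disagree: the single example in group (class 0, attribute 1) is misclassified, which drives worst-group accuracy to zero while both classes keep the same accuracy. The example shows only how the metrics are computed; it says nothing about how well either one works for model selection.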


