An Investigation of Why Overparameterization Exacerbates Spurious Correlations

05/09/2020
by Shiori Sagawa, et al.

We study why overparameterization – increasing model size well beyond the point of zero training error – can hurt test error on minority groups despite improving average test error when there are spurious correlations in the data. Through simulations and experiments on two image datasets, we identify two key properties of the training data that drive this behavior: the proportions of majority versus minority groups, and the signal-to-noise ratio of the spurious correlations. We then analyze a linear setting and show theoretically how the inductive bias of models towards "memorizing" fewer examples can cause overparameterization to hurt. Our analysis leads to a counterintuitive approach of subsampling the majority group, which empirically achieves low minority error in the overparameterized regime, even though the standard approach of upweighting the minority fails. Overall, our results suggest a tension between using overparameterized models and using all the training data for achieving low worst-group error.
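To make the contrast between subsampling the majority group and upweighting the minority concrete, here is a minimal sketch of how the two data treatments differ. It is an illustration under assumptions, not the authors' implementation: the function names `subsample_to_smallest_group` and `upweight_by_group`, the integer group ids, and the choice to equalize every group to the size of the smallest one are all hypothetical.

```python
import numpy as np


def subsample_to_smallest_group(groups, rng):
    """Return sorted indices that keep the same number of examples per group.

    Assumption: every group is cut down to the size of the smallest group,
    so majority examples are discarded rather than reweighted.
    """
    groups = np.asarray(groups)
    labels, counts = np.unique(groups, return_counts=True)
    n_keep = counts.min()
    kept = [
        rng.choice(np.flatnonzero(groups == g), size=n_keep, replace=False)
        for g in labels
    ]
    return np.sort(np.concatenate(kept))


def upweight_by_group(groups):
    """Return per-example loss weights inversely proportional to group size.

    All examples are kept; the weights are scaled so their mean is 1.
    """
    groups = np.asarray(groups)
    labels, inverse, counts = np.unique(
        groups, return_inverse=True, return_counts=True
    )
    return len(groups) / (len(labels) * counts[inverse])


# Toy data (hypothetical): 90 majority examples (group 0), 10 minority (group 1).
rng = np.random.default_rng(0)
groups = np.array([0] * 90 + [1] * 10)

idx = subsample_to_smallest_group(groups, rng)  # 10 examples from each group
weights = upweight_by_group(groups)             # all 100 examples, reweighted
```

Both treatments balance the groups' total contribution to the training objective; the difference is that subsampling shrinks the training set, while upweighting keeps every example. That is the trade-off the abstract describes as a tension with using all the training data.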


11/20/2019

Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization

Overparameterized neural networks can be highly accurate on average on a...
04/14/2020

Contrastive Examples for Addressing the Tyranny of the Majority

Computer vision algorithms, e.g. for face recognition, favour groups of ...
07/19/2021

Just Train Twice: Improving Group Robustness without Training Group Information

Standard training via empirical risk minimization (ERM) can produce mode...
01/10/2022

Towards Group Robustness in the presence of Partial Group Labels

Learning invariant representations is an important requirement when trai...
10/06/2021

Focus on the Common Good: Group Distributional Robustness Follows

We consider the problem of training a classification model with group an...
11/08/2021

Statistical properties of large data sets with linear latent features

Analytical understanding of how low-dimensional latent features reveal t...
09/13/2019

Analysis of Solitaire

The Solitaire cipher was designed by Bruce Schneier as a plot point in t...