Is Importance Weighting Incompatible with Interpolating Classifiers?

12/24/2021
by Ke Alexander Wang, et al.

Importance weighting is a classic technique to handle distribution shifts. However, prior work has presented strong empirical and theoretical evidence demonstrating that importance weights can have little to no effect on overparameterized neural networks. Is importance weighting truly incompatible with the training of overparameterized neural networks? Our paper answers this question in the negative. We show that importance weighting fails not because of the overparameterization, but instead as a result of using exponentially-tailed losses like the logistic or cross-entropy loss. As a remedy, we show that polynomially-tailed losses restore the effects of importance reweighting in correcting distribution shift in overparameterized models. We characterize the behavior of gradient descent on importance-weighted polynomially-tailed losses with overparameterized linear models, and theoretically demonstrate the advantage of using polynomially-tailed losses in a label shift setting. Surprisingly, our theory shows that using weights obtained by exponentiating the classical unbiased importance weights can improve performance. Finally, we demonstrate the practical value of our analysis with neural network experiments on a subpopulation shift and a label shift dataset. When reweighted, our loss function can outperform reweighted cross-entropy by as much as 9% in test accuracy, giving results comparable to, or even exceeding, well-tuned state-of-the-art methods for correcting distribution shifts.
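To make the reweighting scheme described in the abstract concrete, here is a minimal PyTorch sketch of an importance-weighted margin loss with a polynomial tail. The specific functional form, the `alpha` and `beta` parameters, and the toy weights in the usage snippet are illustrative assumptions for this sketch, not the paper's exact definitions.

```python
import torch
import torch.nn.functional as F

def poly_tailed_loss(margins, alpha=1.0, beta=0.0):
    """Illustrative polynomially-tailed margin loss (assumed form, not the
    paper's exact definition): matches the logistic loss for margins below
    beta, but decays only polynomially (~ 1/margin^alpha) above beta, so
    correctly classified points keep non-negligible gradients and importance
    weights retain their effect."""
    logistic = F.softplus(-margins)               # log(1 + exp(-z)), exponential tail
    switch = F.softplus(torch.tensor(-beta))      # loss value at the switch point z = beta
    tail = switch / (1.0 + (margins - beta).clamp(min=0.0)) ** alpha
    return torch.where(margins < beta, logistic, tail)

def importance_weighted_risk(scores, labels, weights, alpha=1.0, beta=0.0):
    """Importance-weighted empirical risk with the polynomially-tailed loss.
    labels are in {-1, +1}; weights are per-example importance weights,
    e.g. target/source class frequencies under label shift."""
    margins = labels * scores
    return (weights * poly_tailed_loss(margins, alpha, beta)).mean()

# Toy usage: upweight the rare class, and optionally exponentiate the
# unbiased weights (weights ** k), as the abstract suggests can help.
scores = torch.randn(8, requires_grad=True)
labels = torch.tensor([1., 1., 1., 1., 1., 1., -1., -1.])
weights = torch.where(labels > 0, torch.tensor(1.0), torch.tensor(3.0))
risk = importance_weighted_risk(scores, labels, weights ** 2)
risk.backward()
```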


research · 06/19/2020
Gradient descent follows the regularization path for general losses
Recent work across many machine learning disciplines has highlighted tha...

research · 12/08/2018
Weighted Risk Minimization & Deep Learning
Importance weighting is a key ingredient in many algorithms for causal i...

research · 09/19/2022
Importance Tempering: Group Robustness for Overparameterized Models
Although overparameterized models have shown their success on many machi...

research · 04/09/2023
Reweighted Mixup for Subpopulation Shift
Subpopulation shift exists widely in many real-world applications, which...

research · 09/19/2022
UMIX: Improving Importance Weighting for Subpopulation Shift via Uncertainty-Aware Mixup
Subpopulation shift widely exists in many real-world machine learning ap...

research · 02/06/2023
APAM: Adaptive Pre-training and Adaptive Meta Learning in Language Model for Noisy Labels and Long-tailed Learning
Practical natural language processing (NLP) tasks are commonly long-tail...

research · 03/28/2021
Understanding the role of importance weighting for deep learning
The recent paper by Byrd & Lipton (2019), based on empirical observati...
