Distributionally Robust Models with Parametric Likelihood Ratios

04/13/2022
by Paul Michel, et al.

As machine learning models are deployed ever more broadly, it becomes increasingly important that they not only perform well on their training distribution, but also yield accurate predictions under distribution shift. The Distributionally Robust Optimization (DRO) framework addresses this by training models to minimize their expected risk under a collection of distributions that imitate possible test-time shifts. This is most commonly achieved by re-weighting the training objective at the instance level to emulate the likelihood ratio with respect to possible test distributions, which allows their empirical risk to be estimated via importance sampling (assuming that they are subpopulations of the training distribution). However, re-weighting schemes in the literature are usually limited by the difficulty of keeping the optimization problem tractable and the complexity of enforcing normalization constraints. In this paper, we show that three simple ideas (mini-batch level normalization, a KL penalty, and simultaneous gradient updates) allow us to train models with DRO using a broader class of parametric likelihood ratios. In a series of experiments on both image and text classification benchmarks, we find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts than models trained with other DRO approaches, and that the method performs reliably well with little hyper-parameter tuning. Code to reproduce our experiments can be found at https://github.com/pmichel31415/P-DRO.
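To make the three ingredients named above concrete, here is a minimal, self-contained PyTorch sketch of how a parametric adversary could be trained jointly with the task model: the adversary's unnormalized log-weights are normalized with a softmax over each mini-batch, a batch-level KL estimate penalizes the adversary for drifting too far from the training distribution, and both players are updated simultaneously from the same batch. This is an illustrative sketch under simplifying assumptions, not the authors' implementation (see the linked repository for that): the linear model and adversary, the direct log-ratio parameterization over input features, the kl_coef value, and the synthetic data are all hypothetical choices made for brevity.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
num_features, num_classes = 20, 2

# Task model (the learner) and a small adversary that assigns an unnormalized
# log-weight to each example; both are illustrative stand-ins.
model = nn.Linear(num_features, num_classes)
adversary = nn.Linear(num_features, 1)

opt_model = torch.optim.SGD(model.parameters(), lr=1e-2)
opt_adv = torch.optim.SGD(adversary.parameters(), lr=1e-2)

kl_coef = 1.0  # assumed strength of the KL penalty tying the adversary to the training distribution


def train_step(x, y):
    logits = model(x)
    losses = F.cross_entropy(logits, y, reduction="none")  # per-example losses

    # Mini-batch level normalization: softmax over the batch turns the
    # adversary's log-weights into weights that sum to 1 (self-normalized
    # importance sampling), so no global normalization constant is needed.
    log_w = adversary(x).squeeze(-1)
    weights = torch.softmax(log_w, dim=0)

    # Re-weighted (adversarial) empirical risk on this batch.
    weighted_loss = (weights * losses).sum()

    # Batch-level KL estimate: with uniform p = 1/B over the batch,
    # KL(q || p) ~= sum_i w_i * (log w_i + log B).
    batch_size = x.shape[0]
    kl = (weights * (weights.clamp_min(1e-12).log() + math.log(batch_size))).sum()

    # Simultaneous gradient updates: the model descends the weighted loss
    # while the adversary ascends it minus the KL penalty, both computed
    # from the same forward pass.
    model_loss = weighted_loss
    adv_loss = -(weighted_loss - kl_coef * kl)

    opt_model.zero_grad()
    opt_adv.zero_grad()
    model_loss.backward(retain_graph=True, inputs=list(model.parameters()))
    adv_loss.backward(inputs=list(adversary.parameters()))
    opt_model.step()
    opt_adv.step()
    return weighted_loss.item()


# Toy usage on one synthetic batch.
x = torch.randn(32, num_features)
y = torch.randint(0, num_classes, (32,))
for step in range(5):
    print(train_step(x, y))
```

In the paper's setting the weights typically come from the likelihood ratio of a separate parametric model of the data against the training distribution; the sketch above collapses this into a single network that outputs the log-ratio directly, which is equivalent up to an additive constant that the batch-level softmax absorbs.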

Related research

07/06/2020 · Adaptive Risk Minimization: A Meta-Learning Approach for Tackling Group Shift
A fundamental assumption of most machine learning algorithms is that the...

06/06/2023 · On Pitfalls of Test-Time Adaptation
Test-Time Adaptation (TTA) has recently emerged as a promising approach ...

10/22/2021 · Learning Proposals for Practical Energy-Based Regression
Energy-based models (EBMs) have experienced a resurgence within machine ...

09/20/2023 · Dr. FERMI: A Stochastic Distributionally Robust Fair Empirical Risk Minimization Framework
While training fair machine learning models has been studied extensively...

07/20/2021 · Characterizing Generalization under Out-Of-Distribution Shifts in Deep Metric Learning
Deep Metric Learning (DML) aims to find representations suitable for zer...

07/01/2021 · Mandoline: Model Evaluation under Distribution Shift
Machine learning models are often deployed in different settings than th...

01/28/2022 · Understanding Why Generalized Reweighting Does Not Improve Over ERM
Empirical risk minimization (ERM) is known in practice to be non-robust ...
