DORO: Distributional and Outlier Robust Optimization

06/11/2021
by Runtian Zhai, et al.

Many machine learning tasks involve subpopulation shift, where the test distribution is a subpopulation of the training distribution. For such settings, a line of recent work has proposed a variant of empirical risk minimization (ERM) known as distributionally robust optimization (DRO). In this work, we apply DRO to real, large-scale tasks with subpopulation shift and observe that DRO performs relatively poorly and is, moreover, severely unstable. We identify one direct cause of this phenomenon: the sensitivity of DRO to outliers in the datasets. To resolve this issue, we propose the framework of DORO, for Distributional and Outlier Robust Optimization. At the core of this approach is a refined risk function that prevents DRO from overfitting to potential outliers. We instantiate DORO for the Cressie-Read family of Rényi divergence and examine two specific instances of this family: CVaR and χ^2-DRO. We theoretically prove the effectiveness of the proposed method, and empirically show, with experiments on large modern datasets, that DORO improves the performance and stability of DRO, thereby positively addressing the open question raised by Hashimoto et al. (2018).
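To make the idea of a refined risk concrete, the sketch below contrasts an empirical CVaR loss with an outlier-trimmed variant on a batch of per-sample losses. It is an illustration only, not the authors' implementation: the function names `cvar_loss` and `cvar_doro_loss`, the parameters `alpha` (fraction of worst-case samples) and `eps` (assumed fraction of outliers), and the simple sort-and-trim rule are all assumptions made here for exposition.

```python
import numpy as np


def cvar_loss(losses, alpha):
    """Empirical CVaR: mean of the worst alpha-fraction of per-sample losses.

    Hypothetical helper for illustration; uses a discrete top-k approximation.
    """
    ordered = np.sort(np.asarray(losses, dtype=float))[::-1]  # descending
    k = max(1, int(np.ceil(alpha * len(ordered))))  # at least one sample
    return ordered[:k].mean()


def cvar_doro_loss(losses, alpha, eps):
    """Outlier-trimmed CVaR (assumed form, for illustration).

    Discards the largest eps-fraction of losses as suspected outliers,
    then applies the empirical CVaR to the remaining samples.
    """
    ordered = np.sort(np.asarray(losses, dtype=float))[::-1]  # descending
    n_drop = int(eps * len(ordered))  # number of suspected outliers
    return cvar_loss(ordered[n_drop:], alpha)


# One extreme loss (100.0) plays the role of an outlier.
batch = [1, 2, 3, 4, 100, 6, 7, 8, 9, 10]
plain = cvar_loss(batch, alpha=0.2)
trimmed = cvar_doro_loss(batch, alpha=0.2, eps=0.1)
```

On this toy batch the plain CVaR value is dominated by the single extreme loss, while the trimmed variant focuses on the hardest of the remaining samples, which is the behavior the abstract describes as preventing DRO from overfitting to potential outliers.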


research
10/02/2021

Functional outlier detection for density-valued data with application to robustify distribution to distribution regression

Distributional data analysis, concerned with statistical analysis and mo...
research
01/28/2022

Understanding Why Generalized Reweighting Does Not Improve Over ERM

Empirical risk minimization (ERM) is known in practice to be non-robust ...
research
07/17/2021

BEDS-Bench: Behavior of EHR-models under Distributional Shift–A Benchmark

Machine learning has recently demonstrated impressive progress in predic...
research
06/30/2021

Robust Coreset for Continuous-and-Bounded Learning (with Outliers)

In this big data era, we often confront large-scale data in many machine...
research
08/17/2023

Environment Diversification with Multi-head Neural Network for Invariant Learning

Neural networks are often trained with empirical risk minimization; howe...
research
11/19/2015

Robust Classification by Pre-conditioned LASSO and Transductive Diffusion Component Analysis

Modern machine learning-based recognition approaches require large-scale...
research
12/08/2022

DC-MBR: Distributional Cooling for Minimum Bayesian Risk Decoding

Minimum Bayesian Risk Decoding (MBR) emerges as a promising decoding alg...
