Distributionally Robust Losses for Latent Covariate Mixtures

07/28/2020
by John Duchi et al.

While modern large-scale datasets often consist of heterogeneous subpopulations—for example, multiple demographic groups or multiple text corpora—the standard practice of minimizing average loss fails to guarantee uniformly low losses across all subpopulations. We propose a convex procedure that controls the worst-case performance over all subpopulations of a given size. Our procedure comes with finite-sample (nonparametric) convergence guarantees on the worst-off subpopulation. Empirically, we observe on lexical similarity, wine quality, and recidivism prediction tasks that our worst-case procedure learns models that do well against unseen subpopulations.
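The abstract does not spell out the objective, so the sketch below is only a rough, assumed illustration of "controlling the worst-case performance over all subpopulations of a given size": it computes a CVaR-style quantity, the average loss over the worst-off alpha-fraction of examples. The function name and the simulated losses are hypothetical, and this simplification treats subpopulations of the joint distribution; it is not the paper's procedure for subpopulations defined through latent covariates.

```python
import numpy as np

def worst_alpha_fraction_loss(losses, alpha):
    """Average loss over the worst-off ceil(alpha * n) examples.

    A CVaR-style proxy for the worst-case risk over any subpopulation
    making up at least an alpha share of the data. Illustrative
    simplification only, not the paper's latent covariate-mixture method.
    """
    losses = np.asarray(losses, dtype=float)
    k = max(1, int(np.ceil(alpha * losses.size)))  # size of the worst-off group
    return np.sort(losses)[-k:].mean()             # mean of the k largest losses

# Hypothetical per-example losses; the heavy right tail mimics a badly served subgroup.
rng = np.random.default_rng(0)
losses = rng.exponential(size=1000)
print("average loss:           ", losses.mean())
print("worst 10%-subgroup loss:", worst_alpha_fraction_loss(losses, alpha=0.10))
```

Minimizing a quantity of this kind during training pushes a model toward uniformly low losses rather than low average loss; per the abstract, the paper's contribution is a convex procedure with finite-sample guarantees when the worst-off subpopulation is determined by the covariates and is never observed directly.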
