Better scalability under potentially heavy-tailed feedback

12/14/2020
by Matthew J. Holland, et al.

We study scalable alternatives to robust gradient descent (RGD) techniques that can be used when the losses and/or gradients can be heavy-tailed, though this will be unknown to the learner. The core technique is simple: instead of trying to robustly aggregate gradients at each step, which is costly and leads to sub-optimal dimension dependence in risk bounds, we focus computational effort on robustly choosing (or newly constructing) a strong candidate from a collection of cheap stochastic sub-processes that can be run in parallel. The exact selection process depends on the convexity of the underlying objective, but in all cases our selection technique amounts to a robust form of boosting the confidence of weak learners. In addition to formal guarantees, we provide an empirical analysis of robustness to perturbations of experimental conditions, under both sub-Gaussian and heavy-tailed data, along with applications to a variety of benchmark datasets. The overall take-away is an extensible procedure that is simple to implement and trivial to parallelize, and that keeps the formal merits of RGD methods while scaling much better to large learning problems.
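To make the divide-and-conquer idea concrete, below is a minimal sketch in Python (NumPy): several cheap SGD sub-processes are run on disjoint splits of the data, and a single strong candidate is then chosen by a distance-based rule, keeping the candidate whose median distance to the others is smallest. This is one standard instance of the "boosting the confidence" idea described above, not necessarily the exact selection rule analyzed in the paper; the least-squares objective, the Student-t noise, and names such as robust_select are illustrative assumptions.

import numpy as np

def sgd_least_squares(X, y, lr=0.01, epochs=5, rng=None):
    # One cheap sub-process: plain SGD on the squared loss.
    rng = np.random.default_rng() if rng is None else rng
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            w -= lr * (X[i] @ w - y[i]) * X[i]
    return w

def robust_select(candidates):
    # Keep the candidate whose median distance to the others is
    # smallest: a majority of sound candidates cluster together, so
    # an outlying (failed) sub-process is never the one selected.
    C = np.stack(candidates)
    dists = np.linalg.norm(C[:, None, :] - C[None, :, :], axis=-1)
    return C[np.argmin(np.median(dists, axis=1))]

# Usage: k independent sub-processes on disjoint splits, then select.
rng = np.random.default_rng(0)
n, d, k = 3000, 5, 10
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + rng.standard_t(df=2, size=n)  # heavy-tailed noise
splits = np.array_split(rng.permutation(n), k)
candidates = [sgd_least_squares(X[s], y[s], rng=rng) for s in splits]
w_hat = robust_select(candidates)
print(np.linalg.norm(w_hat - w_true))

Because each sub-process touches only its own data split, the k runs are embarrassingly parallel; only the final selection step needs the candidates gathered in one place, which is what makes this scheme cheaper than robustly aggregating gradients at every iteration.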

Related research

- 06/01/2020: Better scalability under potentially heavy-tailed gradients. "We study a scalable alternative to robust gradient descent (RGD) techniq..."
- 06/02/2020: Improved scalability under heavy tails, without strong convexity. "Real-world data is laden with outlying values. The challenge for machine..."
- 06/03/2020: Learning with CVaR-based feedback under potentially heavy tails. "We study learning algorithms that seek to minimize the conditional value..."
- 10/15/2018: Robust descent using smoothed multiplicative noise. "To improve the off-sample generalization of classical procedures minimiz..."
- 06/17/2020: Nearly Optimal Robust Method for Convex Compositional Problems with Heavy-Tailed Noise. "In this paper, we propose robust stochastic algorithms for solving conve..."
- 09/07/2023: Empirical Risk Minimization for Losses without Variance. "This paper considers an empirical risk minimization problem under heavy-..."
- 06/01/2017: Efficient learning with robust gradient descent. "Minimizing the empirical risk is a popular training strategy, but for le..."
