Efficient learning with robust gradient descent

06/01/2017
by Matthew J. Holland, et al.

Minimizing the empirical risk is a popular training strategy, but for learning tasks where the data may be noisy or heavy-tailed, many observations may be required in order to generalize well. To achieve better performance under less stringent requirements, we introduce a procedure which constructs a robust approximation of the risk gradient for use in an iterative learning routine. We provide high-probability bounds on the excess risk of this algorithm by showing that it does not deviate far from the ideal gradient-based update. Empirical tests show that in diverse settings, the proposed procedure can learn more efficiently, using fewer resources (iterations and observations) while generalizing better.
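The idea of replacing the empirical mean gradient with a robust estimate can be sketched as follows. This is an illustrative stand-in, not the paper's actual construction: here a coordinate-wise median-of-means estimator plays the role of the robust gradient approximation, and the learner is plain least-squares linear regression; the block count `k` and step size are arbitrary choices for the sketch.

```python
import numpy as np

def mom_gradient(grads, k):
    """Median-of-means estimate of the mean gradient.

    Splits the n per-example gradients into k random blocks, averages
    within each block, then takes the coordinate-wise median of the
    block means. This is robust to a few heavy-tailed outliers.
    """
    idx = np.random.permutation(grads.shape[0])
    blocks = np.array_split(grads[idx], k)
    block_means = np.array([b.mean(axis=0) for b in blocks])
    return np.median(block_means, axis=0)

def robust_gd(X, y, steps=200, lr=0.1, k=5):
    """Linear least squares trained with a robust gradient update."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        residuals = X @ w - y            # shape (n,)
        grads = residuals[:, None] * X   # per-example gradients, shape (n, d)
        w -= lr * mom_gradient(grads, k) # robust update instead of mean gradient
    return w
```

Under heavy-tailed label noise (e.g. Student-t residuals), the median-of-means update discards the influence of extreme per-example gradients that would otherwise dominate the empirical mean, which is the intuition behind the robust update studied in the paper.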

Related research

- Robust descent using smoothed multiplicative noise (10/15/2018): To improve the off-sample generalization of classical procedures minimiz...
- Empirical Risk Minimization for Losses without Variance (09/07/2023): This paper considers an empirical risk minimization problem under heavy-...
- Taylor Learning (05/24/2023): Empirical risk minimization stands behind most optimization in supervise...
- Better scalability under potentially heavy-tailed gradients (06/01/2020): We study a scalable alternative to robust gradient descent (RGD) techniq...
- Robust supervised learning with coordinate gradient descent (01/31/2022): This paper considers the problem of supervised learning with linear meth...
- Improved scalability under heavy tails, without strong convexity (06/02/2020): Real-world data is laden with outlying values. The challenge for machine...
- Better scalability under potentially heavy-tailed feedback (12/14/2020): We study scalable alternatives to robust gradient descent (RGD) techniqu...
