Learning with Non-Convex Truncated Losses by SGD

05/21/2018
by Yi Xu, et al.

Learning with a convex loss function has been the dominant paradigm for many years. It remains an interesting question how non-convex loss functions can improve the generalization of learning with broad applicability. In this paper, we study a family of objective functions formed by truncating traditional loss functions, which is applicable to both shallow learning and deep learning. Truncated losses are potentially less vulnerable, and more robust, to large noise in observations that could be adversarial. More importantly, truncation is a generic technique that does not assume knowledge of the noise distribution. To justify non-convex learning with truncated losses, we establish excess risk bounds of empirical risk minimization with truncated losses for heavy-tailed output, as well as the statistical error of an approximate stationary point found by the stochastic gradient descent (SGD) method. Our experiments on shallow and deep learning for regression with outliers, corrupted data, and heavy-tailed noise further justify the proposed method.
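To make the idea concrete, below is a minimal sketch of SGD on a truncated squared loss. The specific truncation phi_alpha(s) = alpha * log(1 + s/alpha) and all hyperparameters here are illustrative assumptions, not necessarily the paper's exact formulation; the point is that the derivative phi'_alpha(s) = 1/(1 + s/alpha) damps the gradient contribution of examples with large loss, such as outliers.

import numpy as np

def truncated_sq_loss_grad(w, x, y, alpha):
    """Gradient of phi_alpha(l(w)) with l(w) = 0.5 * (w @ x - y)**2.

    The truncation phi_alpha(s) = alpha * log(1 + s / alpha) is an
    illustrative choice; the paper's truncation function may differ.
    """
    residual = w @ x - y
    loss = 0.5 * residual ** 2
    # Chain rule: phi'(l) = 1 / (1 + l / alpha) shrinks toward 0 as the
    # per-example loss grows, so outliers contribute bounded gradients.
    damp = 1.0 / (1.0 + loss / alpha)
    return damp * residual * x

def sgd(X, y, alpha=1.0, lr=0.01, epochs=20, seed=0):
    """Plain SGD on the truncated loss, one example at a time."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            w -= lr * truncated_sq_loss_grad(w, X[i], y[i], alpha)
    return w

# Toy regression with heavy-tailed noise (Student-t, df=2).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.standard_t(df=2, size=200)
w_hat = sgd(X, y)
print("parameter error:", np.linalg.norm(w_hat - w_true))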


