General Loss Functions Lead to (Approximate) Interpolation in High Dimensions

03/13/2023
by Kuo-Wei Lai, et al.

We provide a unified framework, applicable to a general family of convex losses across binary and multiclass settings in the overparameterized regime, for approximately characterizing the implicit bias of gradient descent in closed form. Specifically, we show that the implicit bias is approximately (but not exactly) equal to the minimum-norm interpolation in high dimensions, i.e., the solution that arises from training on the squared loss. In contrast to prior work, which was tailored to exponentially-tailed losses and relied on an intermediate support-vector-machine formulation, our framework builds directly on the primal-dual analysis of Ji and Telgarsky (2021), allowing us to provide new approximate equivalences for general convex losses through a novel sensitivity analysis. Our framework also recovers existing exact equivalence results for exponentially-tailed losses in both binary and multiclass settings. Finally, we provide evidence for the tightness of our techniques, which we use to demonstrate the effect of certain loss functions designed for out-of-distribution problems on the closed-form solution.
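The central claim, that gradient descent on a convex classification loss approximately recovers the minimum-norm interpolant in high dimensions, can be illustrated numerically. The numpy sketch below is an illustration under assumed settings, not the paper's construction: Gaussian features, plus/minus-one labels, the logistic loss, and the step size and iteration count are all arbitrary choices. It trains a linear model by gradient descent and compares the resulting direction with the minimum-norm interpolator of the labels (the pseudoinverse solution of the squared loss).

import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 2000                       # n samples, d >> n features (overparameterized)
X = rng.standard_normal((n, d)) / np.sqrt(d)
y = rng.choice([-1.0, 1.0], size=n)   # arbitrary +-1 labels; interpolation is feasible

# Gradient descent on the (unregularized) logistic loss for a linear predictor.
w = np.zeros(d)
lr = 1.0
for _ in range(10000):
    margins = y * (X @ w)
    sig = 1.0 / (1.0 + np.exp(np.clip(margins, -50, 50)))  # sigmoid(-margin), clipped for stability
    grad = -(X.T @ (y * sig)) / n
    w -= lr * grad

# Minimum-norm interpolator of the labels (the squared-loss solution X^+ y).
w_mni = np.linalg.pinv(X) @ y

cos = (w @ w_mni) / (np.linalg.norm(w) * np.linalg.norm(w_mni))
print(f"cosine similarity between GD direction and min-norm interpolator: {cos:.3f}")

In this regime the printed cosine similarity is typically close to 1, consistent with the approximate (but not exact) equivalence described in the abstract.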


Related research:

06/19/2020 - Gradient descent follows the regularization path for general losses
Recent work across many machine learning disciplines has highlighted tha...

05/24/2018 - Learning Classifiers with Fenchel-Young Losses: Generalized Entropies, Margins, and Algorithms
We study in this paper Fenchel-Young losses, a generic way to construct ...

05/21/2018 - Learning with Non-Convex Truncated Losses by SGD
Learning with a convex loss function has been a dominating paradigm for...

05/22/2023 - Fast Convergence in Learning Two-Layer Neural Networks with Separable Data
Normalized gradient descent has shown substantial success in speeding up...

09/01/2022 - The Geometry and Calculus of Losses
Statistical decision problems are the foundation of statistical machine ...

06/07/2015 - Primal Method for ERM with Flexible Mini-batching Schemes and Non-convex Losses
In this work we develop a new algorithm for regularized empirical risk m...

05/24/2017 - Learning with Average Top-k Loss
In this work, we introduce the average top-k (AT_k) loss as a new ensemb...
