Characterizing Implicit Bias in Terms of Optimization Geometry

02/22/2018
by Suriya Gunasekar, et al.

We study the implicit bias of generic optimization methods, including Mirror Descent, Natural Gradient Descent, and Steepest Descent with respect to different potentials and norms, when optimizing underdetermined linear regression or separable linear classification problems. We ask whether the global minimum (among the many possible global minima) reached by an optimization algorithm can be characterized in terms of its potential or norm, independently of hyperparameter choices such as step size and momentum.
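Below is a minimal numerical sketch of the question the abstract poses (illustrative code, not taken from the paper): on an underdetermined least-squares problem, gradient descent started at zero converges to the minimum-l2-norm interpolating solution, while mirror descent with a quadratic potential psi(w) = 0.5 w^T D w converges to the minimum-D-norm interpolator predicted by its geometry alone. The problem sizes, step sizes, and the diagonal matrix D are arbitrary choices made here for illustration.

# Illustrative sketch of implicit bias on underdetermined linear regression.
# Not the paper's code; dimensions, step sizes, and D are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 50                          # n < d: many solutions of X w = y
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

def grad(w):
    # Gradient of the squared loss 0.5 * ||X w - y||^2
    return X.T @ (X @ w - y)

# Gradient descent (steepest descent w.r.t. the l2 norm), started at zero.
w_gd = np.zeros(d)
for _ in range(50_000):
    w_gd -= 5e-3 * grad(w_gd)

# Mirror descent with quadratic potential psi(w) = 0.5 * w^T D w, started at
# argmin psi = 0. With this potential the update is preconditioned gradient descent.
D = np.diag(rng.uniform(0.5, 5.0, size=d))
D_inv = np.linalg.inv(D)
w_md = np.zeros(d)
for _ in range(50_000):
    w_md -= 2e-3 * (D_inv @ grad(w_md))

# The minima predicted by each geometry, independent of the optimization run.
w_l2 = np.linalg.pinv(X) @ y                             # argmin ||w||_2  s.t. Xw = y
w_D = D_inv @ X.T @ np.linalg.solve(X @ D_inv @ X.T, y)  # argmin w^T D w  s.t. Xw = y

print("GD residual:", np.linalg.norm(X @ w_gd - y))
print("MD residual:", np.linalg.norm(X @ w_md - y))
print("GD vs. min-l2 solution:   ", np.linalg.norm(w_gd - w_l2))
print("MD vs. min-D-norm solution:", np.linalg.norm(w_md - w_D))
print("min-l2 vs. min-D-norm:    ", np.linalg.norm(w_l2 - w_D))

Both runs drive the training loss to (numerical) zero, but each converges to the particular interpolator singled out by its own potential; in this quadratic-potential example, shrinking the step size only slows convergence without changing which global minimum is reached.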

Related research

The Implicit Bias of AdaGrad on Separable Data (06/09/2019)
We study the implicit bias of AdaGrad on separable linear classification...

Faster Margin Maximization Rates for Generic Optimization Methods (05/27/2023)
First-order optimization methods tend to inherently favor certain soluti...

A Unifying View on Implicit Bias in Training Linear Neural Networks (10/06/2020)
We study the implicit bias of gradient flow (i.e., gradient descent with...

A Unified Approach to Controlling Implicit Regularization via Mirror Descent (06/24/2023)
Inspired by the remarkable success of deep neural networks, there has be...

Flatter, faster: scaling momentum for optimal speedup of SGD (10/28/2022)
Commonly used optimization algorithms often show a trade-off between goo...

Momentum Doesn't Change the Implicit Bias (10/08/2021)
The momentum acceleration technique is widely adopted in many optimizati...

The Directional Bias Helps Stochastic Gradient Descent to Generalize in Kernel Regression Models (04/29/2022)
We study the Stochastic Gradient Descent (SGD) algorithm in nonparametri...
