The Implicit Regularization of Stochastic Gradient Flow for Least Squares

03/17/2020
by Alnur Ali, et al.

We study the implicit regularization of mini-batch stochastic gradient descent applied to the fundamental problem of least squares regression. We leverage a continuous-time stochastic differential equation with the same moments as stochastic gradient descent, which we call stochastic gradient flow. We give a bound on the excess risk of stochastic gradient flow at time t, relative to ridge regression with tuning parameter λ = 1/t. The bound may be computed from explicit constants (e.g., the mini-batch size, step size, number of iterations), revealing precisely how these quantities drive the excess risk. Numerical examples show the bound can be small, indicating a tight relationship between the two estimators. We give a similar result relating the coefficients of stochastic gradient flow and ridge. These results hold with no conditions on the data matrix X, and across the entire optimization path (not just at convergence).
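The correspondence between time t and the ridge parameter λ = 1/t can be illustrated numerically. The following is a minimal sketch, not the paper's experiment: it runs discrete mini-batch SGD (rather than the stochastic differential equation itself) on synthetic data and compares the iterate at time t = step size × number of iterations with a ridge estimate at λ = 1/t. The loss scaling (squared error averaged over the mini-batch, with a factor of 1/2), the ridge penalty scaling nλ, the synthetic data, and the function names `ridge` and `minibatch_sgd` are all assumptions made for illustration.

```python
import numpy as np


def ridge(X, y, lam):
    """Ridge estimate, assuming the objective (1/n)||y - Xb||^2 + lam * ||b||^2."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T @ y)


def minibatch_sgd(X, y, step, batch_size, n_iters, rng):
    """Mini-batch SGD on the loss (1/(2m))||y_B - X_B b||^2, started at zero."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iters):
        # Draw a mini-batch without replacement.
        idx = rng.choice(n, size=batch_size, replace=False)
        grad = X[idx].T @ (X[idx] @ beta - y[idx]) / batch_size
        beta -= step * grad
    return beta


# Synthetic data (an assumption; not taken from the paper).
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p)
y = X @ beta_true + 0.5 * rng.standard_normal(n)

step, batch_size, n_iters = 0.01, 20, 500
t = step * n_iters  # elapsed "time" of the discrete iteration

beta_sgd = minibatch_sgd(X, y, step, batch_size, n_iters, rng)
beta_ridge = ridge(X, y, lam=1.0 / t)

# Relative distance between the SGD iterate and the matched ridge estimate.
rel_gap = np.linalg.norm(beta_sgd - beta_ridge) / np.linalg.norm(beta_ridge)
print(f"t = {t:.2f}, lambda = {1.0 / t:.3f}, relative gap = {rel_gap:.3f}")
```

Varying `step`, `batch_size`, and `n_iters` in this sketch gives a rough feel for how those constants affect the gap between the two estimators, although it does not compute the bound from the paper.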

research
01/14/2022

The Implicit Regularization of Momentum Gradient Descent with Early Stopping

The study on the implicit regularization induced by gradient-based optim...
research
07/27/2020

Stochastic Gradient Descent applied to Least Squares regularizes in Sobolev spaces

We study the behavior of stochastic gradient descent applied to ‖Ax − b‖_2...
research
04/14/2020

Strategic Investment in Energy Markets: A Multiparametric Programming Approach

An investor has to carefully select the location and size of new generat...
research
10/23/2018

A Continuous-Time View of Early Stopping for Least Squares Regression

We study the statistical properties of the iterates generated by gradien...
research
04/29/2022

Implicit Regularization Properties of Variance Reduced Stochastic Mirror Descent

In machine learning and statistical data analysis, we often run into obj...
research
07/17/2018

Learning with SGD and Random Features

Sketching and stochastic gradient methods are arguably the most common t...
research
10/21/2017

Optimal Rates for Learning with Nyström Stochastic Gradient Methods

In the setting of nonparametric regression, we propose and study a combi...
