Iterate averaging as regularization for stochastic gradient descent

02/22/2018
by Gergely Neu, et al.

We propose and analyze a variant of the classic Polyak-Ruppert averaging scheme, broadly used in stochastic gradient methods. Rather than a uniform average of the iterates, we consider a weighted average, with weights decaying in a geometric fashion. In the context of linear least squares regression, we show that this averaging scheme has the same regularizing effect as, and is indeed asymptotically equivalent to, ridge regression. In particular, we derive finite-sample bounds for the proposed approach that match the best known results for regularized stochastic gradient methods.
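The sketch below illustrates the idea behind the abstract, without reproducing the paper's exact scheme or analysis: it runs constant-step-size SGD on a synthetic least-squares problem and tracks both a uniform Polyak-Ruppert average and a geometrically weighted average of the iterates (implemented as an exponentially weighted running average). The step size `lr`, the decay parameter `decay`, and the data generation are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Minimal sketch (not the authors' exact scheme): SGD for linear least
# squares with two kinds of iterate averaging.
#   - uniform (Polyak-Ruppert): running mean of all iterates
#   - geometric: weights decay geometrically for older iterates,
#     implemented as an exponentially weighted running average
# `lr`, `decay`, and the synthetic data are illustrative choices.

rng = np.random.default_rng(0)
n, d = 2000, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.5 * rng.normal(size=n)

lr = 0.01      # constant step size
decay = 0.01   # geometric decay; loosely plays the role of a ridge parameter

w = np.zeros(d)          # current SGD iterate
w_uniform = np.zeros(d)  # Polyak-Ruppert (uniform) average
w_geom = np.zeros(d)     # geometrically weighted average

for t in range(1, n + 1):
    i = rng.integers(n)
    grad = (X[i] @ w - y[i]) * X[i]            # stochastic gradient of 0.5*(x_i^T w - y_i)^2
    w = w - lr * grad                          # plain SGD step

    w_uniform += (w - w_uniform) / t           # running uniform average
    w_geom = (1 - decay) * w_geom + decay * w  # geometric (EWMA) average

print("last iterate   :", np.linalg.norm(w - w_true))
print("uniform average:", np.linalg.norm(w_uniform - w_true))
print("geometric avg  :", np.linalg.norm(w_geom - w_true))
```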

Related research

06/09/2023  Asymptotically efficient one-step stochastic gradient descent
A generic, fast and asymptotically efficient method for parametric estim...

03/08/2016  Stochastic dual averaging methods using variance reduction techniques for regularized empirical risk minimization problems
We consider a composite convex minimization problem associated with regu...

04/16/2018  Constant Step Size Stochastic Gradient Descent for Probabilistic Modeling
Stochastic gradient methods enable learning probabilistic models from la...

08/15/2020  Obtaining Adjustable Regularization for Free via Iterate Averaging
Regularization for optimization is a crucial technique to avoid overfitt...

04/30/2014  Learning with incremental iterative regularization
Within a statistical learning setting, we propose and study an iterative...

12/04/2020  A Variant of Gradient Descent Algorithm Based on Gradient Averaging
In this work, we study an optimizer, Grad-Avg to optimize error function...

11/23/2017  Online and Batch Supervised Background Estimation via L1 Regression
We propose a surprisingly simple model for supervised video background e...
