Barzilai-Borwein Step Size for Stochastic Gradient Descent

05/13/2016
by   Conghui Tan, et al.
0

One of the major issues in stochastic gradient descent (SGD) methods is how to choose an appropriate step size while running the algorithm. Since the traditional line search technique does not apply for stochastic optimization algorithms, the common practice in SGD is either to use a diminishing step size, or to tune a fixed step size by hand, which can be time consuming in practice. In this paper, we propose to use the Barzilai-Borwein (BB) method to automatically compute step sizes for SGD and its variant: stochastic variance reduced gradient (SVRG) method, which leads to two algorithms: SGD-BB and SVRG-BB. We prove that SVRG-BB converges linearly for strongly convex objective functions. As a by-product, we prove the linear convergence result of SVRG with Option I proposed in [10], whose convergence result is missing in the literature. Numerical experiments on standard data sets show that the performance of SGD-BB and SVRG-BB is comparable to and sometimes even better than SGD and SVRG with best-tuned step sizes, and is superior to some advanced SGD variants.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/21/2021

Towards Noise-adaptive, Problem-adaptive Stochastic Gradient Descent

We design step-size schemes that make stochastic gradient descent (SGD) ...
research
05/24/2019

Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates

Recent works have shown that stochastic gradient descent (SGD) achieves ...
research
06/20/2019

Accelerating Mini-batch SARAH by Step Size Rules

StochAstic Recursive grAdient algoritHm (SARAH), originally proposed for...
research
11/08/2015

Speed learning on the fly

The practical performance of online stochastic gradient descent algorith...
research
01/18/2018

When Does Stochastic Gradient Algorithm Work Well?

In this paper, we consider a general stochastic optimization problem whi...
research
11/25/2018

The promises and pitfalls of Stochastic Gradient Langevin Dynamics

Stochastic Gradient Langevin Dynamics (SGLD) has emerged as a key MCMC a...
research
10/11/2022

SGD with large step sizes learns sparse features

We showcase important features of the dynamics of the Stochastic Gradien...

Please sign up or login with your details

Forgot password? Click here to reset