Stochastic Polyak Stepsize with a Moving Target

by Robert M. Gower et al.

We propose a new stochastic gradient method that uses recorded past loss values to reduce variance. Our method can be interpreted as a new stochastic variant of the Polyak Stepsize that converges globally without assuming interpolation. It introduces auxiliary variables, one per data point, that track the corresponding loss values. We provide a global convergence theory for our method by showing that it can be interpreted as a special variant of online SGD. Because the new method stores only a single scalar per data point, it opens up new applications for variance reduction in settings where memory is the bottleneck.
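To make the idea concrete, the following is a minimal sketch, not the paper's exact algorithm: a Polyak-type stepsize in which the unknown optimal value of each per-sample loss is replaced by an auxiliary scalar target s[i] that is updated as training proceeds. The function name sps_moving_target, the least-squares loss, the target update rate beta, the stepsize cap gamma_max, and the clipping of the gap to be non-negative are all illustrative assumptions, not details taken from the paper.

import numpy as np

def sps_moving_target(X, y, n_epochs=10, beta=0.5, gamma_max=1.0, seed=0):
    """Sketch: SGD with a Polyak-type stepsize and one moving target per data point.

    Illustrative setup only: least-squares loss f_i(w) = 0.5 * (x_i @ w - y_i)**2,
    with an auxiliary scalar s[i] per data point tracking its loss value.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    s = np.zeros(n)            # auxiliary targets: a single scalar per data point

    for _ in range(n_epochs):
        for i in rng.permutation(n):
            resid = X[i] @ w - y[i]
            loss_i = 0.5 * resid ** 2
            grad_i = resid * X[i]
            grad_norm2 = grad_i @ grad_i + 1e-12

            # Polyak-type stepsize using the moving target s[i] in place of
            # the (unknown) optimal loss value f_i^*.
            gamma = min(max(loss_i - s[i], 0.0) / grad_norm2, gamma_max)
            w -= gamma * grad_i

            # Move the target toward the current loss of data point i.
            s[i] = (1 - beta) * s[i] + beta * loss_i
    return w, s

# Usage on synthetic data
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 10))
    y = X @ rng.standard_normal(10) + 0.1 * rng.standard_normal(200)
    w, s = sps_moving_target(X, y)
    print("final mean loss:", 0.5 * np.mean((X @ w - y) ** 2))

Since only the scalars s are stored (no per-sample gradients), the memory overhead is one float per data point, which is the property the abstract highlights.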

