Momentum-Based Variance Reduction in Non-Convex SGD

05/24/2019
by Ashok Cutkosky, et al.

Variance reduction has emerged in recent years as a strong competitor to stochastic gradient descent in non-convex problems, providing the first algorithms to improve upon the convergence rate of stochastic gradient descent for finding first-order critical points. However, variance reduction techniques typically require carefully tuned learning rates and a willingness to use excessively large "mega-batches" in order to achieve their improved results. We present a new variance reduction algorithm, STORM, that does not require any batches and makes use of adaptive learning rates, enabling simpler implementation and less tuning of hyperparameters. Our technique for removing the batches uses a variant of momentum to achieve variance reduction in non-convex optimization. On smooth losses F, STORM finds a point x with E[‖∇F(x)‖] ≤ O(1/√T + σ^{1/3}/T^{1/3}) in T iterations with σ^2 variance in the gradients, matching the optimal rate but without requiring knowledge of σ.
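
The abstract does not spell out the update rule, so the following is only a minimal sketch of a momentum-based variance-reduced step in the spirit of STORM. It assumes the estimator takes the recursive form d_t = g(x_t, ξ_t) + (1 - a_t)(d_{t-1} - g(x_{t-1}, ξ_t)), with the same sample ξ_t evaluated at both the new and the previous iterate, and a step size that shrinks with the running sum of squared gradient norms. The function names, the constants `k`, `w`, `c`, the clipping of the momentum weight, and the scalar test objective are all illustrative assumptions rather than the authors' settings.

```python
import random


def storm_sketch(grad, x0, steps, k=0.1, w=0.1, c=100.0, seed=0):
    """Run a STORM-style loop on a scalar parameter (illustrative only).

    grad(x, xi) must return a stochastic gradient at x for the sample xi.
    k, w, c are hypothetical constants chosen for this toy example.
    """
    rng = random.Random(seed)
    x = x0
    xi = rng.gauss(0.0, 1.0)
    g = grad(x, xi)
    d = g                 # the first estimate is a plain stochastic gradient
    sum_sq = g * g        # running sum of squared gradient magnitudes
    for _ in range(steps):
        # Adaptive step size: no knowledge of the noise level is used.
        eta = k / (w + sum_sq) ** (1.0 / 3.0)
        x_prev, x = x, x - eta * d
        # Momentum weight tied to the step size; clipping to 1 is an assumption.
        a = min(1.0, c * eta * eta)
        xi = rng.gauss(0.0, 1.0)
        g = grad(x, xi)             # fresh sample at the new point
        g_prev = grad(x_prev, xi)   # the same sample at the previous point
        # Momentum plus a correction term that cancels shared noise.
        d = g + (1.0 - a) * (d - g_prev)
        sum_sq += g * g
    return x


def exact_grad(x, xi):
    """Gradient of the non-convex toy loss F(x) = x^2 / (1 + x^2), no noise."""
    return 2.0 * x / (1.0 + x * x) ** 2


def noisy_grad(x, xi):
    """The same gradient with additive noise of standard deviation 0.1."""
    return exact_grad(x, xi) + 0.1 * xi


if __name__ == "__main__":
    print(storm_sketch(exact_grad, 1.0, 500))
    print(storm_sketch(noisy_grad, 1.0, 2000))
```

Note that each iteration draws a single sample and evaluates it twice, so no batch is ever formed; that is the sense in which the momentum term stands in for the "mega-batches" of earlier variance reduction methods.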

research
03/17/2016

Variance Reduction for Faster Non-Convex Optimization

We consider the fundamental problem in non-convex optimization of effici...
research
01/25/2019

Surrogate Losses for Online Learning of Stepsizes in Stochastic Non-Convex Optimization

Stochastic Gradient Descent (SGD) has played a central role in machine l...
research
11/01/2021

STORM+: Fully Adaptive SGD with Momentum for Nonconvex Optimization

In this work we investigate stochastic non-convex optimization problems ...
research
05/02/2018

SVRG meets SAGA: k-SVRG --- A Tale of Limited Memory

In recent years, many variance reduced algorithms for empirical risk min...
research
03/04/2021

Correcting Momentum with Second-order Information

We develop a new algorithm for non-convex stochastic optimization that f...
research
05/01/2020

Distributed Stochastic Non-Convex Optimization: Momentum-Based Variance Reduction

In this work, we propose a distributed algorithm for stochastic non-conv...
research
10/12/2022

Momentum Aggregation for Private Non-convex ERM

We introduce new algorithms and convergence guarantees for privacy-prese...