Last-iterate convergence analysis of stochastic momentum methods for neural networks

05/30/2022
by Dongpo Xu, et al.

The stochastic momentum method is a commonly used acceleration technique for solving large-scale stochastic optimization problems in artificial neural networks. Existing convergence results for stochastic momentum methods in the non-convex stochastic setting mostly concern a randomly selected iterate or the best (minimum) iterate. To fill this gap, we address the convergence of the last iterate (called last-iterate convergence) of stochastic momentum methods for non-convex stochastic optimization problems, in a manner consistent with traditional optimization theory. We prove the last-iterate convergence of stochastic momentum methods under a unified framework that covers both stochastic heavy-ball momentum and stochastic Nesterov accelerated gradient momentum. Moreover, the momentum factor can be fixed to a constant, rather than the time-varying coefficients required in existing analyses. Finally, the last-iterate convergence of stochastic momentum methods is verified on the benchmark MNIST and CIFAR-10 datasets.
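
For concreteness, below is a minimal NumPy sketch of the two methods covered by such a unified framework. Typical non-convex guarantees bound the gradient norm at a randomly drawn iterate, or the minimum gradient norm over all iterates; last-iterate convergence instead controls the gradient norm at the final iterate x_T itself. The sketch uses the common SUM-style unified update with a switch s (s = 0 gives stochastic heavy-ball, s = 1 gives stochastic Nesterov accelerated gradient); the test function, noise model, step-size schedule, and all parameter values are illustrative assumptions, not the paper's exact framework or experiments.

```python
import numpy as np

def grad(x, rng, noise=0.1):
    # Stochastic gradient of the non-convex double-well f(x) = sum(x_i^4/4 - x_i^2/2);
    # additive Gaussian noise stands in for minibatch sampling error.
    return x ** 3 - x + noise * rng.standard_normal(x.shape)

def stochastic_momentum(s, alpha=0.05, beta=0.9, steps=20000, seed=0):
    # Unified momentum update (SUM-style), with a constant momentum factor beta:
    #   x_{t+1} = x_t - a_t g_t + beta (x_t - x_{t-1}) - s * beta * (a_t g_t - a_{t-1} g_{t-1})
    # s = 0 recovers stochastic heavy-ball; s = 1 recovers stochastic NAG.
    # A diminishing step size a_t = alpha / sqrt(t + 1) is assumed here.
    rng = np.random.default_rng(seed)
    x_prev = x = np.array([0.5, -0.3])  # start away from the saddle at the origin
    sg_prev = np.zeros_like(x)          # previous scaled gradient a_{t-1} g_{t-1}
    for t in range(steps):
        a_t = alpha / np.sqrt(t + 1)
        sg = a_t * grad(x, rng)
        x_next = x - sg + beta * (x - x_prev) - s * beta * (sg - sg_prev)
        x_prev, x, sg_prev = x, x_next, sg
    return x  # the last iterate, whose convergence the paper studies

for s, name in [(0, "SHB"), (1, "SNAG")]:
    x_last = stochastic_momentum(s)
    g_norm = np.linalg.norm(x_last ** 3 - x_last)  # true gradient at the last iterate
    print(f"{name}: last iterate = {x_last}, ||grad f(x_T)|| = {g_norm:.4f}")
```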


