Momentum-based variance-reduced proximal stochastic gradient method for composite nonconvex stochastic optimization

05/31/2020
by   Yangyang Xu, et al.
0

Stochastic gradient methods (SGMs) have been extensively used for solving stochastic problems or large-scale machine learning problems. Recent works employ various techniques to improve the convergence rate of SGMs for both convex and nonconvex cases. Most of them require a large number of samples in some or all iterations of the improved SGMs. In this paper, we propose a new SGM, named PStorm, for solving nonconvex nonsmooth stochastic problems. With a momentum-based variance reduction technique, PStorm can achieve a near-optimal complexity result Õ(ε^-3) to produce a stochastic ε-stationary solution, if a mean-squared smoothness condition holds. Different from existing near-optimal methods, PStorm requires only one or O(1) samples in every update. With this property, PStorm can be applied to online learning problems that favor real-time decisions based on one or O(1) new observations. In addition, for large-scale machine learning problems, PStorm can generalize better by small-batch training than other near-optimal methods that require large-batch training and the vanilla SGM, as we demonstrate on training a sparse fully-connected neural network.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/07/2019

Momentum Schemes with Stochastic Variance Reduction for Nonconvex Composite Optimization

Two new stochastic variance-reduced algorithms named SARAH and SPIDER ha...
research
05/15/2020

Momentum with Variance Reduction for Nonconvex Composition Optimization

Composition optimization is widely-applied in nonconvex machine learning...
research
03/23/2017

Fast Stochastic Variance Reduced Gradient Method with Momentum Acceleration for Machine Learning

Recently, research on accelerated stochastic gradient descent methods (e...
research
08/20/2020

An Optimal Hybrid Variance-Reduced Algorithm for Stochastic Composite Nonconvex Optimization

In this note we propose a new variant of the hybrid variance-reduced pro...
research
05/09/2016

Nonconvex Sparse Learning via Stochastic Optimization with Progressive Variance Reduction

We propose a stochastic variance reduced optimization algorithm for solv...
research
01/30/2023

Distributed Stochastic Optimization under a General Variance Condition

Distributed stochastic optimization has drawn great attention recently d...
research
08/07/2018

Fast Variance Reduction Method with Stochastic Batch Size

In this paper we study a family of variance reduction methods with rando...

Please sign up or login with your details

Forgot password? Click here to reset