Near-Optimal High-Probability Convergence for Non-Convex Stochastic Optimization with Variance Reduction

02/13/2023
by Zijian Liu, et al.

Traditional analyses of non-convex stochastic optimization problems characterize convergence bounds in expectation, which is inadequate since it provides no useful performance guarantee for a single run. Motivated by its importance, an emerging line of literature has recently studied the high-probability convergence behavior of several algorithms, including the classic stochastic gradient descent (SGD). However, no high-probability results have been established for optimization algorithms with variance reduction, a technique known to accelerate convergence and which has become the de facto algorithmic tool for stochastic optimization at large. To close this important gap, we introduce a new variance-reduced algorithm for non-convex stochastic optimization, which we call Generalized SignSTORM. We show that with probability at least 1-δ, our algorithm converges at the rate of O(log(dT/δ)/T^{1/3}) after T iterations, where d is the problem dimension. This convergence guarantee matches the existing lower bound up to a log factor and, to the best of our knowledge, is the first high-probability minimax (near-)optimal result. Finally, we demonstrate the effectiveness of our algorithm through numerical experiments.
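The abstract does not spell out the Generalized SignSTORM update or its stepsize schedule, so the following is only a minimal Python sketch of the general idea it builds on: a STORM-style variance-reduced gradient estimator combined with a sign-based descent step. The function grad_fn, the constant learning rate lr, the momentum weight a, and the way samples are drawn are all illustrative assumptions, not the algorithm or parameter schedule analyzed in the paper.

```python
import numpy as np

def signstorm_sketch(grad_fn, x0, T, lr=0.01, a=0.1, rng=None):
    """Illustrative STORM-style variance-reduced loop with a sign update.

    grad_fn(x, sample) should return a stochastic gradient of the objective
    at x for the drawn sample. The recursion below is the standard STORM
    momentum estimator; the sign step and all hyperparameters here are
    placeholders, not the schedule from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    sample = rng.integers(0, 2**31 - 1)      # stand-in for a data sample / seed
    d = grad_fn(x, sample)                   # initialize estimator with one stochastic gradient
    for _ in range(T):
        x_prev = x.copy()
        x = x - lr * np.sign(d)              # sign-based descent step
        sample = rng.integers(0, 2**31 - 1)  # fresh sample shared by both gradient calls
        # STORM recursion: new gradient plus a momentum correction evaluated at the old iterate
        d = grad_fn(x, sample) + (1.0 - a) * (d - grad_fn(x_prev, sample))
    return x
```

The key point the sketch illustrates is the correction term (d - grad_fn(x_prev, sample)), which reuses the same sample at consecutive iterates; this is what reduces the variance of the gradient estimator relative to plain SGD momentum.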


research, 02/14/2023
Breaking the Lower Bound with (Little) Structure: Acceleration in Non-Convex Stochastic Optimization with Heavy-Tailed Noise
We consider the stochastic optimization problem with smooth but not nece...

research, 07/20/2023
Convergence of Adam for Non-convex Objectives: Relaxed Hyperparameters and Non-ergodic Case
Adam is a commonly used stochastic optimization algorithm in machine lea...

research, 10/11/2022
Divergence Results and Convergence of a Variance Reduced Version of ADAM
Stochastic optimization algorithms using exponential moving averages of ...

research, 05/24/2011
Ergodic Mirror Descent
We generalize stochastic subgradient descent methods to situations in wh...

research, 04/12/2021
An Efficient Algorithm for Deep Stochastic Contextual Bandits
In stochastic contextual bandit (SCB) problems, an agent selects an acti...

research, 02/10/2020
Stochastic Online Optimization using Kalman Recursion
We study the Extended Kalman Filter in constant dynamics, offering a bay...

research, 02/17/2023
SGD with AdaGrad Stepsizes: Full Adaptivity with High Probability to Unknown Parameters, Unbounded Gradients and Affine Variance
We study Stochastic Gradient Descent with AdaGrad stepsizes: a popular a...
