Convergence and Complexity of Stochastic Subgradient Methods with Dependent Data for Nonconvex Optimization

03/29/2022
by Ahmet Alacaoglu, et al.

We show that, under a general dependent data sampling scheme, the classical stochastic projected and proximal subgradient methods for weakly convex functions have a worst-case convergence rate of Õ(n^-1/4) and complexity Õ(ε^-4) for achieving an ε-near stationary point, measured by the norm of the gradient of the Moreau envelope. While the classical convergence guarantee requires i.i.d. data sampled from the target distribution, we only require a mild mixing condition on the conditional distribution, which holds for a wide class of Markov chain sampling algorithms. This improves the existing complexity for the specific case of constrained smooth nonconvex optimization with dependent data from Õ(ε^-8) to Õ(ε^-4), with a significantly simpler analysis. We illustrate the generality of our approach by deriving convergence results with dependent data for the adaptive stochastic subgradient algorithm AdaGrad and for the stochastic subgradient algorithm with heavy ball momentum. As an application, we obtain the first online nonnegative matrix factorization algorithms for dependent data, based on stochastic projected gradient methods with adaptive step sizes and an optimal rate of convergence guarantee.
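To make the setting concrete, below is a minimal sketch (not the authors' implementation) of a stochastic projected subgradient method driven by dependent data. The objective, the AR(1) Markov chain used to generate correlated samples, and all parameter choices (radius, mixing parameter rho, base step size gamma0) are illustrative assumptions; the paper's analysis covers far more general weakly convex objectives and Markov chain samplers.

```python
import numpy as np

# Sketch: projected subgradient steps on a toy weakly convex objective
# f(x) = E_a | <a, x>^2 - y(a) |  (a robust phase-retrieval-style loss),
# where the samples a_t come from an AR(1) Markov chain instead of being i.i.d.

def project_ball(x, radius=1.0):
    """Euclidean projection onto the ball {x : ||x|| <= radius}."""
    nrm = np.linalg.norm(x)
    return x if nrm <= radius else x * (radius / nrm)

def subgradient(x, a, y):
    """A subgradient of x -> |<a, x>^2 - y| at the point x."""
    r = np.dot(a, x) ** 2 - y
    return np.sign(r) * 2.0 * np.dot(a, x) * a

rng = np.random.default_rng(0)
d, T = 20, 5000
x_star = rng.normal(size=d)
x_star /= np.linalg.norm(x_star)

x = project_ball(rng.normal(size=d))
a = rng.normal(size=d)     # initial state of the Markov chain
rho = 0.5                  # mixing parameter of the AR(1) chain (illustrative)
gamma0 = 0.5               # base step size; theory suggests a step ~ T^{-1/2}

for t in range(1, T + 1):
    # Dependent data: the next sample mixes the previous state with fresh noise,
    # so consecutive samples are correlated but the chain mixes geometrically.
    a = rho * a + np.sqrt(1.0 - rho ** 2) * rng.normal(size=d)
    y = np.dot(a, x_star) ** 2            # noiseless measurement (illustrative)

    g = subgradient(x, a, y)
    x = project_ball(x - (gamma0 / np.sqrt(T)) * g)   # projected subgradient step
```

Under the mixing condition described in the abstract, iterates of this kind reach an ε-near stationary point of the Moreau envelope at the stated Õ(ε^-4) complexity, even though the samples are not independent.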


