Asynchronous Stochastic Proximal Methods for Nonconvex Nonsmooth Optimization

02/24/2018
by Rui Zhu, et al.

We study stochastic algorithms for solving non-convex optimization problems with a convex yet possibly non-smooth regularizer, a setting that arises in many practical machine learning applications. However, compared to asynchronous parallel stochastic gradient descent (AsynSGD), an algorithm targeting smooth optimization, the behavior of stochastic algorithms for non-smooth regularized optimization problems is far less understood, especially when the objective function is non-convex. To fill this gap, we propose and analyze asynchronous parallel stochastic proximal gradient (AsynSPG) methods, including a full-update version and a block-wise version, for non-convex problems. We establish an ergodic convergence rate of O(1/√K) for the proposed AsynSPG, where K is the number of updates made to the model, matching the convergence rate currently known for AsynSGD on smooth problems. To our knowledge, this is the first work that provides convergence rates for asynchronous parallel SPG algorithms on non-convex problems. Furthermore, our results are the first to prove convergence of any stochastic proximal method without assuming an increasing batch size or additional variance reduction techniques. We implement the proposed algorithms on a Parameter Server and demonstrate their convergence behavior and near-linear speedup, as the number of workers increases, on sparse learning problems with a real-world dataset.
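To make the update concrete, below is a minimal serial simulation of a stochastic proximal gradient step with delayed (stale) gradients, in the spirit of the method described in the abstract. It is a sketch only: the l1 regularizer, the least-squares loss (convex, chosen for brevity even though the paper targets non-convex losses), and the names A, b, lam, lr, and max_delay are all illustrative assumptions, not the paper's AsynSPG implementation or its Parameter Server code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy l1-regularized least-squares instance; A, b, lam, lr, and
# max_delay are illustrative choices, not values from the paper.
n, d = 200, 50
A = rng.standard_normal((n, d))
x_true = np.zeros(d)
x_true[:5] = 1.0
b = A @ x_true + 0.1 * rng.standard_normal(n)
lam, lr, max_delay = 0.1, 0.01, 4

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

x = np.zeros(d)
history = [x.copy()]
for k in range(2000):
    # Simulate asynchrony: the worker's stochastic gradient is
    # evaluated at a stale iterate x_{k - tau}, tau <= max_delay.
    tau = int(rng.integers(0, min(max_delay, k) + 1))
    x_stale = history[k - tau]
    i = int(rng.integers(0, n))               # sample one data point
    g = A[i] * (A[i] @ x_stale - b[i])        # stochastic gradient of the smooth loss
    x = soft_threshold(x - lr * g, lr * lam)  # proximal (prox-SGD) step
    history.append(x.copy())
```

The key step is the last line of the loop: a stochastic gradient, possibly computed at a stale iterate by an asynchronous worker, followed by the proximal map of the regularizer. With a step size on the order of 1/√K, updates of this form are the ones to which the O(1/√K) ergodic rate above refers.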


Related research

10/19/2018 · A Model Parallel Proximal Stochastic Gradient Algorithm for Partially Asynchronous Systems
Large models are prevalent in modern machine learning scenarios, includi...

07/15/2020 · A General Family of Stochastic Proximal Gradient Methods for Deep Learning
We study the training of regularized neural networks where the regulariz...

02/24/2018 · A Block-wise, Asynchronous and Distributed ADMM Algorithm for General Form Consensus Optimization
Many machine learning models, including those with non-smooth regularize...

07/24/2021 · Distributed stochastic inertial methods with delayed derivatives
Stochastic gradient methods (SGMs) are predominant approaches for solvin...

02/21/2020 · Asynchronous parallel adaptive stochastic gradient methods
Stochastic gradient methods (SGMs) are the predominant approaches to tra...

07/21/2019 · Distributed Inexact Successive Convex Approximation ADMM: Analysis-Part I
In this two-part work, we propose an algorithmic framework for solving n...

11/02/2020 · Asynchronous Parallel Stochastic Quasi-Newton Methods
Although first-order stochastic algorithms, such as stochastic gradient ...
