An improved convergence analysis for decentralized online stochastic non-convex optimization

by Ran Xin, et al.

In this paper, we study decentralized online stochastic non-convex optimization over a network of nodes. Integrating a technique called gradient tracking into decentralized stochastic gradient descent (DSGD), we show that the resulting algorithm, GT-DSGD, exhibits several desirable characteristics for minimizing a sum of smooth non-convex functions. The main results of this paper fall into two categories: (1) For general smooth non-convex functions, we establish a non-asymptotic characterization of GT-DSGD and derive the conditions under which it achieves network-independent performance and matches centralized minibatch SGD. In comparison, the existing results suggest that the performance of GT-DSGD is always network-dependent and therefore strictly worse than that of centralized minibatch SGD. (2) When the global function additionally satisfies the Polyak-Łojasiewicz condition, we derive the range of constant step-sizes under which GT-DSGD is exponentially stable up to a steady-state error. Under stochastic approximation step-sizes, we establish, for the first time, the optimal global sublinear convergence rate on almost every sample path, in addition to the convergence rate in mean. Since strongly convex functions are a special case of this class of problems, our results are not only immediately applicable but also improve upon the best currently known convergence rates and their dependence on problem parameters.
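The gradient-tracking mechanism described above can be sketched in a few lines: each node maintains its iterate together with an auxiliary tracker of the network-average stochastic gradient, and both are mixed over the network at every step. The following is a minimal illustrative sketch on a toy quadratic problem, not the authors' implementation; the function names, the ring network, and the noise model are all assumptions made for the example.

```python
import numpy as np

def gt_dsgd(grad, W, x0, alpha=0.05, iters=500):
    """Sketch of DSGD with gradient tracking (GT-DSGD).

    grad(i, x): stochastic gradient of node i's local function at x.
    W:  doubly stochastic mixing matrix, shape (n, n).
    x0: initial iterates, shape (n, d).
    """
    x = x0.copy()
    n = len(x)
    # Initialize the tracker with the first local stochastic gradients.
    g = np.array([grad(i, x[i]) for i in range(n)])
    y = g.copy()
    for _ in range(iters):
        x_new = W @ x - alpha * y                       # mix with neighbors, step along tracker
        g_new = np.array([grad(i, x_new[i]) for i in range(n)])
        y = W @ y + g_new - g                           # track the average gradient
        x, g = x_new, g_new
    return x

# Toy problem (assumed for illustration): f_i(x) = 0.5 * ||x - b_i||^2,
# so the minimizer of the sum is the mean of the b_i.
rng = np.random.default_rng(0)
n, d = 4, 3
b = rng.normal(size=(n, d))

def grad(i, x):
    # Stochastic gradient: exact gradient plus small Gaussian noise.
    return (x - b[i]) + 0.01 * rng.normal(size=x.shape)

# Ring network with doubly stochastic mixing weights.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i + 1) % n] = 0.25
    W[i, (i - 1) % n] = 0.25

x_final = gt_dsgd(grad, W, x0=np.zeros((n, d)))
```

With a constant step-size, the iterates reach consensus near the global minimizer up to a small steady-state error, consistent with the exponential-stability result stated in the abstract for the Polyak-Łojasiewicz (here, strongly convex) case.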


