A Unified and Refined Convergence Analysis for Non-Convex Decentralized Learning

10/19/2021
by Sulaiman A. Alghunaim, et al.

We study the consensus decentralized optimization problem in which the objective function is the average of n agents' private non-convex cost functions, and the agents can only communicate with their neighbors over a given network topology. We consider the stochastic online setting, where each agent can only access a noisy estimate of its gradient. Many decentralized methods can solve such problems, including EXTRA, Exact-Diffusion/D^2, and gradient-tracking. Unlike the well-known DSGD algorithm, these methods have been shown to be robust to the heterogeneity of the local cost functions. However, the established convergence rates for these methods indicate that their sensitivity to the network topology is worse than that of DSGD. Such theoretical results imply that these methods can perform much worse than DSGD over sparse networks, which contradicts empirical experiments, where DSGD is observed to be more sensitive to the network topology. In this work, we study a general stochastic unified decentralized algorithm (SUDA) that includes the above methods as special cases. We establish the convergence of SUDA both in the general non-convex setting and under the Polyak-Łojasiewicz condition. Our results provide improved network-topology-dependent bounds for these methods (such as Exact-Diffusion/D^2 and gradient-tracking) compared with the existing literature. Moreover, our results show that these methods are less sensitive to the network topology than DSGD, which agrees with numerical experiments.
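The abstract concerns the problem min_x (1/n) sum_i f_i(x), where agent i can only query a noisy estimate of its own gradient and can only mix iterates with its neighbors. It does not spell out SUDA's update equations, so the following minimal NumPy sketch only illustrates two of the methods the unified analysis covers: the vanilla DSGD update and a stochastic gradient-tracking update. The quadratic local costs, the noise model, the ring mixing matrix W, and all names (stoch_grad, lr, T, and so on) are illustrative assumptions, not taken from the paper.

import numpy as np

n, d, T, lr = 8, 5, 200, 0.05
rng = np.random.default_rng(0)

# Illustrative heterogeneous local costs: f_i(x) = 0.5 * ||A_i x - b_i||^2.
A = rng.normal(size=(n, d, d))
b = rng.normal(size=(n, d))

def stoch_grad(i, x):
    # Noisy estimate of agent i's gradient (additive Gaussian noise).
    return A[i].T @ (A[i] @ x - b[i]) + 0.1 * rng.normal(size=d)

# Doubly stochastic mixing matrix W for a ring topology: each agent
# averages only with its two neighbors.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.25

# DSGD: x_i <- sum_j W_ij x_j - lr * g_i(x_i). Row i of x is agent i's iterate.
x = np.zeros((n, d))
for _ in range(T):
    g = np.stack([stoch_grad(i, x[i]) for i in range(n)])
    x = W @ x - lr * g

# Gradient tracking: each agent also mixes a tracker y_i that estimates
# the network-average gradient, removing the heterogeneity bias of DSGD.
x_gt = np.zeros((n, d))
g_prev = np.stack([stoch_grad(i, x_gt[i]) for i in range(n)])
y = g_prev.copy()
for _ in range(T):
    x_gt = W @ x_gt - lr * y          # x^{k+1} = W x^k - lr * y^k
    g_new = np.stack([stoch_grad(i, x_gt[i]) for i in range(n)])
    y = W @ y + g_new - g_prev        # y^{k+1} = W y^k + g^{k+1} - g^k
    g_prev = g_new

Sparsifying W (a ring versus a complete graph, say) is exactly what the topology-dependence results quantify: the tracker y in the second loop is the mechanism that makes gradient-tracking-type methods robust to heterogeneous local costs, and the paper's refined bounds show this robustness does not come with the poor topology dependence suggested by earlier analyses.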


