Tackling Data Heterogeneity: A New Unified Framework for Decentralized SGD with Sample-induced Topology

by Yan Huang, et al.

We develop a general framework that unifies several gradient-based stochastic optimization methods for empirical risk minimization in both centralized and distributed settings. The framework hinges on an augmented graph whose nodes model the samples and whose edges model both inter-device communication and intra-device stochastic gradient computation. By properly designing the topology of the augmented graph, we recover the well-known Local-SGD and DSGD algorithms as special cases and provide a unified perspective on variance-reduction (VR) and gradient-tracking (GT) methods such as SAGA, Local-SVRG, and GT-SAGA. We also provide a unified convergence analysis for smooth and (strongly) convex objectives, based on a properly structured Lyapunov function; the resulting rate recovers the best known results for many existing algorithms. The rates further reveal that VR and GT methods effectively eliminate data heterogeneity within and across devices, respectively, enabling exact convergence to the optimal solution. Numerical experiments confirm these findings.
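To make the claim about heterogeneity across devices concrete, here is a minimal toy sketch. It is not the paper's augmented-graph framework and not its experiments. It rests on several assumptions made purely for illustration: four devices on a ring with a hand-picked doubly stochastic mixing matrix `W`, scalar local objectives f_i(x) = 0.5 (x - b_i)^2 with different minimizers `b_i`, full local gradients instead of stochastic ones (so the only error source is the heterogeneity across devices), and a constant step size `alpha`. Under these assumptions, plain decentralized gradient descent settles at a point where the devices disagree with the global minimizer, while a standard gradient-tracking update reaches it.

```python
import numpy as np

# Hypothetical toy setup: device i holds f_i(x) = 0.5 * (x - b[i])**2.
# The global minimizer of the average objective is mean(b).

def local_grads(x, b):
    """Stack of local gradients: entry i is f_i'(x[i]) = x[i] - b[i]."""
    return x - b

def run_dsgd(W, b, alpha, iters):
    """Decentralized gradient descent with full local gradients (no sampling)."""
    x = np.zeros_like(b)
    for _ in range(iters):
        # Mix with neighbors, then take a local gradient step.
        x = W @ x - alpha * local_grads(x, b)
    return x

def run_gradient_tracking(W, b, alpha, iters):
    """Gradient tracking: y estimates the network-average gradient."""
    x = np.zeros_like(b)
    g = local_grads(x, b)
    y = g.copy()  # tracker initialized with the local gradients
    for _ in range(iters):
        x_new = W @ x - alpha * y
        g_new = local_grads(x_new, b)
        # Mix the tracker and correct it with the change in local gradients.
        y = W @ y + g_new - g
        x, g = x_new, g_new
    return x

# Ring of 4 devices: weight 1/2 on self, 1/4 on each neighbor (an assumption).
W = np.array([
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
])
b = np.array([1.0, 2.0, 3.0, 4.0])  # heterogeneous local minimizers
x_star = b.mean()

dsgd_err = np.max(np.abs(run_dsgd(W, b, alpha=0.1, iters=500) - x_star))
gt_err = np.max(np.abs(run_gradient_tracking(W, b, alpha=0.1, iters=500) - x_star))

print(f"max device error, DSGD-style update:  {dsgd_err:.3e}")
print(f"max device error, gradient tracking:  {gt_err:.3e}")
```

In this toy problem the residual error of the first method grows with the spread of the `b_i` and with `alpha`, whereas the tracking variable `y` removes it. The stochastic, within-device part of the story (variance reduction) is deliberately left out of the sketch.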




Larger is Better: The Effect of Learning Rates Enjoyed by Stochastic Optimization with Progressive Variance Reduction

In this paper, we propose a simple variant of the original stochastic va...

A Unified Convergence Theorem for Stochastic Optimization Methods

In this work, we provide a fundamental unified convergence theorem used ...

A Unified Theory of Decentralized SGD with Changing Topology and Local Updates

Decentralized stochastic optimization methods have gained a lot of atten...

Variance Reduced EXTRA and DIGing and Their Optimal Acceleration for Strongly Convex Decentralized Optimization

We study stochastic decentralized optimization for the problem of traini...

Local SGD: Unified Theory and New Efficient Methods

We present a unified framework for analyzing local SGD methods in the co...

Removing Data Heterogeneity Influence Enhances Network Topology Dependence of Decentralized SGD

We consider decentralized stochastic optimization problems where a netwo...

One Method to Rule Them All: Variance Reduction for Data, Parameters and Many New Methods

We propose a remarkably general variance-reduced method suitable for sol...