Log In Sign Up

D-SPIDER-SFO: A Decentralized Optimization Algorithm with Faster Convergence Rate for Nonconvex Problems

by   Taoxing Pan, et al.

Decentralized optimization algorithms have attracted intensive interests recently, as it has a balanced communication pattern, especially when solving large-scale machine learning problems. Stochastic Path Integrated Differential Estimator Stochastic First-Order method (SPIDER-SFO) nearly achieves the algorithmic lower bound in certain regimes for nonconvex problems. However, whether we can find a decentralized algorithm which achieves a similar convergence rate to SPIDER-SFO is still unclear. To tackle this problem, we propose a decentralized variant of SPIDER-SFO, called decentralized SPIDER-SFO (D-SPIDER-SFO). We show that D-SPIDER-SFO achieves a similar gradient computation cost—that is, O(ϵ^-3) for finding an ϵ-approximate first-order stationary point—to its centralized counterpart. To the best of our knowledge, D-SPIDER-SFO achieves the state-of-the-art performance for solving nonconvex optimization problems on decentralized networks in terms of the computational cost. Experiments on different network configurations demonstrate the efficiency of the proposed method.


page 1

page 2

page 3

page 4


Convergence Analysis of Nonconvex Distributed Stochastic Zeroth-order Coordinate Method

This paper investigates the stochastic distributed nonconvex optimizatio...

Towards More Efficient Stochastic Decentralized Learning: Faster Convergence and Sparse Communication

Recently, the decentralized optimization problem is attracting growing a...

Decentralized Stochastic First-Order Methods for Large-scale Machine Learning

Decentralized consensus-based optimization is a general computational fr...

GT-STORM: Taming Sample, Communication, and Memory Complexities in Decentralized Non-Convex Learning

Decentralized nonconvex optimization has received increasing attention i...

Fast and Robust Sparsity Learning over Networks: A Decentralized Surrogate Median Regression Approach

Decentralized sparsity learning has attracted a significant amount of at...

BEER: Fast O(1/T) Rate for Decentralized Nonconvex Optimization with Communication Compression

Communication efficiency has been widely recognized as the bottleneck fo...

Second-Order Guarantees in Centralized, Federated and Decentralized Nonconvex Optimization

Rapid advances in data collection and processing capabilities have allow...