Linear Convergent Decentralized Optimization with Compression
Communication compression has been extensively adopted to speed up large-scale distributed optimization. However, most existing decentralized algorithms with compression are unsatisfactory in terms of convergence rate and stability. In this paper, we delineate two key obstacles in algorithm design: data heterogeneity and compression error. Our attempt to explicitly overcome these obstacles leads to a novel decentralized algorithm named LEAD, the first LinEAr convergent Decentralized algorithm with communication compression. Our theory describes the coupled dynamics of the inaccurate model propagation and the optimization process. We also provide the first consensus error bound that does not assume bounded gradients. Empirical experiments validate our theoretical analysis and show that the proposed algorithm achieves state-of-the-art computation and communication efficiency.
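To make the setting concrete, below is a minimal, illustrative sketch of decentralized optimization with compressed gossip communication and implicit error compensation, in the spirit of compressed-gossip methods such as CHOCO-SGD. It is not the paper's LEAD algorithm; the quadratic objectives, four-node ring topology, mixing matrix, top-k compressor, and all hyperparameters are assumptions chosen only to show the mechanism: each node transmits a compressed difference to a public copy of its model, and whatever the compressor drops remains in that difference and is re-sent in later rounds.

```python
import numpy as np

np.random.seed(0)
n, dim, steps, lr, gamma = 4, 10, 3000, 0.02, 0.2

# Heterogeneous local objectives f_i(x) = 0.5 * ||x - b_i||^2;
# the global minimizer is the average of the b_i.
b = np.random.randn(n, dim)
opt = b.mean(axis=0)

def grad(i, x):
    return x - b[i]

# Symmetric, doubly stochastic mixing matrix for a 4-node ring (assumed topology).
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

def compress(v, k=3):
    """Top-k sparsification: keep only the k largest-magnitude coordinates."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

x = np.zeros((n, dim))      # each node's private model
x_hat = np.zeros((n, dim))  # public copies, updated only through compressed messages

for t in range(steps):
    # Local gradient step on each node's own objective.
    for i in range(n):
        x[i] -= lr * grad(i, x[i])

    # Each node compresses the gap between its model and its public copy.
    # Whatever the compressor drops stays in (x - x_hat) and is re-sent in
    # later rounds, which is the error-compensation effect.
    q = np.array([compress(x[i] - x_hat[i]) for i in range(n)])
    x_hat += q  # all nodes apply the same cheap update to the public copies

    # Gossip step: move each model toward its neighbors' public copies.
    for i in range(n):
        x[i] += gamma * sum(W[i, j] * (x_hat[j] - x_hat[i]) for j in range(n))

# With a constant step size the nodes settle near the optimum with a small
# residual consensus gap; decaying the step size would shrink it further.
print("consensus gap:      ", np.linalg.norm(x - x.mean(axis=0)))
print("distance to optimum:", np.linalg.norm(x.mean(axis=0) - opt))
```

The sketch only illustrates the communication pattern (compressed differences plus gossip on public copies); properties claimed in the abstract, such as linear convergence under data heterogeneity, are specific to the paper's algorithm and analysis.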