Communication-efficient Variance-reduced Stochastic Gradient Descent

by Hossein S. Ghadikolaei, et al.

We consider the problem of communication-efficient distributed optimization, in which multiple nodes exchange algorithm information at every iteration to solve large problems. In particular, we focus on the stochastic variance-reduced gradient (SVRG) method and propose a novel approach to make it communication-efficient: we compress the communicated information to a few bits while preserving the linear convergence rate of the original uncompressed algorithm. Comprehensive theoretical and numerical analyses on real datasets reveal that our algorithm can reduce communication complexity by as much as 95% with almost no noticeable penalty. Moreover, it is much more robust to quantization, in terms of maintaining the true minimizer and the convergence rate, than state-of-the-art algorithms for distributed optimization. Our results have important implications for machine learning over Internet-of-Things and mobile networks.
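To make the idea concrete, here is a minimal single-machine sketch of SVRG where the variance-reduced gradient is passed through a generic low-bit uniform quantizer before being applied, simulating the few-bits-per-entry communication the abstract describes. The quantizer and all names here are illustrative assumptions, not the paper's exact compression scheme; because the variance-reduced gradient (and hence the quantization step size) shrinks as the iterates approach the minimizer, linear convergence to the true solution can survive the compression.

```python
import numpy as np

def quantize(v, bits=6):
    """Uniform scalar quantizer (illustrative, not the paper's scheme):
    map each entry of v onto 2**bits evenly spaced levels in [-s, s],
    where s = max |v_j| is sent alongside at negligible extra cost."""
    s = np.max(np.abs(v))
    if s == 0.0:
        return v
    levels = 2 ** bits - 1
    return np.round((v + s) / (2 * s) * levels) / levels * (2 * s) - s

def svrg_quantized(A, b, lr=0.05, epochs=30, bits=6, seed=0):
    """SVRG for least squares (1/2n)||Aw - b||^2, quantizing the
    variance-reduced gradient before it is 'communicated'."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    for _ in range(epochs):
        w_snap = w.copy()
        full_grad = A.T @ (A @ w_snap - b) / n          # snapshot gradient
        for _ in range(n):
            i = rng.integers(n)
            # variance-reduced stochastic gradient at sample i
            g = A[i] * (A[i] @ w - b[i]) - A[i] * (A[i] @ w_snap - b[i]) + full_grad
            w -= lr * quantize(g, bits)                 # only a few bits move
    return w

# Tiny synthetic check: recover w_true on a well-conditioned problem.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5))
w_true = rng.standard_normal(5)
b = A @ w_true
w_hat = svrg_quantized(A, b)
```

Note the design choice: the quantization grid is rescaled by the current gradient magnitude, so the absolute compression error decays along with the optimality gap instead of flooring the accuracy, which is the behavior the abstract contrasts with quantized plain SGD.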





Trading-off variance and complexity in stochastic gradient descent

Stochastic gradient descent is the method of choice for large-scale mach...

On Maintaining Linear Convergence of Distributed Learning and Optimization under Limited Communication

In parallel and distributed machine learning multiple nodes or processor...

AntNet: Distributed Stigmergetic Control for Communications Networks

This paper introduces AntNet, a novel approach to the adaptive learning ...

Communication-Efficient Distributed Optimization with Quantized Preconditioners

We investigate fast and communication-efficient algorithms for the class...

When Edge Meets Learning: Adaptive Control for Resource-Constrained Distributed Machine Learning

Emerging technologies and applications including Internet of Things (IoT...

ErrorCompensatedX: error compensation for variance reduced algorithms

Communication cost is one major bottleneck for the scalability for distr...

Adaptive Sampling Distributed Stochastic Variance Reduced Gradient for Heterogeneous Distributed Datasets

We study distributed optimization algorithms for minimizing the average ...