Communication-efficient Variance-reduced Stochastic Gradient Descent

03/10/2020
by   Hossein S. Ghadikolaei, et al.

We consider the problem of communication-efficient distributed optimization, in which multiple nodes exchange important algorithm information in every iteration to solve large problems. In particular, we focus on the stochastic variance-reduced gradient (SVRG) method and propose a novel approach to make it communication-efficient: we compress the communicated information to a few bits while preserving the linear convergence rate of the original uncompressed algorithm. Comprehensive theoretical and numerical analyses on real datasets reveal that our algorithm can reduce the communication complexity significantly, by as much as 95%, with almost no noticeable penalty. Moreover, it is much more robust to quantization (in terms of maintaining the true minimizer and the convergence rate) than state-of-the-art algorithms for distributed optimization. Our results have important implications for machine learning over Internet-of-Things and mobile networks.
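
The abstract's core idea (quantize what the nodes exchange in SVRG to a few bits while keeping the linear convergence rate) can be illustrated with a short sketch. The code below is not the authors' algorithm; it is a minimal, hypothetical single-machine mock-up, assuming a generic unbiased stochastic quantizer applied to the quantities that would normally be communicated (the epoch-level full gradient and the per-iteration variance-reduced directions). All names, such as quantize, svrg_quantized, and num_levels, are illustrative assumptions.

```python
# Minimal sketch of SVRG with few-bit quantized communication.
# NOT the paper's implementation; an illustrative assumption throughout.
import numpy as np

def quantize(v, num_levels=4):
    """Unbiased stochastic quantizer: maps each entry of v onto one of
    `num_levels` uniformly spaced levels in [-max|v|, max|v|]."""
    scale = np.max(np.abs(v))
    if scale == 0.0:
        return v
    # Normalize to [0, num_levels - 1] and round stochastically (unbiased).
    u = (v / scale + 1.0) / 2.0 * (num_levels - 1)
    lower = np.floor(u)
    q = lower + (np.random.rand(*v.shape) < (u - lower))
    return (q / (num_levels - 1) * 2.0 - 1.0) * scale

def svrg_quantized(grad_fn, x0, n, step=0.05, epochs=20, inner=None, num_levels=4):
    """SVRG loop in which the full gradient (computed once per epoch) and the
    inner-loop correction terms are quantized, mimicking the few-bit
    communication the abstract refers to."""
    x_ref = x0.copy()
    inner = inner or n
    for _ in range(epochs):
        # Full gradient at the reference point; in a distributed setting this
        # is what the nodes would transmit (here: quantized before use).
        full_grad = quantize(
            np.mean([grad_fn(x_ref, i) for i in range(n)], axis=0), num_levels)
        x = x_ref.copy()
        for _ in range(inner):
            i = np.random.randint(n)
            # Variance-reduced direction, again quantized before "transmission".
            g = quantize(grad_fn(x, i) - grad_fn(x_ref, i), num_levels) + full_grad
            x -= step * g
        x_ref = x
    return x_ref

# Toy usage: least squares on random data with per-sample gradients.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(100, 5)), rng.normal(size=100)
grad_fn = lambda x, i: (A[i] @ x - b[i]) * A[i]
x_hat = svrg_quantized(grad_fn, np.zeros(5), n=100)
```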

Related research

03/22/2016
Trading-off variance and complexity in stochastic gradient descent
Stochastic gradient descent is the method of choice for large-scale mach...

02/26/2019
On Maintaining Linear Convergence of Distributed Learning and Optimization under Limited Communication
In parallel and distributed machine learning multiple nodes or processor...

05/27/2011
AntNet: Distributed Stigmergetic Control for Communications Networks
This paper introduces AntNet, a novel approach to the adaptive learning ...

02/14/2021
Communication-Efficient Distributed Optimization with Quantized Preconditioners
We investigate fast and communication-efficient algorithms for the class...

04/14/2018
When Edge Meets Learning: Adaptive Control for Resource-Constrained Distributed Machine Learning
Emerging technologies and applications including Internet of Things (IoT...

08/04/2021
ErrorCompensatedX: error compensation for variance reduced algorithms
Communication cost is one major bottleneck for the scalability for distr...

02/20/2020
Adaptive Sampling Distributed Stochastic Variance Reduced Gradient for Heterogeneous Distributed Datasets
We study distributed optimization algorithms for minimizing the average ...