Distributed Learning and Democratic Embeddings: Polynomial-Time Source Coding Schemes Can Achieve Minimax Lower Bounds for Distributed Gradient Descent under Communication Constraints

03/13/2021
by   Rajarshi Saha, et al.

In this work, we consider the distributed optimization setting in which information exchange between the computation nodes and the parameter server is subject to a maximum bit budget. We first consider the problem of compressing a vector in n-dimensional Euclidean space subject to a bit budget of R bits per dimension, for which we introduce Democratic and Near-Democratic source-coding schemes. We show that these coding schemes are (near) optimal in the sense that the covering efficiency of the resulting quantizer is either dimension-independent or has only a weak logarithmic dependence on the dimension. Subsequently, we propose a distributed optimization algorithm, DGD-DEF, which employs our proposed coding strategy and achieves the minimax optimal convergence rate to within (near) constant factors for a class of communication-constrained distributed optimization algorithms. Furthermore, we extend the utility of our proposed source-coding scheme by showing that it can markedly improve performance when used in conjunction with other compression schemes. We validate our theoretical claims through numerical simulations.

Keywords: Fast democratic (Kashin) embeddings, Distributed optimization, Data-rate constraint, Quantized gradient descent, Error feedback.
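The coding pipeline described in the abstract (embed the vector so that its coefficients have roughly uniform magnitude, then quantize each coefficient uniformly with R bits) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: a random orthogonal matrix stands in for the fast (near-)democratic Kashin-type embedding used in the paper, the dynamic-range scale is assumed to be sent as side information, and the function names (random_orthogonal, democratic_quantize, democratic_dequantize) are hypothetical.

```python
# Hedged sketch of a democratic-style embedding followed by R-bit uniform
# scalar quantization. A random orthogonal matrix is used as a stand-in for
# the fast (near-)democratic embedding of the paper; bit accounting (R bits
# per embedded coefficient, plus the scale as side information) is simplified.
import numpy as np

def random_orthogonal(m, n, seed=0):
    """Random m x n matrix with orthonormal columns (m >= n), standing in
    for a fast (near-)democratic embedding that spreads coefficient energy."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((m, m)))
    return q[:, :n]                        # S^T S = I_n

def democratic_quantize(x, S, R):
    """Embed x into a mildly redundant space, then quantize each embedded
    coefficient uniformly with R bits."""
    y = S @ x                              # embedded coefficients
    scale = np.max(np.abs(y)) + 1e-12      # dynamic range (side information)
    levels = 2 ** R
    # map coefficients in [-scale, scale] to integer levels {0, ..., levels-1}
    q = np.round((y / scale + 1.0) / 2.0 * (levels - 1)).astype(int)
    return q, scale

def democratic_dequantize(q, scale, S, R):
    """Invert the uniform quantizer and project back to the original space."""
    levels = 2 ** R
    y_hat = (q / (levels - 1) * 2.0 - 1.0) * scale
    return S.T @ y_hat                     # least-squares reconstruction

if __name__ == "__main__":
    n, m, R = 128, 192, 4                  # ~1.5x redundant embedding, 4 bits per coefficient
    x = np.random.default_rng(1).standard_normal(n)
    S = random_orthogonal(m, n)
    q, scale = democratic_quantize(x, S, R)
    x_hat = democratic_dequantize(q, scale, S, R)
    print("relative error:", np.linalg.norm(x - x_hat) / np.linalg.norm(x))
```

In a DGD-DEF-style loop, each worker would compress its local gradient in this manner before transmission, with an error-feedback term carrying the quantization residual over to the next iteration; that outer loop is omitted here.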


Related research

09/14/2021 · On Distributed Learning with Constant Communication Bits
In this paper, we study a distributed learning problem constrained by co...

06/17/2020 · Approximate Gradient Coding with Optimal Decoding
In distributed optimization problems, a technique called gradient coding...

02/14/2021 · Communication-Efficient Distributed Optimization with Quantized Preconditioners
We investigate fast and communication-efficient algorithms for the class...

02/18/2021 · Don't Fix What ain't Broke: Near-optimal Local Convergence of Alternating Gradient Descent-Ascent for Minimax Optimization
Minimax optimization has recently gained a lot of attention as adversari...

02/06/2020 · Achieving the fundamental convergence-communication tradeoff with Differentially Quantized Gradient Descent
The problem of reducing the communication cost in distributed training t...

05/31/2018 · Minimax Learning for Remote Prediction
The classical problem of supervised learning is to infer an accurate pre...

09/17/2020 · Binarized Johnson-Lindenstrauss embeddings
We consider the problem of encoding a set of vectors into a minimal numb...