Distributed Learning and Democratic Embeddings: Polynomial-Time Source Coding Schemes Can Achieve Minimax Lower Bounds for Distributed Gradient Descent under Communication Cons

by Rajarshi Saha, et al.

In this work, we consider the distributed optimization setting where information exchange between the computation nodes and the parameter server is subject to a maximum bit-budget. We first consider the problem of compressing a vector in the n-dimensional Euclidean space, subject to a bit-budget of R bits per dimension, for which we introduce Democratic and Near-Democratic source-coding schemes. We show that these coding schemes are (near) optimal in the sense that the covering efficiency of the resulting quantizer is either dimension independent or has only a very weak logarithmic dependence on the dimension. Subsequently, we propose a distributed optimization algorithm, DGD-DEF, which employs our proposed coding strategy and achieves the minimax optimal convergence rate to within (near) constant factors for a class of communication-constrained distributed optimization algorithms. Furthermore, we extend the utility of our proposed source coding scheme by showing that it can remarkably improve the performance when used in conjunction with other compression schemes. We validate our theoretical claims through numerical simulations.

Keywords: Fast democratic (Kashin) embeddings, Distributed optimization, Data-rate constraint, Quantized gradient descent, Error feedback.
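To make the compression idea concrete, here is a minimal sketch of quantizing a vector after spreading its energy with an orthonormal transform. It is an illustration under stated assumptions, not the paper's exact construction: the square random orthonormal matrix `S`, the mid-point uniform quantizer, and the function names (`random_orthonormal`, `uniform_quantize`, `compress_with_embedding`) are all hypothetical choices made here, and the scale factor is assumed to be sent separately at full precision.

```python
import numpy as np


def random_orthonormal(n, rng):
    """Assumed transform: a random n x n orthonormal matrix via QR."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q


def uniform_quantize(v, bits):
    """Quantize each coordinate to one of 2**bits cell mid-points on
    [-scale, scale], where scale is the infinity norm of v."""
    scale = np.max(np.abs(v))
    if scale == 0.0:
        return v.copy()
    levels = 2 ** bits
    step = 2.0 / levels
    # Index of the cell containing each normalized coordinate.
    idx = np.clip(np.floor((v / scale + 1.0) / step), 0, levels - 1)
    return scale * (-1.0 + (idx + 0.5) * step)


def compress_with_embedding(x, S, bits):
    """Embed x as y = S^T x (so x = S y for orthonormal S), quantize y
    coordinate-wise, and map the quantized embedding back."""
    y = S.T @ x
    y_hat = uniform_quantize(y, bits)
    return S @ y_hat


rng = np.random.default_rng(0)
n, bits = 64, 3
x = np.zeros(n)
x[0] = 1.0  # a "spiky" vector, the hard case for direct scalar quantization
S = random_orthonormal(n, rng)
x_hat = compress_with_embedding(x, S, bits)
print("error with embedding:   ", np.linalg.norm(x - x_hat))
print("error without embedding:", np.linalg.norm(x - uniform_quantize(x, bits)))
```

The intuition the sketch tries to convey is that the transform makes the coordinates of `y` comparable in magnitude, so a fixed per-coordinate bit budget wastes less dynamic range than it would on the raw vector.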




On Distributed Learning with Constant Communication Bits

In this paper, we study a distributed learning problem constrained by co...

Approximate Gradient Coding with Optimal Decoding

In distributed optimization problems, a technique called gradient coding...

Communication-Efficient Distributed Optimization with Quantized Preconditioners

We investigate fast and communication-efficient algorithms for the class...

A Near-Optimal Algorithm for Univariate Zeroth-Order Budget Convex Optimization

This paper studies a natural generalization of the problem of minimizing...

Achieving the fundamental convergence-communication tradeoff with Differentially Quantized Gradient Descent

The problem of reducing the communication cost in distributed training t...

Fundamental limits of over-the-air optimization: Are analog schemes optimal?

We consider over-the-air convex optimization on a d-dimensional space wh...

Minimax Learning for Remote Prediction

The classical problem of supervised learning is to infer an accurate pre...
