DEED: A General Quantization Scheme for Communication Efficiency in Bits

06/19/2020
by   Tian Ye, et al.

In distributed optimization, a popular technique for reducing communication is quantization. In this paper, we provide a general analysis framework for inexact gradient descent that is applicable to quantization schemes. We also propose a quantization scheme, Double Encoding and Error Diminishing (DEED). DEED achieves small communication complexity in three settings: frequent-communication large-memory, frequent-communication small-memory, and infrequent-communication (e.g. federated learning). More specifically, in the frequent-communication large-memory setting, DEED can be easily combined with Nesterov's method, so that the total number of bits required is Õ( √(κ) log 1/ϵ ), where Õ hides numerical constants and log κ factors. In the frequent-communication small-memory setting, DEED combined with SGD only requires Õ( κ log 1/ϵ ) bits in the interpolation regime. In the infrequent-communication setting, DEED combined with Federated Averaging requires fewer total bits than plain Federated Averaging. All these algorithms converge at the same rate as their non-quantized versions, while using fewer bits.
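As a rough illustration of the kind of scheme covered by an inexact gradient descent analysis, the sketch below runs gradient descent with low-bit uniformly quantized gradients and an error-feedback residual on a toy quadratic. This is a minimal sketch under assumed details, not the DEED algorithm from the paper; the quantizer, the residual heuristic, and all names and parameters (`uniform_quantize`, `quantized_gd`, `bits`, `lr`) are illustrative.

```python
# Generic sketch: gradient descent with quantized gradients and error feedback.
# NOT the DEED algorithm; only illustrates the compress-before-communicate pattern.
import numpy as np

def uniform_quantize(v, bits=4):
    """Uniform quantization of a vector to 2**bits levels over its own range."""
    levels = 2 ** bits - 1
    lo, hi = v.min(), v.max()
    if hi == lo:                      # constant vector: nothing to quantize
        return v.copy()
    scale = (hi - lo) / levels
    q = np.round((v - lo) / scale)    # integer codes a worker would transmit
    return lo + q * scale             # de-quantized value used by the receiver

def quantized_gd(grad_fn, x0, lr=0.1, bits=4, iters=200):
    """Gradient descent where each step uses a quantized gradient plus a
    residual that carries over the quantization error (error feedback)."""
    x = x0.copy()
    residual = np.zeros_like(x0)
    for _ in range(iters):
        g = grad_fn(x) + residual          # add back previously dropped error
        g_hat = uniform_quantize(g, bits)  # low-bit message actually "sent"
        residual = g - g_hat               # remember what the quantizer lost
        x = x - lr * g_hat
    return x

if __name__ == "__main__":
    # Strongly convex toy problem: f(x) = 0.5 * ||A x - b||^2
    rng = np.random.default_rng(0)
    A = rng.standard_normal((20, 5))
    b = rng.standard_normal(20)
    grad = lambda x: A.T @ (A @ x - b)
    x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
    x_hat = quantized_gd(grad, np.zeros(5), lr=0.01, bits=4, iters=500)
    print("distance to optimum:", np.linalg.norm(x_hat - x_star))
```

With only 4 bits per gradient coordinate, the iterate still approaches the least-squares optimum because the residual prevents quantization error from accumulating; this is the informal sense in which convergence can match the unquantized method while communicating far fewer bits.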


