DNN gradient lossless compression: Can GenNorm be the answer?

11/15/2021
by Zhong-Jing Chen, et al.

In this paper, the problem of optimal lossless compression of gradients in Deep Neural Network (DNN) training is considered. Gradient compression is relevant in many distributed DNN training scenarios, including the recently popular federated learning (FL) scenario, in which each remote user is connected to the parameter server (PS) through a noiseless but rate-limited channel. In distributed DNN training, if the underlying gradient distribution is available, classical lossless compression approaches can be used to reduce the number of bits required for communicating the gradient entries. Mean-field analysis suggests that gradient updates can be treated as independent random variables, while the Laplace approximation can be used to argue that, in some regimes, the gradient has a distribution approximating the normal (Norm) distribution. In this paper we argue that, for some networks of practical interest, the gradient entries are well modelled as having a generalized normal (GenNorm) distribution. We provide numerical evaluations validating the hypothesis that GenNorm modelling yields a more accurate prediction of the DNN gradient tail distribution. Additionally, this modelling choice provides concrete improvements in the lossless compression of the gradients when classical fixed-to-variable lossless coding algorithms, such as Huffman coding, are applied to the quantized gradient updates. This latter result yields an effective compression strategy with low memory and computational complexity that has great practical relevance in distributed DNN training scenarios.
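
To make the approach concrete, the following is a minimal, self-contained sketch (not the authors' code) of the two ingredients described in the abstract: fitting a generalized normal (GenNorm) model against a plain normal (Norm) model on a batch of gradient entries, and Huffman-coding a uniformly quantized version of those entries to estimate the achievable lossless rate. The synthetic gradient sample, the GenNorm shape parameter used to generate it, and the quantization step are illustrative assumptions; in practice the entries would come from the backward pass of the network being trained.

# Sketch: GenNorm vs Norm fit on gradient entries, plus Huffman rate estimate.
# The gradients below are synthetic stand-ins for real DNN gradient entries.
import heapq
from collections import Counter

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic heavy-tailed "gradient" sample (beta < 2 gives heavier-than-Gaussian tails).
grads = stats.gennorm.rvs(beta=0.8, size=100_000, random_state=rng)

# (i) Fit both models and compare log-likelihoods.
beta_hat, loc_g, scale_g = stats.gennorm.fit(grads)
loc_n, scale_n = stats.norm.fit(grads)
ll_gennorm = stats.gennorm.logpdf(grads, beta_hat, loc_g, scale_g).sum()
ll_norm = stats.norm.logpdf(grads, loc_n, scale_n).sum()
print(f"GenNorm shape beta = {beta_hat:.2f}")
print(f"log-likelihood: GenNorm {ll_gennorm:.0f} vs Norm {ll_norm:.0f}")

# (ii) Uniformly quantize the gradients and Huffman-code the resulting symbols.
step = 0.05                                # quantization step (illustrative choice)
symbols = np.round(grads / step).astype(int)
freqs = Counter(symbols.tolist())

def huffman_lengths(freqs):
    """Return {symbol: codeword length} for a Huffman code built from `freqs`."""
    heap = [(f, i, [s]) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in freqs}
    counter = len(heap)                    # unique tie-breaker for heap ordering
    while len(heap) > 1:
        f1, _, syms1 = heapq.heappop(heap)
        f2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:            # every merge adds one bit to these symbols
            lengths[s] += 1
        heapq.heappush(heap, (f1 + f2, counter, syms1 + syms2))
        counter += 1
    return lengths

lengths = huffman_lengths(freqs)
n = symbols.size
avg_bits = sum(freqs[s] * lengths[s] for s in freqs) / n
probs = np.array([f / n for f in freqs.values()])
entropy = -(probs * np.log2(probs)).sum()
print(f"empirical entropy   : {entropy:.3f} bits/entry")
print(f"Huffman average rate: {avg_bits:.3f} bits/entry")

Running the sketch prints the fitted shape parameter (beta close to 2 recovers the Gaussian case, smaller beta indicates heavier tails) together with the empirical entropy and the Huffman average codeword length in bits per quantized gradient entry.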

Related research

02/06/2022 · Lossy Gradient Compression: How Much Accuracy Can One Bit Buy?
In federated learning (FL), a global model is trained at a Parameter Ser...

04/18/2022 · How to Attain Communication-Efficient DNN Training? Convert, Compress, Correct
In this paper, we introduce 𝖢𝖮_3, an algorithm for communication-efficie...

01/23/2023 · M22: A Communication-Efficient Algorithm for Federated Learning Inspired by Rate-Distortion
In federated learning (FL), the communication constraint between the rem...

10/31/2022 · L-GreCo: An Efficient and General Framework for Layerwise-Adaptive Gradient Compression
Data-parallel distributed training of deep neural networks (DNN) has gai...

04/11/2023 · Communication Efficient DNN Partitioning-based Federated Learning
Efficiently running federated learning (FL) on resource-constrained devi...

03/17/2022 · Convert, compress, correct: Three steps toward communication-efficient DNN training
In this paper, we introduce a novel algorithm, 𝖢𝖮_3, for communication-e...

10/18/2022 · Generalized Many-Body Dispersion Correction through Random-phase Approximation for Chemically Accurate Density Functional Theory
We extend our recently proposed Deep Learning-aided many-body dispersion...
