Logarithmic Unbiased Quantization: Practical 4-bit Training in Deep Learning

12/19/2021
by Brian Chmiel, et al.

Quantization of weights and activations is one of the main methods for reducing the computational footprint of Deep Neural Network (DNN) training. Current methods enable 4-bit quantization of the forward phase. However, this constitutes only a third of the training process. Reducing the computational footprint of the entire training process requires quantization of the neural gradients, i.e., the loss gradients with respect to the outputs of intermediate neural layers. In this work, we examine the importance of having unbiased quantization in quantized neural network training, where to maintain it, and how. Based on this, we suggest a logarithmic unbiased quantization (LUQ) method to quantize both the forward and backward phases to 4-bit, achieving state-of-the-art results in 4-bit training without overhead. For example, in ResNet50 on ImageNet, we achieved a degradation of 1.18%. We further improve this to a degradation of only 0.64% with high-precision fine-tuning combined with a variance reduction method; both add overhead comparable to previously suggested methods. Finally, we suggest a method that uses the low-precision format to avoid multiplications during two-thirds of the training process, thus reducing by 5x the area used by the multiplier.
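The key mechanism, an unbiased quantizer that rounds values onto a logarithmic (power-of-two) grid, can be illustrated with a short sketch. The snippet below is a hypothetical PyTorch illustration rather than the authors' LUQ code: the function name, scaling choice, bit allocation, and underflow handling are simplifying assumptions; it only shows how stochastic rounding between adjacent power-of-two levels keeps the quantizer unbiased, i.e., E[Q(x)] = x.

```python
import torch


def unbiased_log_quantize(x: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    """Sketch of unbiased logarithmic (power-of-two) quantization.

    Stochastic rounding between adjacent power-of-two levels keeps
    E[Q(x)] = x. This is an illustrative simplification, not the
    authors' LUQ implementation: scaling, bit allocation, and
    underflow handling here are assumptions.
    """
    sign = torch.sign(x)
    mag = x.abs()

    # One bit for the sign (assumed); the remaining bits index exponent levels.
    levels = 2 ** (num_bits - 1) - 1
    alpha = mag.max().clamp(min=1e-30)      # largest magnitude maps to exponent 0
    v_min = alpha * 2.0 ** (-levels)        # smallest representable level

    # Exponent e >= 0 such that mag ~= alpha * 2^(-e).
    e = torch.log2(alpha / mag.clamp(min=1e-30))
    e_lo = torch.floor(e)
    v_hi = alpha * 2.0 ** (-e_lo)           # nearest power-of-two level above
    v_lo = v_hi / 2.0                       # nearest power-of-two level below

    # Round up with probability p_up so the expected value equals the input.
    p_up = (mag - v_lo) / (v_hi - v_lo)
    q_mag = torch.where(torch.rand_like(mag) < p_up, v_hi, v_lo)

    # Magnitudes below the smallest level are stochastically pruned to zero
    # or to v_min, again preserving the expectation (E[q] = mag).
    keep_p = mag / v_min
    pruned = torch.where(torch.rand_like(mag) < keep_p, v_min,
                         torch.zeros_like(mag))
    q_mag = torch.where(mag < v_min, pruned, q_mag)

    return sign * q_mag
```

In this sketch, a quantizer of this kind would be applied to the neural gradients in the backward pass, while the weights and activations of the forward phase can use conventional 4-bit quantization, consistent with the abstract's framing of quantizing both phases to 4-bit.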


research · 08/14/2019
Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks
Hardware-friendly network quantization (e.g., binary/uniform quantizatio...

research · 10/27/2020
A Statistical Framework for Low-bitwidth Training of Deep Neural Networks
Fully quantized training (FQT), which uses low-bitwidth hardware by quan...

research · 04/09/2020
Dithered backprop: A sparse and quantized backpropagation algorithm for more efficient deep neural network training
Deep Neural Networks are successful but highly computationally expensive...

research · 05/25/2018
Scalable Methods for 8-bit Training of Neural Networks
Quantized Neural Networks (QNNs) are often used to improve network effic...

research · 02/01/2022
Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction
Memory footprint is one of the main limiting factors for large neural ne...

research · 10/12/2018
Quantization for Rapid Deployment of Deep Neural Networks
This paper aims at rapid deployment of the state-of-the-art deep neural ...

research · 12/24/2020
FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training
Recent breakthroughs in deep neural networks (DNNs) have fueled a tremen...
