Training Multi-bit Quantized and Binarized Networks with a Learnable Symmetric Quantizer

04/01/2021
by Phuoc Pham, et al.

Quantizing the weights and activations of deep neural networks is essential for deploying them on resource-constrained devices or on cloud platforms for at-scale services. While binarization is a special case of quantization, this extreme case often entails several training difficulties and necessitates specialized models and training methods. As a result, recent quantization methods do not provide binarization, losing the most resource-efficient option, and quantized and binarized networks have remained distinct research areas. We examine binarization difficulties in a quantization framework and find that all we need to enable binary training is a symmetric quantizer, good initialization, and careful hyperparameter selection. These techniques also lead to substantial improvements in multi-bit quantization. We demonstrate our unified quantization framework, denoted as UniQ, on the ImageNet dataset with various architectures such as ResNet-18, ResNet-34, and MobileNetV2. For multi-bit quantization, UniQ outperforms existing methods and achieves state-of-the-art accuracy. In binarization, the achieved accuracy is comparable to that of existing state-of-the-art methods even without modifying the original architectures.
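The abstract does not spell out the quantizer itself, so below is a minimal PyTorch sketch of one plausible learnable symmetric quantizer whose one-bit case degenerates to binarization. The half-integer level grid, the straight-through estimator, and the `SymmetricQuantizer`/`round_ste` names are illustrative assumptions, not the authors' published formulation.

```python
import torch
import torch.nn as nn


def round_ste(x: torch.Tensor) -> torch.Tensor:
    """Round with a straight-through estimator: rounded values in the
    forward pass, identity gradient in the backward pass."""
    return x + (x.round() - x).detach()


class SymmetricQuantizer(nn.Module):
    """Illustrative learnable symmetric uniform quantizer.

    Levels sit at half-integer multiples of a learnable step size s:
        {-(2^(b-1) - 0.5)s, ..., -0.5s, +0.5s, ..., +(2^(b-1) - 0.5)s},
    i.e. symmetric about zero with no zero level. With bits=1 the grid
    collapses to {-0.5s, +0.5s}, so the same quantizer performs
    binarization without a special code path.
    """

    def __init__(self, bits: int, init_step: float = 0.1):
        super().__init__()
        # Learnable step size; the abstract stresses that good
        # initialization of such parameters matters for binary training.
        self.step = nn.Parameter(torch.tensor(float(init_step)))
        self.q_max = 2 ** (bits - 1) - 0.5  # largest half-integer index

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = self.step.abs() + 1e-8  # keep the step strictly positive
        # Snap x/s to the nearest half-integer, clamp to the symmetric
        # range, and rescale. Gradients reach both x and s through the
        # straight-through rounding, as in LSQ-style quantizers.
        k = torch.clamp(round_ste(x / s - 0.5) + 0.5, -self.q_max, self.q_max)
        return k * s


# Usage: bits=1 yields two-level (binary) outputs, bits=4 yields 16 levels.
w = torch.randn(6)
print(SymmetricQuantizer(bits=1)(w))  # entries are +/- 0.5 * step
print(SymmetricQuantizer(bits=4)(w))
```

The design point this sketch illustrates is that a mirror-symmetric level grid with no zero level leaves exactly two levels at one bit, whereas a conventional integer-level grid would collapse to a lopsided set such as {-s, 0} at the same bit-width.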


Related research

- 03/12/2021: Learnable Companding Quantization for Accurate Low-bit Neural Networks. Quantizing deep neural networks is an effective method for reducing memo...
- 12/20/2019: AdaBits: Neural Network Quantization with Adaptive Bit-Widths. Deep neural networks with adaptive configurations have gained increasing...
- 05/10/2021: In-Hindsight Quantization Range Estimation for Quantized Training. Quantization techniques applied to the inference of deep neural networks...
- 04/20/2020: LSQ+: Improving low-bit quantization through learnable offsets and better initialization. Unlike ReLU, newer activation functions (like Swish, H-swish, Mish) that...
- 06/26/2022: CTMQ: Cyclic Training of Convolutional Neural Networks with Multiple Quantization Steps. This paper proposes a training method having multiple cyclic training fo...
- 05/07/2021: Pareto-Optimal Quantized ResNet Is Mostly 4-bit. Quantization has become a popular technique to compress neural networks...
- 05/27/2021: Towards Efficient Full 8-bit Integer DNN Online Training on Resource-limited Devices without Batch Normalization. Huge computational costs brought by convolution and batch normalization...
