LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks

07/26/2018
by Dongqing Zhang, et al.

Although weight and activation quantization is an effective approach for Deep Neural Network (DNN) compression, and has considerable potential to increase inference speed by leveraging bit operations, there is still a noticeable gap in prediction accuracy between the quantized model and the full-precision model. To address this gap, we propose to jointly train a quantized, bit-operation-compatible DNN and its associated quantizers, as opposed to using fixed, handcrafted quantization schemes such as uniform or logarithmic quantization. Our method for learning the quantizers applies to both network weights and activations at arbitrary bit precision, and our quantizers are easy to train. Comprehensive experiments on the CIFAR-10 and ImageNet datasets show that our method works consistently well for various network structures such as AlexNet, VGG-Net, GoogLeNet, ResNet, and DenseNet, surpassing previous quantization methods in accuracy by an appreciable margin. Code is available at https://github.com/Microsoft/LQ-Nets
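
For intuition, below is a minimal NumPy sketch of the learned-quantizer idea the abstract describes: quantization levels are formed as linear combinations of K learned basis coefficients with binary codes, and the basis is refit while the codes are held fixed. The function names (lq_quantize, update_basis), the exhaustive codeword enumeration, and the simple alternating loop are illustrative assumptions for this sketch, not the authors' reference implementation; see the linked repository for their code.

```python
import itertools
import numpy as np

def lq_quantize(x, basis):
    """Quantize x to the nearest level spanned by a learned basis.

    Levels are all 2^K linear combinations sum_k b_k * v_k with
    binary codes b_k in {-1, +1} (a weight-style quantizer).
    """
    K = len(basis)
    # Enumerate all 2^K codewords; fine for small K (e.g. K <= 4 bits).
    codes = np.array(list(itertools.product([-1.0, 1.0], repeat=K)))  # (2^K, K)
    levels = codes @ basis                                            # (2^K,)
    # Nearest-level assignment for every element of x.
    idx = np.abs(x[..., None] - levels).argmin(axis=-1)
    return levels[idx], codes[idx]

def update_basis(x, codes):
    """Refit the basis by least squares with the codes fixed
    (an illustrative stand-in for the paper's quantization-error
    minimization step)."""
    B = codes.reshape(-1, codes.shape[-1])  # (N, K) binary code matrix
    t = x.reshape(-1)                       # (N,) full-precision targets
    v, *_ = np.linalg.lstsq(B, t, rcond=None)
    return v

# Toy usage: alternate code assignment and basis refit on random weights.
rng = np.random.default_rng(0)
w = rng.normal(size=1000)
basis = np.array([0.5, 0.25])               # 2-bit quantizer, initial basis
for _ in range(5):
    _, codes = lq_quantize(w, basis)
    basis = update_basis(w, codes)
wq, _ = lq_quantize(w, basis)
print("learned basis:", basis, "MSE:", np.mean((w - wq) ** 2))
```

The key design point this sketch illustrates is why the scheme stays bit-operation compatible: because every quantized value decomposes into binary codes weighted by a shared basis, inner products between quantized tensors reduce to bitwise operations plus a few scalar multiplies.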

Related Research

12/23/2021 · Training Quantized Deep Neural Networks via Cooperative Coevolution
This work considers a challenging Deep Neural Network (DNN) quantization...

02/07/2020 · Switchable Precision Neural Networks
Instantaneous and on demand accuracy-efficiency trade-off has been recen...

11/24/2021 · Sharpness-aware Quantization for Deep Neural Networks
Network quantization is an effective compression method to reduce the mo...

08/15/2023 · EQ-Net: Elastic Quantization Neural Networks
Current model quantization methods have shown their promising capability...

01/15/2023 · RedBit: An End-to-End Flexible Framework for Evaluating the Accuracy of Quantized CNNs
In recent years, Convolutional Neural Networks (CNNs) have become the st...

10/31/2022 · Model Compression for DNN-Based Text-Independent Speaker Verification Using Weight Quantization
DNN-based models achieve high performance in the speaker verification (S...

06/09/2020 · Neural Network Activation Quantization with Bitwise Information Bottlenecks
Recent researches on information bottleneck shed new light on the contin...
