Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)

07/17/2018
by Jungwook Choi, et al.

Deep learning algorithms achieve high classification accuracy at the expense of significant computation cost. To reduce this cost, several quantization schemes have gained attention recently, some focusing on weight quantization and others on activation quantization. This paper proposes novel techniques that target weight and activation quantization separately, resulting in an overall quantized neural network (QNN). The activation quantization technique, PArameterized Clipping acTivation (PACT), uses an activation clipping parameter α that is optimized during training to find the right quantization scale. The weight quantization scheme, statistics-aware weight binning (SAWB), finds the optimal scaling factor that minimizes the quantization error based on the statistical characteristics of the weight distribution, without the need for an exhaustive search. The combination of PACT and SAWB results in a 2-bit QNN that achieves state-of-the-art classification accuracy (comparable to full-precision networks) across a range of popular models and datasets.
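
To make the two ideas concrete, below is a minimal PyTorch sketch of a PACT-style activation quantizer and a SAWB-style weight scale. The names PACTQuantizer, sawb_scale, and quantize_weights, the initial value of alpha, and the coefficients c1 and c2 are illustrative assumptions, not the authors' reference implementation; SAWB's actual coefficients are bit-width dependent and reported in the paper.

import torch
import torch.nn as nn


class PACTQuantizer(nn.Module):
    """PACT-style activation quantizer: clip at a learnable alpha, then
    quantize uniformly to 2**bits levels with a straight-through estimator."""

    def __init__(self, bits=2, alpha_init=10.0):
        super().__init__()
        self.bits = bits
        # alpha is a trainable parameter, optimized jointly with the weights.
        self.alpha = nn.Parameter(torch.tensor(alpha_init))

    def forward(self, x):
        # Clip to [0, alpha]; written so that gradients also reach alpha.
        y = torch.clamp(x, min=0.0) - torch.clamp(x - self.alpha, min=0.0)
        # Uniform quantization of the clipped range.
        scale = (2 ** self.bits - 1) / self.alpha
        y_q = torch.round(y * scale) / scale
        # Straight-through estimator: forward pass uses y_q, backward uses y.
        return y + (y_q - y).detach()


def sawb_scale(w, c1=3.2, c2=-2.1):
    """SAWB-style scaling factor from the first and second moments of the
    weights. The coefficients c1 and c2 are bit-width dependent; the values
    here are placeholders, not the ones reported in the paper."""
    return c1 * w.pow(2).mean().sqrt() + c2 * w.abs().mean()


def quantize_weights(w, bits=2):
    """Symmetric uniform weight quantization using the SAWB scale.
    For bits=2 the levels are {-alpha, -alpha/3, +alpha/3, +alpha}."""
    alpha = sawb_scale(w)
    step = 2 * alpha / (2 ** bits - 1)
    w_c = torch.clamp(w, -alpha, alpha)
    w_q = torch.round((w_c + alpha) / step) * step - alpha
    # Straight-through estimator for the rounding step.
    return w_c + (w_q - w_c).detach()

The key design point is that both quantizers keep the expensive decision (where to clip or scale) out of the inner loop: PACT learns alpha by gradient descent, while SAWB computes the weight scale in closed form from simple statistics rather than by exhaustive search.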

Related research

PACT: Parameterized Clipping Activation for Quantized Neural Networks (05/16/2018)
Deep learning algorithms achieve high classification accuracy at the exp...

Efficient Activation Quantization via Adaptive Rounding Border for Post-Training Quantization (08/25/2022)
Post-training quantization (PTQ) attracts increasing attention due to it...

Oscillation-free Quantization for Low-bit Vision Transformers (02/04/2023)
Weight oscillation is an undesirable side effect of quantization-aware t...

Quantized Neural Networks: Characterization and Holistic Optimization (05/31/2020)
Quantized deep neural networks (QDNNs) are necessary for low-power, high...

Towards Efficient Training for Neural Network Quantization (12/21/2019)
Quantization reduces computation costs of neural networks but suffers fr...

Why Quantization Improves Generalization: NTK of Binary Weight Neural Networks (06/13/2022)
Quantized neural networks have drawn a lot of attention as they reduce t...

Loss-aware Weight Quantization of Deep Networks (02/23/2018)
The huge size of deep networks hinders their use in small computing devi...
