Hybrid and Non-Uniform quantization methods using retro synthesis data for efficient inference

12/26/2020
by Tej pratap GVSL, et al.

Existing quantization-aware training methods attempt to compensate for quantization loss by leveraging the training data, as do most post-training quantization methods, and are also time consuming. Neither approach is effective for privacy-constrained applications, since both are tightly coupled to the training data. In contrast, this paper proposes a data-independent post-training quantization scheme that eliminates the need for training data altogether. This is achieved by generating a faux dataset, hereafter referred to as Retro-Synthesis Data, from the FP32 model's layer statistics and then using it for quantization. This approach outperformed state-of-the-art methods, including ZeroQ and DFQ, on models with and without Batch-Normalization layers for 8-, 6-, and 4-bit precisions on the ImageNet and CIFAR-10 datasets. We also introduce two further variants of post-training quantization, namely Hybrid Quantization and Non-Uniform Quantization.
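
The abstract does not spell out how Retro-Synthesis Data is produced. The sketch below is a minimal illustration under the assumption that the faux data is generated by a BatchNorm-statistics-matching objective in the spirit of ZeroQ's distilled data: random inputs are optimized so the per-layer activation statistics match each layer's stored running mean and variance, and the synthesized batch is then used to calibrate quantization ranges. The function names `synthesize_batch` and `calibrate_scale`, and all hyperparameters, are illustrative, not the paper's method.

```python
# Hypothetical sketch of data-free calibration from FP32 layer statistics.
# Assumes a ZeroQ-style objective (not necessarily the paper's): optimize
# random inputs so the batch statistics entering each BatchNorm layer match
# its running mean/variance, then use the batch to pick quantization ranges.
import torch
import torch.nn as nn
from torchvision.models import resnet18


def synthesize_batch(model, batch_size=8, steps=100, lr=0.1, device="cpu"):
    """Generate a faux calibration batch from BatchNorm running statistics."""
    model = model.to(device).eval()
    for p in model.parameters():          # only the input batch is optimized
        p.requires_grad_(False)
    x = torch.randn(batch_size, 3, 224, 224, device=device, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)

    losses = []  # per-BN statistic-matching terms, refilled each forward pass

    def make_hook(bn):
        def hook(module, inputs, output):
            act = inputs[0]
            mean = act.mean(dim=(0, 2, 3))
            var = act.var(dim=(0, 2, 3), unbiased=False)
            losses.append((mean - bn.running_mean).pow(2).mean()
                          + (var - bn.running_var).pow(2).mean())
        return hook

    hooks = [m.register_forward_hook(make_hook(m))
             for m in model.modules() if isinstance(m, nn.BatchNorm2d)]

    for _ in range(steps):
        losses.clear()
        opt.zero_grad()
        model(x)
        torch.stack(losses).sum().backward()
        opt.step()

    for h in hooks:
        h.remove()
    return x.detach()


def calibrate_scale(tensor, num_bits=8):
    """Symmetric per-tensor quantization scale from the observed range."""
    qmax = 2 ** (num_bits - 1) - 1
    return tensor.abs().max().clamp(min=1e-8) / qmax


if __name__ == "__main__":
    model = resnet18(weights="IMAGENET1K_V1")     # pretrained FP32 model
    retro_data = synthesize_batch(model, steps=50)
    out_scale = calibrate_scale(model(retro_data))
    print("example 8-bit scale for the output tensor:", out_scale.item())
```

The same synthesized batch can stand in for real calibration data wherever a post-training quantizer needs activation ranges, which is what makes the scheme attractive for privacy-constrained deployments.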
