Generative Low-bitwidth Data Free Quantization

03/07/2020
by Shoukai Xu, et al.

Neural network quantization is an effective way to compress deep models and reduce their execution latency and energy consumption, so that they can be deployed on mobile or embedded devices. Existing quantization methods require the original data for calibration or fine-tuning to achieve good performance. However, in many real-world scenarios the data may be unavailable due to confidentiality or privacy concerns, making these methods inapplicable. Moreover, in the absence of original data, the recently developed generative adversarial networks (GANs) cannot be applied to generate data. Although the full-precision model may contain the entire information of the data, such information alone is hard to exploit for recovering the original data or generating new meaningful data. In this paper, we investigate a simple yet effective method called Generative Low-bitwidth Data Free Quantization to remove this data dependence. Specifically, we propose a Knowledge Matching Generator that produces meaningful fake data by exploiting the classification boundary knowledge and distribution information contained in the pre-trained model. With the help of the generated data, we are able to quantize a model by learning knowledge from the pre-trained model. Extensive experiments on three datasets demonstrate the effectiveness of our method. More critically, our method achieves much higher accuracy under 4-bit quantization than the existing data-free quantization method.
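The abstract describes two coupled training procedures: a generator trained so that its fake data is classified confidently by the full-precision model (boundary knowledge) and matches the model's stored BatchNorm statistics (distribution information), and a low-bitwidth model trained on that fake data by distilling the full-precision model's predictions. Below is a minimal PyTorch-style sketch of these two objectives. It is an illustration under stated assumptions, not the authors' released implementation: the label-conditional `generator(z, y)` interface, the loss weight `0.1`, and the distillation temperature `T` are all placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def collect_bn_stats(model):
    """Hook every BatchNorm2d layer to record the batch mean/variance of its
    input next to the layer's stored running statistics (the 'distribution
    information' mentioned in the abstract)."""
    stats, handles = [], []

    def make_hook():
        def hook(module, inputs, output):
            x = inputs[0]
            stats.append((x.mean(dim=(0, 2, 3)),
                          x.var(dim=(0, 2, 3), unbiased=False),
                          module.running_mean,
                          module.running_var))
        return hook

    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            handles.append(m.register_forward_hook(make_hook()))
    return stats, handles

def generator_loss(generator, pretrained, batch_size, num_classes, z_dim=100):
    """Knowledge-matching generator objective: fake images should be
    classified as their sampled labels by the full-precision model
    (classification boundary knowledge) and reproduce its BN statistics."""
    z = torch.randn(batch_size, z_dim)
    y = torch.randint(num_classes, (batch_size,))
    fake = generator(z, y)                       # assumed label-conditional API

    stats, handles = collect_bn_stats(pretrained)
    logits = pretrained(fake)                    # pretrained stays frozen, in eval mode
    for h in handles:
        h.remove()

    ce = F.cross_entropy(logits, y)              # boundary knowledge
    bns = sum(F.mse_loss(mu, rm) + F.mse_loss(var, rv)
              for mu, var, rm, rv in stats)      # distribution matching
    return ce + 0.1 * bns                        # 0.1: assumed balancing weight

def quantized_loss(generator, pretrained, quantized, batch_size, num_classes, T=4.0):
    """Train the low-bitwidth model on generated data by distilling the
    full-precision model's predictions ('learning knowledge' from it)."""
    with torch.no_grad():
        z = torch.randn(batch_size, 100)
        y = torch.randint(num_classes, (batch_size,))
        fake = generator(z, y)
        teacher = pretrained(fake)
    student = quantized(fake)
    kd = F.kl_div(F.log_softmax(student / T, dim=1),
                  F.softmax(teacher / T, dim=1),
                  reduction='batchmean') * T * T
    return F.cross_entropy(student, y) + kd
```

In practice the pre-trained model stays frozen in eval mode throughout, and each optimizer updates only its own module: the generator for `generator_loss`, the low-bitwidth model for `quantized_loss`.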

Related research

10/26/2022 · Zero-Shot Learning of a Conditional Generative Adversarial Network for Data-Free Network Quantization
We propose a novel method for training a conditional generative adversar...

11/19/2020 · Learning in School: Multi-teacher Knowledge Inversion for Data-Free Quantization
User data confidentiality protection is becoming a rising challenge in t...

05/29/2023 · LLM-QAT: Data-Free Quantization Aware Training for Large Language Models
Several post-training quantization methods have been applied to large la...

01/24/2019 · QGAN: Quantized Generative Adversarial Networks
The intensive computation and memory requirements of generative adversar...

04/08/2022 · Data-Free Quantization with Accurate Activation Clipping and Adaptive Batch Normalization
Data-free quantization is a task that compresses the neural network to l...

11/04/2021 · Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples
Model quantization is known as a promising method to compress deep neura...

06/17/2020 · StatAssist GradBoost: A Study on Optimal INT8 Quantization-aware Training from Scratch
This paper studies the scratch training of quantization-aware training (...
