Efficient Quantization-aware Training with Adaptive Coreset Selection

06/12/2023
by Xijie Huang, et al.

The expanding model size and computation of deep neural networks (DNNs) have increased the demand for efficient model deployment methods. Quantization-aware training (QAT) is a representative model compression method that exploits redundancy in weights and activations. However, most existing QAT methods require end-to-end training on the entire dataset, which incurs long training time and high energy costs. Coreset selection, which improves data efficiency by exploiting redundancy in the training data, has also been widely used for efficient training. In this work, we propose a new angle: using coreset selection to improve the training efficiency of quantization-aware training. Based on the characteristics of QAT, we propose two metrics, the error vector score and the disagreement score, to quantify the importance of each sample during training. Guided by these two importance metrics, we propose a quantization-aware adaptive coreset selection (ACS) method to select the data for the current training epoch. We evaluate our method on various networks (ResNet-18, MobileNetV2), datasets (CIFAR-100, ImageNet-1K), and quantization settings. Compared with previous coreset selection methods, our method significantly improves QAT performance across different dataset fractions. Our method achieves an accuracy of 68.39% on the ImageNet-1K dataset with only a 10% subset, an absolute gain of 4.24% over the baseline.
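
To make the two importance metrics concrete, the sketch below (PyTorch-style, not the authors' released implementation) shows one plausible reading of the abstract: the error vector score as the norm of the difference between the quantized model's softmax output and the one-hot label, and the disagreement score as the divergence between full-precision and quantized predictions, with the highest-scoring fraction of samples retained for the current epoch. The exact score definitions, any weighting between them, and the function names (error_vector_score, disagreement_score, select_coreset) are illustrative assumptions, not the paper's specification.

```python
# Illustrative sketch of per-sample importance scoring for QAT coreset
# selection, assuming a standard PyTorch classification setup.
import torch
import torch.nn.functional as F

def error_vector_score(quant_logits, targets):
    # L2 norm of (softmax prediction - one-hot label) per sample,
    # an EL2N-style proxy for how hard each sample is for the quantized model.
    probs = F.softmax(quant_logits, dim=1)
    one_hot = F.one_hot(targets, num_classes=probs.size(1)).float()
    return (probs - one_hot).norm(p=2, dim=1)

def disagreement_score(quant_logits, fp_logits):
    # Per-sample KL(fp || quant): how much quantization changes the
    # prediction relative to the full-precision model on that sample.
    q_log_probs = F.log_softmax(quant_logits, dim=1)
    fp_probs = F.softmax(fp_logits, dim=1)
    return F.kl_div(q_log_probs, fp_probs, reduction="none").sum(dim=1)

def select_coreset(scores, fraction=0.1):
    # Keep the indices of the top-`fraction` most important samples
    # for the current training epoch.
    k = max(1, int(fraction * scores.numel()))
    return torch.topk(scores, k).indices
```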
