Data Quality-aware Mixed-precision Quantization via Hybrid Reinforcement Learning

02/09/2023
by Yingchun Wang, et al.

Mixed-precision quantization typically predetermines the model's bit-width settings before actual training, because the bit-width sampling process is non-differentiable, and thus yields sub-optimal performance. Worse still, the conventional static, quality-consistent training setting, in which all data are assumed to be of the same quality across training and inference, overlooks the data-quality variation of real-world applications, which can hurt the robustness of the quantized models. In this paper, we propose a novel Data Quality-aware Mixed-precision Quantization framework, dubbed DQMQ, that dynamically adapts quantization bit-widths to different data qualities. The adaptation is driven by a bit-width decision policy that is learned jointly with the quantization training. Concretely, DQMQ is modeled as a hybrid reinforcement learning (RL) task that combines model-based policy optimization with supervised quantization training. By relaxing the discrete bit-width sampling to a continuous probability distribution encoded by a few learnable parameters, DQMQ becomes differentiable and can be optimized end-to-end with a hybrid objective that accounts for both task performance and quantization benefits. Trained on mixed-quality image datasets, DQMQ can implicitly select the most suitable bit-width for each layer when input quality varies. Extensive experiments on various benchmark datasets and networks demonstrate the superiority of DQMQ over existing fixed- and mixed-precision quantization methods.
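To make the relaxation concrete, here is a minimal PyTorch sketch of the general idea, not the authors' implementation: a per-layer categorical distribution over candidate bit-widths is made differentiable and trained end-to-end alongside quantization-aware training under a hybrid loss. The candidate set {2, 4, 8}, the Gumbel-softmax relaxation, the straight-through fake quantizer, the toy MLP, and the trade-off weight `lam` are all illustrative assumptions; the paper's policy network, reward design, and model-based RL components are not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

CANDIDATE_BITS = torch.tensor([2.0, 4.0, 8.0])  # assumed bit-width search space


def fake_quantize(x, bits):
    """Uniform symmetric fake quantization with a straight-through estimator.

    `bits` is a differentiable tensor so gradients can reach the
    bit-width policy through the quantization scale.
    """
    qmax = 2.0 ** (bits - 1.0) - 1.0
    scale = x.detach().abs().max().clamp(min=1e-8) / qmax
    q = torch.round(x / scale).clamp(-qmax, qmax) * scale
    # Forward uses q; backward passes identity gradients to x (STE)
    # and gradients to `bits` through `scale`.
    return q - x.detach() + x


class BitWidthPolicy(nn.Module):
    """A few learnable logits per layer encode a distribution over bit-widths."""

    def __init__(self, n_layers, n_choices=len(CANDIDATE_BITS)):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_layers, n_choices))

    def forward(self, tau=1.0):
        # Gumbel-softmax: one-hot samples in the forward pass,
        # soft (differentiable) probabilities in the backward pass.
        sample = F.gumbel_softmax(self.logits, tau=tau, hard=True)
        bits = sample @ CANDIDATE_BITS                      # sampled bits per layer
        expected_bits = self.logits.softmax(-1) @ CANDIDATE_BITS
        return bits, expected_bits


class QuantMLP(nn.Module):
    """Toy classifier whose per-layer weight bit-widths are learned jointly."""

    def __init__(self, dims=(784, 256, 10)):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Linear(a, b) for a, b in zip(dims, dims[1:]))
        self.policy = BitWidthPolicy(len(self.layers))

    def forward(self, x):
        bits, expected_bits = self.policy()
        for i, (layer, b) in enumerate(zip(self.layers, bits)):
            w = fake_quantize(layer.weight, b)              # quantize weights
            x = F.linear(x, w, layer.bias)
            if i < len(self.layers) - 1:
                x = F.relu(x)
        return x, expected_bits


model = QuantMLP()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))   # stand-in batch

out, expected_bits = model(x)
lam = 0.01  # assumed weight trading task loss against bit cost
loss = F.cross_entropy(out, y) + lam * expected_bits.mean()
loss.backward()  # one backward pass updates the weights AND the bit-width policy
opt.step()
```

Note the design choice: with `hard=True`, the forward pass stays discrete (each layer really runs at one of the candidate bit-widths) while gradients flow through the soft distribution, which is one common way to realize the differentiable bit-width sampling the abstract describes.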

