Generalizable Mixed-Precision Quantization via Attribution Rank Preservation

08/05/2021
by   Ziwei Wang, et al.
0

In this paper, we propose a generalizable mixed-precision quantization (GMPQ) method for efficient inference. Conventional methods require the consistency of datasets for bitwidth search and model deployment to guarantee the policy optimality, leading to heavy search cost on challenging largescale datasets in realistic applications. On the contrary, our GMPQ searches the mixed-quantization policy that can be generalized to largescale datasets with only a small amount of data, so that the search cost is significantly reduced without performance degradation. Specifically, we observe that locating network attribution correctly is general ability for accurate visual analysis across different data distribution. Therefore, despite of pursuing higher model accuracy and complexity, we preserve attribution rank consistency between the quantized models and their full-precision counterparts via efficient capacity-aware attribution imitation for generalizable mixed-precision quantization strategy search. Extensive experiments show that our method obtains competitive accuracy-complexity trade-off compared with the state-of-the-art mixed-precision networks in significantly reduced search cost. The code is available at https://github.com/ZiweiWangTHU/GMPQ.git.

READ FULL TEXT

page 1

page 3

page 5

research
07/20/2023

EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization

Mixed-Precision Quantization (MQ) can achieve a competitive accuracy-com...
research
08/03/2022

PalQuant: Accelerating High-precision Networks on Low-precision Accelerators

Recently low-precision deep learning accelerators (DLAs) have become pop...
research
04/13/2023

Learning Accurate Performance Predictors for Ultrafast Automated Model Compression

In this paper, we propose an ultrafast automated model compression frame...
research
08/15/2023

EQ-Net: Elastic Quantization Neural Networks

Current model quantization methods have shown their promising capability...
research
02/14/2023

Searching Transferable Mixed-Precision Quantization Policy through Large Margin Regularization

Mixed-precision quantization (MPQ) suffers from time-consuming policy se...
research
04/07/2023

AutoQNN: An End-to-End Framework for Automatically Quantizing Neural Networks

Exploring the expected quantizing scheme with suitable mixed-precision p...
research
02/19/2023

Rethinking Data-Free Quantization as a Zero-Sum Game

Data-free quantization (DFQ) recovers the performance of quantized netwo...

Please sign up or login with your details

Forgot password? Click here to reset