EQ-Net: Elastic Quantization Neural Networks

08/15/2023
by Ke Xu, et al.

Current model quantization methods have shown promising capability in reducing storage space and computation complexity. However, due to the diversity of quantization forms supported by different hardware, one limitation of existing solutions is that they usually require repeated optimization for different scenarios. How to construct a model with flexible quantization forms has been less studied. In this paper, we explore a one-shot network quantization regime, named Elastic Quantization Neural Networks (EQ-Net), which aims to train a robust weight-sharing quantization supernet. First, we propose an elastic quantization space (covering elastic bit-width, granularity, and symmetry) to adapt to various mainstream quantization forms. Second, we propose a Weight Distribution Regularization Loss (WDR-Loss) and a Group Progressive Guidance Loss (GPG-Loss) to bridge the distribution inconsistency of weights and output logits across the elastic quantization space. Finally, we incorporate genetic algorithms and the proposed Conditional Quantization-Aware Accuracy Predictor (CQAP) as an estimator to quickly search for mixed-precision quantized neural networks within the supernet. Extensive experiments demonstrate that EQ-Net is close to, or even better than, its static counterparts as well as state-of-the-art robust bit-width methods. Code is available at https://github.com/xuke225/EQ-Net.
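To make the elastic quantization space concrete, the sketch below shows a minimal PyTorch fake-quantizer whose bit-width, granularity (per-tensor vs. per-channel), and symmetry can be switched at runtime, the way a weight-sharing supernet would when sampling configurations. The names (`ElasticQuantizer`, `set_config`) and the straight-through rounding are illustrative assumptions, not the authors' actual implementation.

```python
# A minimal sketch (an assumption, not the paper's code) of a fake-quantizer
# whose quantization form can be reconfigured at runtime.
import torch
import torch.nn as nn


class ElasticQuantizer(nn.Module):
    """Fake-quantizes a tensor under an elastic (bit-width, granularity,
    symmetry) configuration sampled from the elastic quantization space."""

    def __init__(self, num_channels: int):
        super().__init__()
        # One (scale, zero-point) pair per channel; per-tensor mode uses
        # only the first entry. A full implementation would also train
        # these (e.g. LSQ-style); that gradient path is omitted here.
        self.scale = nn.Parameter(torch.ones(num_channels))
        self.zero_point = nn.Parameter(torch.zeros(num_channels))
        self.set_config(bits=8, per_channel=True, symmetric=True)

    def set_config(self, bits: int, per_channel: bool, symmetric: bool):
        # Switch the active quantization form, e.g. when the supernet
        # samples a new sub-network during training or search.
        self.bits, self.per_channel, self.symmetric = bits, per_channel, symmetric

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.symmetric:
            qmin, qmax = -(2 ** (self.bits - 1)), 2 ** (self.bits - 1) - 1
            zero = torch.zeros_like(self.zero_point)
        else:
            qmin, qmax = 0, 2 ** self.bits - 1
            zero = self.zero_point
        if self.per_channel:
            # Broadcast per-channel parameters over the remaining dims
            # (assumes the channel axis is dim 0, as for conv weights).
            shape = (-1,) + (1,) * (x.dim() - 1)
            scale, zero = self.scale.view(shape), zero.view(shape)
        else:
            scale, zero = self.scale[0], zero[0]
        # Straight-through estimator: quantize in the forward pass, pass
        # gradients through to x unchanged in the backward pass.
        q = torch.clamp(torch.round(x / scale + zero), qmin, qmax)
        return x + ((q - zero) * scale - x).detach()


# Example: one shared weight tensor evaluated under two quantization forms.
quant = ElasticQuantizer(num_channels=64)
w = torch.randn(64, 3, 3, 3)
w_int8 = quant(w)  # 8-bit, per-channel, symmetric
quant.set_config(bits=4, per_channel=False, symmetric=False)
w_int4 = quant(w)  # 4-bit, per-tensor, asymmetric
```

In a weight-sharing supernet, many such configurations reuse the same underlying weights; reconciling their differing statistics is exactly the mismatch that WDR-Loss and GPG-Loss are meant to mitigate.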
