DNQ: Dynamic Network Quantization

12/06/2018
by Yuhui Xu, et al.

Network quantization is an effective method for deploying neural networks on memory- and energy-constrained mobile devices. In this paper, we propose a Dynamic Network Quantization (DNQ) framework composed of two modules: a bit-width controller and a quantizer. Unlike most existing quantization methods, which use a single bit-width for the whole network, the bit-width controller uses policy gradient to train an agent that learns a bit-width for each layer. This controller can trade off accuracy against compression ratio. Given the resulting bit-width sequence, the quantizer uses the quantization distance as the criterion for weight importance during quantization. We extensively validate the proposed approach on various mainstream neural networks and obtain impressive results.
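To make the idea of per-layer bit-widths concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not specify the quantizer or how quantization distance is defined, so the sketch assumes a symmetric uniform quantizer and takes the distance to be the absolute difference between a weight and its quantized value. The function names, the toy layers, and the fixed `bit_widths` (standing in for the controller's output) are all hypothetical.

```python
def quantize_uniform(weights, bits):
    """Assumed quantizer: symmetric uniform grid with 2**(bits-1) - 1 positive levels."""
    if bits < 2:
        raise ValueError("bits must be >= 2 for a signed symmetric grid")
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits, 1 for 2 bits
    scale = max_abs / levels              # spacing between adjacent grid points
    return [round(w / scale) * scale for w in weights]


def quantization_distance(weights, quantized):
    """Assumed definition: per-weight absolute gap to the quantized value."""
    return [abs(w - q) for w, q in zip(weights, quantized)]


def compression_ratio(layer_sizes, bit_widths, full_bits=32):
    """Storage ratio of full-precision weights to mixed-precision weights."""
    original = sum(n * full_bits for n in layer_sizes)
    compressed = sum(n * b for n, b in zip(layer_sizes, bit_widths))
    return original / compressed


# Toy network: two layers with hand-picked bit-widths (hypothetical values;
# in DNQ these would come from the learned bit-width controller).
layers = {"conv1": [0.50, -0.25, 0.10, -1.00], "fc": [0.03, -0.40, 0.75, 0.20]}
bit_widths = {"conv1": 4, "fc": 2}

for name, weights in layers.items():
    quantized = quantize_uniform(weights, bit_widths[name])
    distances = quantization_distance(weights, quantized)
    print(name, bit_widths[name], quantized, max(distances))

sizes = [len(w) for w in layers.values()]
print("compression ratio:", compression_ratio(sizes, list(bit_widths.values())))
```

Lowering a layer's bit-width raises the compression ratio but coarsens the grid, which increases the quantization distance of its weights; this is the accuracy versus compression trade-off the controller is described as balancing.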

