Adaptive Loss-aware Quantization for Multi-bit Networks

12/18/2019
by   Zhongnan Qu, et al.

We investigate the compression of deep neural networks by quantizing their weights and activations into multiple binary bases, known as multi-bit networks (MBNs), which accelerates inference and reduces storage for deployment on low-resource mobile and embedded platforms. We propose Adaptive Loss-aware Quantization (ALQ), a new MBN quantization pipeline that achieves an average bitwidth below one bit without notable loss in inference accuracy. Unlike previous MBN quantization solutions that train a quantizer by minimizing the error of reconstructing full-precision weights, ALQ directly minimizes the quantization-induced error on the loss function, involving neither gradient approximation nor full-precision calculations. ALQ also exploits strategies including adaptive bitwidth, smooth bitwidth reduction, and iterative trained quantization to allow a smaller network size without loss in accuracy. Experimental results on popular image datasets show that ALQ outperforms state-of-the-art compressed networks in terms of both storage and accuracy.
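For intuition on the multi-bit representation the abstract refers to, the sketch below approximates a full-precision weight vector w as a weighted sum of binary bases, w ≈ Σ_i α_i β_i with β_i ∈ {-1, +1}^n. It uses a simple greedy residual binarization that minimizes reconstruction error; this is only an illustrative baseline under assumed names (greedy_multibit_quantize, num_bases), not ALQ's actual procedure, which optimizes the bases and coordinates directly against the training loss and adapts the bitwidth per weight group.

```python
import numpy as np

def greedy_multibit_quantize(w, num_bases):
    """Approximate w by sum_i alpha_i * beta_i with beta_i in {-1, +1}^n.

    Greedy residual binarization: each new binary base fits the residual left
    by the previous ones. This is a reconstruction-error baseline, NOT ALQ's
    loss-aware optimizer; it only illustrates the multi-bit weight form.
    """
    residual = np.asarray(w, dtype=np.float64).copy()
    alphas, betas = [], []
    for _ in range(num_bases):
        beta = np.where(residual >= 0, 1.0, -1.0)   # binary base in {-1, +1}
        alpha = np.abs(residual).mean()             # least-squares scale for this base
        alphas.append(alpha)
        betas.append(beta)
        residual = residual - alpha * beta          # remaining quantization error
    return np.array(alphas), np.stack(betas)

# Toy usage: a 2-base (2-bit) multi-bit approximation of a small weight vector.
w = np.array([0.42, -0.17, 0.08, -0.55])
alphas, betas = greedy_multibit_quantize(w, num_bases=2)
w_hat = (alphas[:, None] * betas).sum(axis=0)
print("original:      ", w)
print("reconstruction:", w_hat)
```

Because ALQ adapts the number of binary bases per weight group (rather than using a fixed num_bases as above), some groups can be assigned very few bases, which is presumably how the reported average bitwidth below one bit becomes attainable.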

