Loss-aware Weight Quantization of Deep Networks

02/23/2018
by   Lu Hou, et al.

The huge size of deep networks hinders their use in small computing devices. In this paper, we consider compressing the network by weight quantization. We extend a recently proposed loss-aware weight binarization scheme to ternarization, with possibly different scaling parameters for the positive and negative weights, and to m-bit (where m > 2) quantization. Experiments on feedforward and recurrent neural networks show that the proposed scheme outperforms state-of-the-art weight quantization algorithms, and is as accurate as (or even more accurate than) the full-precision network.
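To make the role of separate positive and negative scaling parameters concrete, below is a minimal NumPy sketch of one loss-aware ternarization step: each weight is mapped to {-alpha_n, 0, +alpha_p} so as to reduce a squared error weighted by a diagonal curvature estimate d (for instance, the second-moment estimates kept by an adaptive optimizer). The alternating update, the magnitude-based initialization, and the name ternarize_loss_aware are illustrative assumptions for this sketch, not the paper's exact procedure.

import numpy as np

def ternarize_loss_aware(w, d, n_iter=10):
    """Approximately minimize sum_i d_i * (w_i - q_i)^2 with q_i in {-alpha_n, 0, +alpha_p}."""
    w = np.asarray(w, dtype=np.float64)
    d = np.asarray(d, dtype=np.float64)
    # Initialize both scales with a common magnitude-based heuristic (an assumption of this sketch).
    alpha_p = alpha_n = 0.7 * float(np.mean(np.abs(w))) + 1e-12

    for _ in range(n_iter):
        # Step 1: for fixed scales, snap each weight to the nearest of
        # {-alpha_n, 0, +alpha_p}; the per-element factor d_i does not change this argmin.
        cand = np.array([-alpha_n, 0.0, alpha_p])
        q = cand[np.argmin((w[:, None] - cand[None, :]) ** 2, axis=1)]

        # Step 2: for the fixed assignment, the d-weighted least-squares scales
        # are weighted means over the positive and negative groups.
        pos, neg = q > 0, q < 0
        if pos.any():
            alpha_p = float(np.sum(d[pos] * w[pos]) / np.sum(d[pos]))
        if neg.any():
            alpha_n = float(-np.sum(d[neg] * w[neg]) / np.sum(d[neg]))

    # Return the quantized weights using the final scales and sign pattern.
    return np.where(q > 0, alpha_p, np.where(q < 0, -alpha_n, 0.0))

# Toy usage: uniform curvature estimates reduce this to plain ternarization with two scales.
w = np.array([0.9, -0.4, 0.05, -1.2, 0.6])
d = np.ones_like(w)
print(ternarize_loss_aware(w, d))

Weighting the error by d is what makes the step "loss-aware" in this sketch: weights whose perturbation would change the loss most are matched more closely, while the two scales alpha_p and alpha_n are free to differ when the positive and negative weights have different magnitudes.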

Related research

02/24/2022  Standard Deviation-Based Quantization for Deep Neural Networks
Quantization of deep neural networks is a promising approach that reduce...

11/05/2016  Loss-aware Binarization of Deep Networks
Deep neural network models, though very powerful and highly successful, ...

07/17/2018  Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Deep learning algorithms achieve high classification accuracy at the exp...

12/18/2019  Adaptive Loss-aware Quantization for Multi-bit Networks
We investigate the compression of deep neural networks by quantizing the...

11/13/2018  Iteratively Training Look-Up Tables for Network Quantization
Operating deep neural networks on devices with limited resources require...

05/30/2019  Quantization Loss Re-Learning Method
In order to quantize the gate parameters of the LSTM (Long Short-Term Me...

08/11/2020  PROFIT: A Novel Training Method for sub-4-bit MobileNet Models
4-bit and lower precision mobile models are required due to the ever-inc...
