Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss

09/05/2021
by Jung Hyun Lee, et al.

Network quantization, which aims to reduce the bit-lengths of network weights and activations, has emerged as a key technique for deploying models on resource-limited devices. Although recent studies have successfully discretized full-precision networks, they still incur large quantization errors after training, giving rise to a significant performance gap between a full-precision network and its quantized counterpart. In this work, we propose a novel quantization method for neural networks, Cluster-Promoting Quantization (CPQ), which finds the optimal quantization grids while naturally encouraging the underlying full-precision weights to gather cohesively around those grids during training. This property of CPQ stems from two main ingredients that enable differentiable quantization: i) the use of a categorical distribution defined by a specific probabilistic parametrization in the forward pass, and ii) our proposed multi-class straight-through estimator (STE) in the backward pass. Since the multi-class STE is intrinsically biased, we additionally propose a new bit-drop technique, DropBits, which revises standard dropout regularization to randomly drop bits instead of neurons. As a natural extension of DropBits, we further introduce a way of learning heterogeneous quantization levels, finding the proper bit-length for each layer by imposing an additional regularization on DropBits. We experimentally validate our method on various benchmark datasets and network architectures, and also support a new hypothesis for quantization: learning heterogeneous quantization levels outperforms using the same but fixed quantization levels from scratch.
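The abstract names two mechanisms, a straight-through estimator for the quantizer and a DropBits-style random dropping of bits during training, that can be illustrated with a short PyTorch sketch. Everything below (the `_UniformQuantSTE` class, `quantize_with_dropbits`, and all parameter choices) is an illustrative assumption rather than the authors' code; in particular, the paper's multi-class STE and probabilistic parametrization are stood in for here by a plain uniform quantizer with an identity-gradient STE.

```python
# Minimal sketch, assuming a uniform quantizer with an identity-gradient STE
# and a DropBits-like random bit-width reduction; not the paper's actual CPQ.
import torch


class _UniformQuantSTE(torch.autograd.Function):
    """Uniform quantization with a straight-through gradient estimator."""

    @staticmethod
    def forward(ctx, w, num_bits):
        # Map weights to [0, 1], snap to a grid of 2^b levels, map back.
        levels = 2 ** num_bits - 1
        w_min, w_max = w.min(), w.max()
        scale = (w_max - w_min).clamp(min=1e-8)
        w01 = (w - w_min) / scale
        w_q = torch.round(w01 * levels) / levels
        return w_q * scale + w_min

    @staticmethod
    def backward(ctx, grad_output):
        # STE: treat the rounding step as the identity when backpropagating.
        return grad_output, None


def quantize_with_dropbits(w, num_bits=4, min_bits=2, drop_prob=0.5, training=True):
    """Quantize `w`, occasionally with fewer bits (a DropBits-like perturbation)."""
    bits = num_bits
    if training and torch.rand(()) < drop_prob:
        # Randomly "drop bits": use a lower bit-width for this forward pass.
        bits = int(torch.randint(min_bits, num_bits, (1,)).item())
    return _UniformQuantSTE.apply(w, bits)


if __name__ == "__main__":
    w = torch.randn(8, requires_grad=True)
    w_q = quantize_with_dropbits(w, num_bits=4)
    w_q.sum().backward()
    print(w_q)
    print(w.grad)  # all ones: gradients pass straight through the quantizer
```

Randomly lowering the bit-width on some forward passes acts as a regularizer in the spirit of dropout, which is the intuition the abstract gives for DropBits.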
