Relaxed Quantization for Discretized Neural Networks

10/03/2018
by Christos Louizos, et al.

Neural network quantization has become an important research area due to its great impact on the deployment of large models on resource-constrained devices. In order to train networks that can be effectively discretized without loss of performance, we introduce a differentiable quantization procedure. Differentiability is achieved by transforming continuous distributions over the weights and activations of the network into categorical distributions over the quantization grid. These are subsequently relaxed to continuous surrogates that allow for efficient gradient-based optimization. We further show that stochastic rounding can be seen as a special case of the proposed approach, and that under this formulation the quantization grid itself can also be optimized with gradient descent. We experimentally validate the performance of our method on MNIST, CIFAR-10, and ImageNet classification.
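The abstract ships no code, but the procedure it describes maps cleanly onto a few tensor operations: put a smooth noise distribution around each weight, turn the CDF differences at the grid-bin edges into a categorical distribution over grid levels, and relax that categorical with Gumbel-softmax so gradients flow. Below is a minimal PyTorch sketch under those assumptions (a uniform grid and logistic noise); `relaxed_quantize`, its argument names, and the default `sigma` and `temperature` values are our own illustrative choices, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def relaxed_quantize(w, grid, sigma=0.3, temperature=0.5):
    """Sketch of a relaxed quantizer: w is a tensor of continuous weights,
    grid is a 1-D tensor of K quantization levels (assumed uniform)."""
    # Bin edges are the midpoints between adjacent grid levels.
    step = grid[1] - grid[0]
    edges = torch.cat([grid - step / 2, grid[-1:] + step / 2])  # K+1 edges

    # Logistic CDF centered at each weight, evaluated at the bin edges;
    # differences give the categorical probability of each grid level.
    cdf = torch.sigmoid((edges - w.unsqueeze(-1)) / sigma)
    probs = (cdf[..., 1:] - cdf[..., :-1]).clamp_min(1e-12)

    # Gumbel-softmax relaxation of the categorical, so gradients flow
    # to w, sigma, and (if it requires grad) the grid itself.
    soft_onehot = F.gumbel_softmax(probs.log(), tau=temperature, hard=False)

    # Quantized value: convex combination of grid levels.
    return soft_onehot @ grid

# Usage: a 4-bit grid; making it a Parameter would let it be learned too.
grid = torch.linspace(-1.0, 1.0, steps=16)
w = torch.randn(64, requires_grad=True)
w_q = relaxed_quantize(w, grid)
w_q.sum().backward()  # gradients reach the continuous weights
```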
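For comparison, conventional stochastic rounding (which the abstract identifies as a special case of the relaxed scheme, corresponding roughly to hard sampling with the noise collapsed onto a single bin width) can be written as follows; `stochastic_round` is again our own name for a standard technique:

```python
def stochastic_round(w, step):
    # Round down to the nearest grid level, then round up with probability
    # equal to the fractional remainder, so the expectation equals w.
    lower = torch.floor(w / step) * step
    p_up = (w - lower) / step
    return lower + step * torch.bernoulli(p_up)
```

Unlike the relaxed version, this hard rounding step has zero gradient almost everywhere, which is precisely the obstacle the differentiable surrogate is designed to remove.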


