Learned Step Size Quantization

02/21/2019
by Steven K. Esser, et al.

We present Learned Step Size Quantization, a method for training deep networks so that they can run at inference time using low-precision integer matrix multipliers, which offer power and space advantages over high-precision alternatives. The essence of our approach is to learn the step size parameter of a uniform quantizer by backpropagation of the training loss, applying a scaling factor to its learning rate, and computing its associated loss gradient by ignoring the discontinuity present in the quantizer. This quantization approach can be applied to activations or weights, using different levels of precision as needed for a given system, and requires only a simple modification of existing training code. As demonstrated on the ImageNet dataset, our approach achieves better accuracy than all previously published methods for creating quantized networks on several ResNet network architectures at 2, 3, and 4 bits of precision.
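The idea described above can be illustrated with a minimal scalar sketch. It is not the authors' implementation: the function names, the signed clipping range, the round-half-up rule, and the `grad_scale` parameter are all assumptions made for illustration. The step size gradient is obtained by treating the rounding operation as the identity when differentiating, which is one way to "ignore the discontinuity" in the quantizer.

```python
import math


def signed_range(bits):
    """Clipping bounds (q_n, q_p) for a signed `bits`-bit integer (assumed convention)."""
    return 2 ** (bits - 1), 2 ** (bits - 1) - 1


def round_half_up(x):
    """Round to the nearest integer, with ties going up (an assumed tie rule)."""
    return math.floor(x + 0.5)


def quantize(v, s, q_n, q_p):
    """Uniform quantizer: divide by step size s, clip to [-q_n, q_p], round, rescale."""
    x = max(-q_n, min(q_p, v / s))
    return round_half_up(x) * s


def step_size_grad(v, s, q_n, q_p):
    """d(quantize)/ds with the rounding step treated as identity in the derivative."""
    x = v / s
    if x <= -q_n:          # clipped low: output is -q_n * s
        return -float(q_n)
    if x >= q_p:           # clipped high: output is q_p * s
        return float(q_p)
    return round_half_up(x) - x  # unclipped: round(v/s) - v/s


def sgd_step(s, v, upstream_grad, q_n, q_p, lr=0.01, grad_scale=1.0):
    """One SGD update of s; grad_scale is a hypothetical learning-rate scaling factor."""
    g = upstream_grad * step_size_grad(v, s, q_n, q_p)
    return s - lr * grad_scale * g
```

For example, with the 3-bit signed range `(4, 3)` and a step size of 0.25, `quantize(0.3, 0.25, 4, 3)` returns 0.25, while an input of 10.0 is clipped to the top level, 0.75. In a real network the same update would be applied per layer to tensors of weights or activations, with the gradient summed over all elements that share a step size.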

research
04/29/2018

UNIQ: Uniform Noise Injection for the Quantization of Neural Networks

We present a novel method for training deep neural network amenable to i...
research
02/18/2022

LG-LSQ: Learned Gradient Linear Symmetric Quantization

Deep neural networks with lower precision weights and operations at infe...
research
05/27/2019

Differentiable Quantization of Deep Neural Networks

We propose differentiable quantization (DQ) for efficient deep neural ne...
research
06/15/2022

Edge Inference with Fully Differentiable Quantized Mixed Precision Neural Networks

The large computing and memory cost of deep neural networks (DNNs) often...
research
02/23/2018

Autoencoder based image compression: can the learning be quantization independent?

This paper explores the problem of learning transforms for image compres...
research
09/11/2018

Discovering Low-Precision Networks Close to Full-Precision Networks for Efficient Embedded Inference

To realize the promise of ubiquitous embedded deep network inference, it...
research
01/20/2023

Optimized learned entropy coding parameters for practical neural-based image and video compression

Neural-based image and video codecs are significantly more power-efficie...
