Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation

04/20/2020
by Hao Wu, et al.

Quantization techniques can reduce the size of Deep Neural Networks and improve inference latency and throughput by taking advantage of high-throughput integer instructions. In this paper we review the mathematical aspects of quantization parameters and evaluate their choices on a wide range of neural network models for different application domains, including vision, speech, and language. We focus on quantization techniques that are amenable to acceleration by processors with high-throughput integer math pipelines. We also present a workflow for 8-bit quantization that is able to maintain accuracy within 1% of the floating-point baseline on all networks studied, including models that are more difficult to quantize, such as MobileNets and BERT-large.
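To make the idea concrete, below is a minimal sketch of symmetric, per-tensor int8 quantization with max-absolute-value calibration, one common parameter choice of the kind the abstract refers to. The function names and the calibration data are illustrative assumptions, not the paper's exact workflow.

import numpy as np

def quantize_int8(x, scale):
    # Symmetric (zero-point-free) mapping of float values onto the int8 range.
    q = np.round(x / scale)
    return np.clip(q, -127, 127).astype(np.int8)

def dequantize_int8(q, scale):
    # Recover an approximation of the original float values.
    return q.astype(np.float32) * scale

# Calibration: pick the scale from the maximum absolute value seen in the data.
x = np.random.randn(1024).astype(np.float32)   # stand-in for real activations/weights
scale = np.abs(x).max() / 127.0

q = quantize_int8(x, scale)
x_hat = dequantize_int8(q, scale)
print("max abs quantization error:", np.abs(x - x_hat).max())

Per-channel scales, asymmetric (zero-point) schemes, and calibration methods other than max are variations on the same mapping; the choice among them is the kind of trade-off the paper evaluates.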
