On the quantization of recurrent neural networks

01/14/2021
by Jian Li, et al.

Integer quantization of neural networks can be defined as approximating the high-precision computation of the canonical neural network formulation using reduced integer precision. It plays a significant role in the efficient deployment and execution of machine learning (ML) systems, reducing memory consumption and leveraging typically faster computation. In this work, we present an integer-only quantization strategy for Long Short-Term Memory (LSTM) neural network topologies, which themselves are the foundation of many production ML systems. Our quantization strategy is accurate (e.g., it works well with post-training quantization), efficient and fast to execute (utilizing 8-bit integer weights and mostly 8-bit activations), and able to target a variety of hardware (by leveraging instruction sets available in common CPU architectures, as well as available neural accelerators).
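To make the idea concrete, here is a minimal sketch of affine (asymmetric) 8-bit integer quantization, the basic mapping behind integer-only inference: a float range is mapped onto the signed range [-128, 127] via a scale and zero-point. The function names and the scalar-list representation are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of affine 8-bit quantization (not the paper's code).

def quantize_params(xs, num_bits=8):
    """Compute the scale and zero-point that map the float range of xs
    (extended to include 0) onto the signed range [-128, 127]."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    lo, hi = min(min(xs), 0.0), max(max(xs), 0.0)  # range must contain 0
    scale = (hi - lo) / (qmax - qmin) or 1.0       # avoid scale == 0
    zero_point = round(qmin - lo / scale)          # integer mapped from 0.0
    return scale, zero_point

def quantize(xs, scale, zero_point, num_bits=8):
    """Round each float to its nearest representable integer, with clamping."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    return [max(qmin, min(qmax, round(x / scale) + zero_point)) for x in xs]

def dequantize(qs, scale, zero_point):
    """Recover the float approximation from the integer representation."""
    return [(q - zero_point) * scale for q in qs]

weights = [-0.62, 0.0, 0.35, 1.27]
scale, zp = quantize_params(weights)
q = quantize(weights, scale, zp)
approx = dequantize(q, scale, zp)
# Each dequantized value lands within one quantization step of the original.
assert all(abs(a - w) <= scale for a, w in zip(approx, weights))
```

At inference time only the integer values and the per-tensor (scale, zero-point) pair are stored, which is what enables 8-bit matrix kernels on commodity CPUs and neural accelerators.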


