On the efficient representation and execution of deep acoustic models

07/15/2016
by Raziel Alvarez, et al.

In this paper we present a simple and computationally efficient quantization scheme that enables us to reduce the resolution of the parameters of a neural network from 32-bit floating point values to 8-bit integer values. The proposed quantization scheme leads to significant memory savings and enables the use of optimized hardware instructions for integer arithmetic, thus significantly reducing the cost of inference. Finally, we propose a "quantization aware" training process that applies the proposed scheme during network training and find that it allows us to recover most of the loss in accuracy introduced by quantization. We validate the proposed techniques by applying them to a long short-term memory-based acoustic model on an open-ended large vocabulary speech recognition task.
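To make the idea of reducing 32-bit floating point parameters to 8-bit integers concrete, here is a minimal sketch of a generic uniform (affine) quantizer. It is an illustrative assumption, not necessarily the exact scheme proposed in the paper: the function names `quantize` and `dequantize`, the per-list min/max range, and the use of unsigned codes in [0, 255] are all hypothetical choices made for this example.

```python
def quantize(values, num_bits=8):
    """Map floats to integer codes in [0, 2**num_bits - 1].

    Assumed scheme (illustrative only): a single scale factor derived
    from the min and max of the values being quantized.
    """
    levels = (1 << num_bits) - 1          # 255 for 8 bits
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    # Guard against a constant input, where the range collapses to zero.
    scale = levels / span if span > 0 else 1.0
    codes = [round((v - v_min) * scale) for v in values]
    return codes, scale, v_min


def dequantize(codes, scale, v_min):
    """Recover approximate float values from integer codes."""
    return [c / scale + v_min for c in codes]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale, v_min = quantize(weights)
recovered = dequantize(codes, scale, v_min)
```

Under this assumed scheme, each recovered value differs from the original by at most half a quantization step (`0.5 / scale`). Storing the integer codes plus one scale and one offset per group of values is what yields the roughly fourfold memory reduction relative to 32-bit floats.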


research
07/04/2022

I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference

Vision Transformers (ViTs) have achieved state-of-the-art performance on...
research
01/14/2021

On the quantization of recurrent neural networks

Integer quantization of neural networks can be defined as the approximat...
research
09/28/2020

NITI: Training Integer Neural Networks Using Integer-only Arithmetic

While integer arithmetic has been widely adopted for improved performanc...
research
04/20/2020

Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation

Quantization techniques can reduce the size of Deep Neural Networks and ...
research
05/30/2019

Quantization Loss Re-Learning Method

In order to quantize the gate parameters of the LSTM (Long Short-Term Me...
research
10/14/2022

Accelerating RNN-based Speech Enhancement on a Multi-Core MCU with Mixed FP16-INT8 Post-Training Quantization

This paper presents an optimized methodology to design and deploy Speech...
research
06/21/2020

Efficient Integer-Arithmetic-Only Convolutional Neural Networks

Integer-arithmetic-only networks have been demonstrated effective to red...
