Low Precision RNNs: Quantizing RNNs Without Losing Accuracy

10/20/2017
by Supriya Kapur, et al.

Similar to convolutional neural networks, recurrent neural networks (RNNs) are typically over-parameterized. Reducing the bit-widths of weights and activations improves runtime efficiency on hardware, but it often comes at the cost of reduced accuracy. This paper proposes a quantization approach that compensates for bit-width reduction by increasing model size. The approach allows networks to match their baseline accuracy while retaining the benefits of reduced precision and a smaller overall model footprint.
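
As a rough illustration of the kind of bit-width reduction discussed above, the sketch below applies symmetric uniform quantization to a weight matrix. The function name and parameters are assumptions made here for illustration; this is not the paper's specific quantization scheme.

```python
import numpy as np

def quantize_uniform(x, num_bits=4):
    """Symmetric uniform quantization to `num_bits` bits (illustrative sketch,
    not the paper's method)."""
    qmax = 2 ** (num_bits - 1) - 1                 # e.g. 7 for signed 4-bit values
    scale = np.max(np.abs(x)) / qmax               # map the largest magnitude onto qmax
    if scale == 0:
        return x
    q = np.clip(np.round(x / scale), -qmax, qmax)  # integer codes in [-qmax, qmax]
    return q * scale                               # dequantized ("fake-quantized") values

# Example: quantize a hypothetical RNN weight matrix to 4 bits
w = np.random.randn(256, 256).astype(np.float32)
w_q = quantize_uniform(w, num_bits=4)
print("max abs quantization error:", np.abs(w - w_q).max())
```

The error printed at the end gives a sense of the precision lost at a given bit-width; the paper's idea is to recover the resulting accuracy gap by widening the network rather than raising precision.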

Related research

01/31/2023 · Quantized Neural Networks for Low-Precision Accumulation with Guaranteed Overflow Avoidance
We introduce a quantization-aware training algorithm that guarantees avo...

08/24/2016 · Recurrent Neural Networks With Limited Numerical Precision
Recurrent Neural Networks (RNNs) produce state-of-art performance on man...

11/30/2016 · Effective Quantization Methods for Recurrent Neural Networks
Reducing bit-widths of weights, activations, and gradients of a Neural N...

04/19/2020 · MuBiNN: Multi-Level Binarized Recurrent Neural Network for EEG signal Classification
Recurrent Neural Networks (RNN) are widely used for learning sequences i...

12/22/2022 · Training Integer-Only Deep Recurrent Neural Networks
Recurrent neural networks (RNN) are the backbone of many text and speech...

07/15/2017 · Ternary Residual Networks
Sub-8-bit representation of DNNs incur some discernible loss of accuracy...

02/17/2020 · Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations
We propose precision gating (PG), an end-to-end trainable dynamic dual-p...
