Saving RNN Computations with a Neuron-Level Fuzzy Memoization Scheme

02/14/2022
by Franyell Silfa, et al.

Recurrent Neural Networks (RNNs) are a key technology for applications such as automatic speech recognition and machine translation. Unlike conventional feed-forward DNNs, RNNs remember past information to improve the accuracy of future predictions and are therefore very effective for sequence processing problems. For each application run, recurrent layers are executed many times to process a potentially long sequence of inputs (words, images, audio frames, etc.). In this paper, we observe that the output of a neuron exhibits only small changes across consecutive invocations. We exploit this property to build a neuron-level fuzzy memoization scheme, which dynamically caches each neuron's output and reuses it whenever the current output is predicted to be similar to a previously computed result, thereby avoiding the output computation. The main challenge in this scheme is determining whether a neuron's output for the current input in the sequence will be similar to a recently computed result. To this end, we extend the recurrent layer with a much simpler Bitwise Neural Network (BNN), and show that the BNN and RNN outputs are highly correlated: if two BNN outputs are very similar, the corresponding outputs in the original RNN layer are likely to exhibit negligible changes. The BNN thus provides a low-cost and effective mechanism for deciding when fuzzy memoization can be applied with a small impact on accuracy. We evaluate our memoization scheme on top of a state-of-the-art accelerator for RNNs, for a variety of neural networks from multiple application domains. We show that our technique avoids more than 26.7% of computations, resulting in 21% energy savings and a 1.4x speedup on average.
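To illustrate the idea, the sketch below implements neuron-level fuzzy memoization for a single fully-connected recurrent neuron. It is a minimal illustration, not the paper's accelerator design: the class names (`BinaryPredictor`, `MemoizedNeuron`), the tanh activation, and the `threshold` parameter are assumptions chosen for clarity. The binarized (bitwise) predictor is evaluated on every input; the full-precision dot product is computed only when the BNN output changes by more than the threshold relative to the cached value.

```python
# Minimal sketch of neuron-level fuzzy memoization guided by a bitwise predictor.
# Assumptions: a simple tanh neuron, sign-binarized weights/inputs as the BNN,
# and an absolute-difference threshold on the BNN output to trigger reuse.
import numpy as np


class BinaryPredictor:
    """Bitwise (binarized) version of a neuron: weights and inputs reduced to signs."""

    def __init__(self, weights):
        self.sign_weights = np.sign(weights)  # +1/-1 weights

    def output(self, x):
        # Bitwise dot product: correlation between input signs and weight signs.
        return float(np.dot(np.sign(x), self.sign_weights))


class MemoizedNeuron:
    """Full-precision neuron whose output is reused when the BNN output barely changes."""

    def __init__(self, weights, threshold):
        self.weights = weights
        self.predictor = BinaryPredictor(weights)
        self.threshold = threshold      # max allowed change in BNN output for reuse
        self.cached_bnn = None          # BNN output at the last full evaluation
        self.cached_out = None          # memoized full-precision output

    def forward(self, x):
        bnn_out = self.predictor.output(x)
        if (self.cached_bnn is not None
                and abs(bnn_out - self.cached_bnn) <= self.threshold):
            # Predicted to be similar to the last result: skip the full computation.
            return self.cached_out
        # Otherwise pay for the full computation and refresh the cache.
        out = float(np.tanh(np.dot(self.weights, x)))
        self.cached_bnn = bnn_out
        self.cached_out = out
        return out


# Example usage over a sequence of inputs (random data for illustration only).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    neuron = MemoizedNeuron(weights=rng.normal(size=64), threshold=2.0)
    for _ in range(10):
        x = rng.normal(size=64)
        print(neuron.forward(x))
```

The threshold trades accuracy for savings: a larger value lets the neuron reuse cached outputs more often (more skipped computations) at the cost of larger deviations from the exact RNN output.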

