Training LSTM Networks with Resistive Cross-Point Devices

06/01/2018
by Tayfun Gokmen, et al.

In our previous work, we showed that resistive cross-point devices, so-called Resistive Processing Unit (RPU) devices, can provide significant power and speed benefits when training deep fully connected networks as well as convolutional neural networks. In this work, we extend the RPU concept to training recurrent neural networks (RNNs), namely long short-term memory (LSTM) networks. We show that the mapping of recurrent layers is very similar to the mapping of fully connected layers, so the RPU concept can potentially provide large acceleration factors for RNNs as well. In addition, we study the effect of various device imperfections and system parameters on training performance. Symmetry of updates becomes even more crucial for RNNs: an asymmetry of only a few percent already increases the test error relative to the ideal case trained with floating-point numbers. Furthermore, the input signals to the device arrays need a resolution of at least 7 bits for successful training; however, we show that a stochastic rounding scheme can reduce the required input resolution to 5 bits. We also find that RPU device variations and hardware noise are enough to mitigate overfitting, reducing the need for dropout. Finally, the models trained here are roughly 1500 times larger than the fully connected network trained on the MNIST dataset, measured by the total number of multiplication and summation operations performed per epoch; this work therefore also tests the validity of the RPU approach for large-scale networks.
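To make the layer mapping concrete, the sketch below (a minimal NumPy illustration, not code from the paper; all function and variable names are hypothetical) expresses one LSTM time step as a single matrix-vector product against a stacked gate-weight matrix. That single mat-vec is precisely the operation an RPU crossbar performs in the analog domain, which is why a recurrent layer maps onto the hardware in essentially the same way as a fully connected layer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step_as_matvec(W, x, h_prev, c_prev):
    # W: (4*hidden, n_in + hidden + 1) -- the i, f, g, o gate weights
    # stacked row-wise, with the bias folded in as a final column.
    v = np.concatenate([x, h_prev, [1.0]])  # signal vector driven onto the array
    z = W @ v                               # one analog mat-vec on the crossbar
    i, f, g, o = np.split(z, 4)             # the four gate pre-activations
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c

# Example: input size 2, hidden size 3, unrolled over 5 time steps.
n_in, hidden = 2, 3
W = 0.1 * np.random.randn(4 * hidden, n_in + hidden + 1)
h = c = np.zeros(hidden)
for x in np.random.randn(5, n_in):
    h, c = lstm_step_as_matvec(W, x, h, c)
```

Because every gate reduces to a row block of the same stacked matrix W, the forward, backward, and update passes of an LSTM layer use the same crossbar primitives as a fully connected layer, just applied once per time step.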

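The 7-bit versus 5-bit result concerns how input signals to the arrays are quantized. The paper's exact pulsing scheme is not reproduced here; the following is a generic stochastic-rounding quantizer (a sketch under assumed parameters, with illustrative names) that shows the key property: rounding up with probability equal to the fractional remainder makes the quantization error zero-mean, so it averages out over many training updates instead of systematically biasing them.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_round_quantize(x, n_bits=5, x_max=1.0):
    # Map x onto 2**n_bits - 1 uniform levels spanning [-x_max, x_max],
    # rounding up with probability equal to the fractional remainder.
    x = np.clip(np.asarray(x, dtype=float), -x_max, x_max)
    step = 2.0 * x_max / (2 ** n_bits - 1)
    scaled = (x + x_max) / step             # position in units of one step
    lower = np.floor(scaled)
    frac = scaled - lower                   # distance above the lower level, in [0, 1)
    rounded = lower + (rng.random(x.shape) < frac)
    return rounded * step - x_max

# The quantizer is unbiased: averaged over many draws, the 5-bit
# representation recovers the original value.
x = np.full(100_000, 0.123)
print(stochastic_round_quantize(x).mean())  # ~0.123
```
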
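The sensitivity to update asymmetry can also be seen with a toy model (not the device model from the paper; the linear up/down mismatch and the `asymmetry` parameter are illustrative assumptions). When downward conductance steps are slightly weaker than upward ones, equal-and-opposite updates no longer cancel, and the residue accumulates over the very large number of updates an RNN sees per epoch.

```python
import numpy as np

def asymmetric_update(w, dw, asymmetry=0.05):
    # Ideal device: w += dw for either sign. Here the downward step is
    # scaled by (1 - asymmetry), a simplified stand-in for up/down
    # conductance-change mismatch in a resistive device.
    up = np.maximum(dw, 0.0)
    down = np.minimum(dw, 0.0)
    return w + up + (1.0 - asymmetry) * down

w = 0.0
for _ in range(1000):                # equal-and-opposite updates should cancel...
    w = asymmetric_update(w, +0.01)
    w = asymmetric_update(w, -0.01)
print(w)  # ...but a 5% mismatch leaves a drift of 1000 * 0.05 * 0.01 = 0.5
```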
