Slim LSTM networks: LSTM_6 and LSTM_C6

01/18/2019
by   Atra Akandeh, et al.
0

We have shown previously that our parameter-reduced variants of Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNN) are comparable in performance to the standard LSTM RNN on the MNIST dataset. In this study, we show that this is also the case for two diverse benchmark datasets, namely, the review sentiment IMDB and the 20 Newsgroup datasets. Specifically, we focus on two of the simplest variants, namely LSTM_6 (i.e., standard LSTM with three constant fixed gates) and LSTM_C6 (i.e., LSTM_6 with further reduced cell body input block). We demonstrate that these two aggressively reduced-parameter variants are competitive with the standard LSTM when hyper-parameters, e.g., learning parameter, number of hidden units and gate constants are set properly. These architectures enable speeding up training computations and hence, these networks would be more suitable for online training and inference onto portable devices with relatively limited computational resources.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
07/14/2017

Simplified Long Short-term Memory Recurrent Neural Networks: part I

We present five variants of the standard Long Short-term Memory (LSTM) r...
research
10/16/2018

Reduced-Gate Convolutional LSTM Using Predictive Coding for Spatiotemporal Prediction

Spatiotemporal sequence prediction is an important problem in deep learn...
research
07/05/2021

A comparison of LSTM and GRU networks for learning symbolic sequences

We explore relations between the hyper-parameters of a recurrent neural ...
research
08/11/2015

Benchmarking of LSTM Networks

LSTM (Long Short-Term Memory) recurrent neural networks have been highly...
research
08/09/2018

Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network

Because of their effectiveness in broad practical applications, LSTM net...
research
06/24/2015

Parallel Multi-Dimensional LSTM, With Application to Fast Biomedical Volumetric Image Segmentation

Convolutional Neural Networks (CNNs) can be shifted across 2D images or ...
research
12/12/2016

Empirical Evaluation of A New Approach to Simplifying Long Short-term Memory (LSTM)

The standard LSTM, although it succeeds in the modeling long-range depen...

Please sign up or login with your details

Forgot password? Click here to reset