Intrinsically Sparse Long Short-Term Memory Networks

01/26/2019
by   Shiwei Liu, et al.

Long Short-Term Memory (LSTM) networks have achieved state-of-the-art performance on a wide range of tasks. Their strength comes from a long-term memory that suits sequential data well and a gating structure that controls the information flow. However, LSTMs tend to be memory-bandwidth limited in realistic applications, and as model sizes keep growing they require prohibitively long training and inference times. Various efficient model compression methods have been proposed to tackle this problem, but most require a large, expensive pre-trained model, which is impractical on resource-limited devices with strict memory budgets. To remedy this, in this paper we incorporate the Sparse Evolutionary Training (SET) procedure into LSTM, proposing a novel model dubbed SET-LSTM. Rather than starting from a fully-connected architecture, SET-LSTM has a sparse topology and dramatically fewer parameters in both phases, training and inference. Accounting for the specific architecture of LSTMs, we replace the LSTM cells and embedding layers with sparse structures and, further, use an evolutionary strategy to adapt the sparse connectivity to the data. Additionally, we find that SET-LSTM can provide many different good sparse-connectivity configurations as substitutes for the overparameterized optimization problem of dense neural networks. Evaluated on four sentiment-analysis classification datasets, our proposed model usually achieves better performance than its fully-connected counterpart while having less than 4% of its parameters.
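The evolutionary strategy the abstract refers to can be illustrated with a minimal NumPy sketch of one SET prune-and-regrow step on a single weight matrix (such as an LSTM gate matrix or an embedding layer). This is a simplified illustration, not the paper's exact procedure: it prunes by weight magnitude, regrows at uniformly random inactive positions, and uses illustrative function names (`init_sparse_mask`, `set_evolution_step`) and a fixed rewiring fraction `zeta` as assumptions.

```python
import numpy as np

def init_sparse_mask(shape, density, rng):
    """Random binary mask with approximately the given density,
    marking which connections exist in the sparse layer."""
    return (rng.random(shape) < density).astype(np.float32)

def set_evolution_step(weights, mask, zeta, rng):
    """One SET rewiring step (simplified):
    1. drop the zeta fraction of active connections with the
       smallest magnitudes;
    2. regrow the same number of connections at random,
       currently-inactive positions (small random init).
    The total number of connections stays constant."""
    active = np.flatnonzero(mask)
    n_drop = int(zeta * active.size)
    if n_drop == 0:
        return weights, mask
    # Prune: zero out the weakest active connections.
    mags = np.abs(weights.ravel()[active])
    drop_idx = active[np.argsort(mags)[:n_drop]]
    mask.ravel()[drop_idx] = 0.0
    weights.ravel()[drop_idx] = 0.0
    # Regrow: activate the same number of random inactive positions.
    inactive = np.flatnonzero(mask.ravel() == 0.0)
    grow_idx = rng.choice(inactive, size=n_drop, replace=False)
    mask.ravel()[grow_idx] = 1.0
    weights.ravel()[grow_idx] = rng.normal(0.0, 0.01, size=n_drop)
    return weights, mask
```

In training, a step like this would run after each epoch, with gradient updates applied only to masked-in weights in between; the density parameter is what keeps the parameter count a small fraction of the dense model's throughout both training and inference.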

