Learning Intrinsic Sparse Structures within Long Short-Term Memory

09/15/2017
by Wei Wen, et al.

Model compression is significant for the wide adoption of Recurrent Neural Networks (RNNs), both in user devices with limited resources and in business clusters that must respond quickly to large-scale service requests. This work aims to learn structurally-sparse Long Short-Term Memory (LSTM) networks by reducing the sizes of the basic structures within LSTM units, including input updates, gates, hidden states, cell states, and outputs. Reducing the sizes of these basic structures independently can leave their dimensions inconsistent and thus produce invalid LSTM units. To overcome this problem, we propose Intrinsic Sparse Structures (ISS) in LSTMs. Removing one component of ISS decreases the sizes of all basic structures by one simultaneously, so dimension consistency is always maintained. By learning ISS within LSTM units, the obtained LSTMs remain regular while having much smaller basic structures. Our method achieves a 10.59x speedup on state-of-the-art LSTMs without any perplexity loss in language modeling on the Penn TreeBank dataset. It is also successfully evaluated with a compact model of only 2.69M weights for machine Question Answering on the SQuAD dataset. Our source code is publicly available at https://github.com/wenwei202/iss-rnns
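The dimension-consistency idea can be illustrated with a small weight-surgery sketch. In a standard LSTM the input and recurrent weights of the four gates are stacked as (4H, D) and (4H, H) matrices; removing one ISS component k then means deleting the k-th row of every gate block and the k-th recurrent column together, shrinking the hidden and cell size from H to H-1. The function name and the stacked-weight layout below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def remove_iss_component(W_x, W_h, b, k):
    """Remove one ISS component (hidden/cell dimension k) from LSTM weights.

    W_x: (4H, D) input weights, the four gate blocks stacked row-wise
    W_h: (4H, H) recurrent weights, same stacking
    b:   (4H,)  biases
    Deleting row k in each gate block and column k of W_h keeps all
    remaining dimensions consistent, i.e. a valid LSTM of size H-1.
    """
    H = W_h.shape[1]
    rows = [g * H + k for g in range(4)]  # row k inside each of the 4 gate blocks
    W_x = np.delete(W_x, rows, axis=0)
    W_h = np.delete(np.delete(W_h, rows, axis=0), k, axis=1)
    b = np.delete(b, rows)
    return W_x, W_h, b

# Example: H=3 hidden units, D=5 inputs; remove ISS component k=1.
H, D = 3, 5
W_x, W_h, b = remove_iss_component(
    np.zeros((4 * H, D)), np.zeros((4 * H, H)), np.zeros(4 * H), k=1)
print(W_x.shape, W_h.shape, b.shape)  # (8, 5) (8, 2) (8,)
```

In a stacked network the same component must also be removed from the next layer's input weights; the sketch covers a single cell only.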

Related research
09/08/2014 · Recurrent Neural Network Regularization
We present a simple regularization technique for Recurrent Neural Networ...

04/09/2016 · Learning Compact Recurrent Neural Networks
Recurrent neural networks (RNNs), including long short-term memory (LSTM...

08/07/2017 · Regularizing and Optimizing LSTM Language Models
Recurrent neural networks (RNNs), such as long short-term memory network...

01/26/2019 · Intrinsically Sparse Long Short-Term Memory Networks
Long Short-Term Memory (LSTM) has achieved state-of-the-art performances...

09/28/2018 · Learning Recurrent Binary/Ternary Weights
Recurrent neural networks (RNNs) have shown excellent performance in pro...

03/19/2019 · IndyLSTMs: Independently Recurrent LSTMs
We introduce Independently Recurrent Long Short-term Memory cells: IndyL...

08/26/2019 · Using LSTMs to Model the Java Programming Language
Recurrent neural networks (RNNs), specifically long-short term memory ne...