Compressing LSTM Networks by Matrix Product Operators

12/22/2020
by Ze-Feng Gao, et al.

Long Short-Term Memory (LSTM) models are the building blocks of many state-of-the-art algorithms for Natural Language Processing (NLP). However, an LSTM model contains a large number of parameters, so it needs a large amount of memory to run and substantial computational resources for training and for predicting new data, which makes it computationally inefficient. Here we propose an alternative LSTM model that significantly reduces the number of parameters by representing the weight parameters with matrix product operators (MPO), which are used in physics to characterize the local correlation in quantum states. We further experimentally compare the compressed models obtained with the MPO-LSTM model and with the pruning method on sequence classification and sequence prediction tasks. The experimental results show that our proposed MPO-based method outperforms the pruning method.
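
To make the idea of representing a weight matrix by a matrix product operator concrete, here is a minimal NumPy sketch. It is not taken from the paper: it assumes the common construction in which the matrix is reshaped into a higher-order tensor and split into a chain of small four-index tensors by successive truncated SVDs. The function names `mpo_decompose` and `mpo_reconstruct`, the factor shapes, and the `max_bond` truncation parameter are all hypothetical choices for illustration, and the authors' actual MPO-LSTM training procedure may differ.

```python
import numpy as np

def mpo_decompose(W, in_factors, out_factors, max_bond):
    """Split W (I x J) into a chain of 4-index cores (illustrative sketch).

    I must equal prod(in_factors) and J must equal prod(out_factors).
    Core k has shape (bond_left, in_factors[k], out_factors[k], bond_right).
    max_bond caps the bond dimension; smaller values mean fewer parameters.
    """
    n = len(in_factors)
    assert len(out_factors) == n
    assert W.shape == (int(np.prod(in_factors)), int(np.prod(out_factors)))
    # Reshape to (i1..in, j1..jn), then interleave to (i1, j1, ..., in, jn).
    T = W.reshape(*in_factors, *out_factors)
    T = T.transpose([ax for k in range(n) for ax in (k, n + k)])
    cores, bond = [], 1
    for k in range(n - 1):
        # Group (bond, i_k, j_k) as rows and everything to the right as columns.
        T = T.reshape(bond * in_factors[k] * out_factors[k], -1)
        U, S, Vt = np.linalg.svd(T, full_matrices=False)
        r = min(max_bond, len(S))  # keep only the r largest singular values
        cores.append(U[:, :r].reshape(bond, in_factors[k], out_factors[k], r))
        T = S[:r, None] * Vt[:r]   # carry the remainder to the next site
        bond = r
    cores.append(T.reshape(bond, in_factors[-1], out_factors[-1], 1))
    return cores

def mpo_reconstruct(cores):
    """Contract the chain of cores back into a dense I x J matrix."""
    n = len(cores)
    T = cores[0]
    for core in cores[1:]:
        T = np.tensordot(T, core, axes=([-1], [0]))
    T = T[0, ..., 0]  # drop the dummy boundary bonds of size 1
    # Undo the interleaving: (i1, j1, ..., in, jn) -> (i1..in, j1..jn).
    T = T.transpose(list(range(0, 2 * n, 2)) + list(range(1, 2 * n, 2)))
    rows = int(np.prod([c.shape[1] for c in cores]))
    cols = int(np.prod([c.shape[2] for c in cores]))
    return T.reshape(rows, cols)

# Toy example: a 64 x 64 matrix standing in for one LSTM weight block.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
cores = mpo_decompose(W, (4, 4, 4), (4, 4, 4), max_bond=8)
n_dense = W.size                      # 4096 parameters in the dense matrix
n_mpo = sum(c.size for c in cores)    # 128 + 1024 + 128 = 1280 parameters
rel_err = np.linalg.norm(W - mpo_reconstruct(cores)) / np.linalg.norm(W)
print(n_dense, n_mpo, rel_err)
```

In this toy setting the bond dimension controls the trade-off: a large enough `max_bond` reproduces the matrix exactly, while a small one cuts the parameter count at the cost of approximation error. A random matrix compresses poorly, so the error here is large; the paper's premise is that trained LSTM weights can be represented well in this form.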

research
03/31/2017

Factorization tricks for LSTM networks

We present two simple ways of reducing the number of parameters and acce...
research
11/18/2016

Visualizing and Understanding Curriculum Learning for Long Short-Term Memory Networks

Curriculum Learning emphasizes the order of training instances in a comp...
research
10/10/2020

A Model Compression Method with Matrix Product Operators for Speech Enhancement

The deep neural network (DNN) based speech enhancement approaches have a...
research
08/21/2019

Restricted Recurrent Neural Networks

Recurrent Neural Network (RNN) and its variations such as Long Short-Ter...
research
05/07/2018

MMDenseLSTM: An efficient combination of convolutional and recurrent neural networks for audio source separation

Deep neural networks have become an indispensable technique for audio so...
research
08/02/2023

A Transformer-based Prediction Method for Depth of Anesthesia During Target-controlled Infusion of Propofol and Remifentanil

Accurately predicting anesthetic effects is essential for target-control...
research
09/14/2017

Binary-decomposed DCNN for accelerating computation and compressing model without retraining

Recent trends show recognition accuracy increasing even more profoundly....
