Neural Networks Compression for Language Modeling

08/20/2017
by Artem M. Grachev, et al.
In this paper, we consider several compression techniques for the language modeling problem based on recurrent neural networks (RNNs). It is known that conventional RNNs, e.g., LSTM-based networks in language modeling, are characterized by either high space complexity or substantial inference time. This problem is especially crucial for mobile applications, in which constant interaction with a remote server is impractical. Using the Penn Treebank (PTB) dataset, we compare pruning, quantization, low-rank factorization, and tensor train decomposition for LSTM networks in terms of model size and suitability for fast inference.
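
As a rough illustration of three of the techniques named above, the following NumPy sketch applies magnitude pruning, uniform quantization, and low-rank factorization to a single weight matrix. It is not the authors' implementation: the function names, the matrix shape, and the specific variants (global magnitude threshold, min-max uniform quantization, truncated SVD) are assumptions chosen for brevity, and tensor train decomposition is omitted.

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero out the `sparsity` fraction of entries with the smallest magnitude."""
    k = int(sparsity * w.size)          # number of weights to remove
    if k == 0:
        return w.copy()
    # k-th smallest absolute value serves as the pruning threshold
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)

def quantize_uniform(w, bits=8):
    """Round weights to 2**bits evenly spaced levels between min and max.

    Assumes w is not constant (max > min). Returns the dequantized floats;
    a real deployment would store the integer codes plus (lo, scale).
    """
    lo, hi = w.min(), w.max()
    scale = (hi - lo) / (2 ** bits - 1)
    codes = np.round((w - lo) / scale).astype(np.int64)
    return codes * scale + lo

def low_rank_factorize(w, rank):
    """Approximate w (m x n) by a @ b with a: (m x rank), b: (rank x n)."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]          # fold singular values into the left factor
    b = vt[:rank, :]
    return a, b

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((64, 256))  # hypothetical layer weight matrix

    pruned = magnitude_prune(w, sparsity=0.9)
    quantized = quantize_uniform(w, bits=8)
    a, b = low_rank_factorize(w, rank=8)

    print("nonzero weights after pruning:", np.count_nonzero(pruned), "of", w.size)
    print("max quantization error:", np.abs(quantized - w).max())
    print("parameters after factorization:", a.size + b.size, "vs", w.size)
```

Each method trades accuracy for size in a different way: pruning needs a sparse storage format to save memory, quantization shrinks each stored weight, and factorization replaces one large matrix product with two smaller ones.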

research
02/06/2019

Compression of Recurrent Neural Networks for Efficient Language Modeling

Recurrent neural networks have proved to be an effective method for stat...
research
10/29/2018

Counting in Language with RNNs

In this paper we examine a possible reason for the LSTM outperforming th...
research
02/27/2019

Alternating Synthetic and Real Gradients for Neural Language Modeling

Training recurrent neural networks (RNNs) with backpropagation through t...
research
08/21/2020

Kronecker CP Decomposition with Fast Multiplication for Compressing RNNs

Recurrent neural networks (RNNs) are powerful in the tasks oriented to s...
research
03/02/2020

Tensor Networks for Language Modeling

The tensor network formalism has enjoyed over two decades of success in ...
research
07/11/2018

Iterative evaluation of LSTM cells

In this work we present a modification in the conventional flow of infor...
research
06/17/2019

Structured Pruning of Recurrent Neural Networks through Neuron Selection

Recurrent neural networks (RNNs) have recently achieved remarkable succe...
