On the Effectiveness of Low-Rank Matrix Factorization for LSTM Model Compression

08/27/2019
by Genta Indra Winata, et al.

Despite their ubiquity in NLP tasks, Long Short-Term Memory (LSTM) networks suffer from computational inefficiencies caused by inherently unparallelizable recurrences, which are further aggravated by the larger parameter counts LSTMs need for greater memory capacity. In this paper, we propose to apply low-rank matrix factorization (MF) algorithms to different recurrences in LSTMs, and we explore their effectiveness on different NLP tasks and model components. We find that the additive recurrence is more important than the multiplicative recurrence, and we explain this by identifying meaningful correlations between matrix norms and compression performance. We compare our approach across two settings: 1) compressing the core LSTM recurrences in language models, and 2) compressing the biLSTM layers of ELMo, evaluated on three downstream NLP tasks.
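
To make the compression scheme concrete, the sketch below illustrates the basic low-rank MF idea on a single LSTM recurrent weight matrix: the full matrix is replaced by a rank-r truncated-SVD factorization, reducing parameters at the cost of some approximation error. The shapes, the `low_rank_factorize` helper, and the rank choice are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def low_rank_factorize(W: np.ndarray, rank: int):
    """Approximate W (m x n) as U_r @ V_r with U_r (m x rank) and V_r (rank x n),
    using a truncated SVD as the low-rank matrix factorization."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * s[:rank]   # fold singular values into the left factor
    V_r = Vt[:rank, :]
    return U_r, V_r

# Hypothetical LSTM recurrent weights: the four gate projections are stacked,
# so the hidden-to-hidden matrix has shape (4h, h). Sizes are illustrative.
hidden = 256
W_hh = np.random.randn(4 * hidden, hidden).astype(np.float32)

rank = 64
U_r, V_r = low_rank_factorize(W_hh, rank)

original_params = W_hh.size
factored_params = U_r.size + V_r.size
rel_error = np.linalg.norm(W_hh - U_r @ V_r) / np.linalg.norm(W_hh)

print(f"params: {original_params} -> {factored_params} "
      f"({factored_params / original_params:.2%} of original), "
      f"relative Frobenius error: {rel_error:.3f}")
```

With these illustrative sizes, the 1024 x 256 matrix has 262,144 parameters, while the rank-64 factors hold 1024*64 + 64*256 = 81,920 (about 31%); the actual trade-off between rank, parameter savings, and task performance is what the paper studies across recurrences and model components.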
