Semi-tied Units for Efficient Gating in LSTM and Highway Networks

06/18/2018
by   Chao Zhang, et al.

Gating is a key technique used by long short-term memory (LSTM) models to integrate information from multiple sources, and it has recently been applied to other models such as the highway network. Although gating is powerful, it is expensive in both computation and storage because each gating unit uses a separate full weight matrix. The cost can be severe because several gates are often used together, for example in an LSTM cell. This paper proposes a semi-tied unit (STU) approach to address this efficiency issue: one shared weight matrix replaces the separate matrices of all the units in the same layer. The approach is termed "semi-tied" because extra parameters are used to scale each of the shared output values separately. These extra scaling factors are associated with the network activation functions and result in the use of parameterised sigmoid, hyperbolic tangent, and rectified linear unit functions. Speech recognition experiments using British English multi-genre broadcast data showed that STUs can reduce the calculation and storage cost by a factor of three for highway networks and four for LSTMs, while giving word error rates similar to those of the original models.
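
The weight sharing described in the abstract can be illustrated with a small NumPy sketch of a highway layer. This is not the paper's exact formulation, which the abstract does not give. It assumes a highway layer with three units (a ReLU transform, a sigmoid transform gate, and a sigmoid carry gate), one scaling vector and one bias vector per unit, and square weight matrices. The names `highway_untied`, `highway_semi_tied`, `scales`, and `biases` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-z))


def highway_untied(x, params):
    """Baseline highway layer: each unit has its own full weight matrix."""
    h = np.maximum(0.0, params["W_h"] @ x + params["b_h"])  # transform
    t = sigmoid(params["W_t"] @ x + params["b_t"])          # transform gate
    c = sigmoid(params["W_c"] @ x + params["b_c"])          # carry gate
    return h * t + x * c


def highway_semi_tied(x, W, scales, biases):
    """Semi-tied sketch: one shared matrix, per-unit scaling vectors.

    Assumption: the per-unit scaling (and bias) enters inside each
    activation function, giving parameterised ReLU and sigmoid functions.
    """
    z = W @ x  # the only matrix-vector product in the layer
    h = np.maximum(0.0, scales["h"] * z + biases["h"])
    t = sigmoid(scales["t"] * z + biases["t"])
    c = sigmoid(scales["c"] * z + biases["c"])
    return h * t + x * c


def weight_counts(n, num_units=3):
    """Return (untied, semi-tied) parameter counts, ignoring biases."""
    untied = num_units * n * n
    semi_tied = n * n + num_units * n  # shared matrix plus scaling vectors
    return untied, semi_tied
```

Under these assumptions the layer performs one matrix-vector product instead of three, and the scaling vectors add only a number of parameters linear in the layer width. This is consistent with the roughly threefold saving the abstract reports for highway networks. An LSTM cell with four gated units sharing one matrix would correspond to the reported factor of four.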
