Recurrent Additive Networks

05/21/2017
by Kenton Lee, et al.

We introduce recurrent additive networks (RANs), a new gated RNN distinguished by its use of purely additive latent state updates. At every time step, the new state is computed as a gated component-wise sum of the input and the previous state, without any of the non-linearities commonly used in RNN transition dynamics. We formally show that RAN states are weighted sums of the input vectors, and that the gates contribute only to computing the weights of these sums. Despite this relatively simple functional form, experiments demonstrate that RANs perform on par with LSTMs on benchmark language modeling problems. This result shows that many of the non-linear computations in LSTMs and related networks are not essential, at least for the problems we consider, and suggests that the gates are doing more of the computational work than previously understood.
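The additive update described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the gate equations, so the sketch assumes sigmoid input and forget gates that depend on the current input and the previous state, and every name in it (`ran_step`, `run_ran`, `weighted_sum_form`, `make_params`, the `W_*` and `b_*` parameters) is hypothetical. It uses only the Python standard library and checks the property stated in the abstract, namely that the final state equals a weighted sum of the (linearly projected) inputs, with the weights determined by the gates.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def make_params(in_dim, dim, rng):
    """Random illustrative parameters; names and shapes are assumptions."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {
        "W_c": mat(dim, in_dim),   # linear projection of the input
        "W_ix": mat(dim, in_dim), "W_ic": mat(dim, dim), "b_i": [0.0] * dim,
        "W_fx": mat(dim, in_dim), "W_fc": mat(dim, dim), "b_f": [0.0] * dim,
    }


def gate(W_x, W_c, b, x, c_prev):
    """Assumed gate form: sigmoid of an affine function of input and previous state."""
    pre = [a + r + bias for a, r, bias in zip(matvec(W_x, x), matvec(W_c, c_prev), b)]
    return [sigmoid(z) for z in pre]


def ran_step(x, c_prev, p):
    """One additive update: c_t = i_t * content_t + f_t * c_prev (component-wise)."""
    content = matvec(p["W_c"], x)  # no tanh or other non-linearity on the content
    i = gate(p["W_ix"], p["W_ic"], p["b_i"], x, c_prev)
    f = gate(p["W_fx"], p["W_fc"], p["b_f"], x, c_prev)
    c = [ig * ct + fg * cp for ig, ct, fg, cp in zip(i, content, f, c_prev)]
    return c, content, i, f


def run_ran(xs, p, dim):
    """Run the recurrence from a zero state, recording content and gates per step."""
    c = [0.0] * dim
    history = []
    for x in xs:
        c, content, i, f = ran_step(x, c, p)
        history.append((content, i, f))
    return c, history


def weighted_sum_form(history, dim):
    """Rebuild the final state as sum_j (i_j * prod_{k>j} f_k) * content_j."""
    total = [0.0] * dim
    for j, (content, i, _) in enumerate(history):
        weight = list(i)
        for _, _, f_k in history[j + 1:]:
            weight = [w * fk for w, fk in zip(weight, f_k)]
        total = [t + w * ct for t, w, ct in zip(total, weight, content)]
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    in_dim, dim = 4, 3
    params = make_params(in_dim, dim, rng)
    inputs = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(6)]
    final_state, hist = run_ran(inputs, params, dim)
    explicit = weighted_sum_form(hist, dim)
    print("recurrent form:   ", final_state)
    print("weighted-sum form:", explicit)
```

Under these assumptions, the two printed vectors agree up to floating-point error: the gates appear only in the per-step weights, while the inputs enter the state linearly.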

research
05/09/2018

Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum

LSTMs were introduced to combat vanishing gradients in simple RNNs by au...
research
11/13/2019

Structured Sparsification of Gated Recurrent Neural Networks

Recently, a lot of techniques were developed to sparsify the weights of ...
research
02/26/2020

Refined Gate: A Simple and Effective Gating Mechanism for Recurrent Units

Recurrent neural network (RNN) has been widely studied in sequence learn...
research
03/26/2017

Learning Simpler Language Models with the Differential State Framework

Learning useful information across long time lags is a critical and diff...
research
01/31/2020

Gating creates slow modes and controls phase-space complexity in GRUs and LSTMs

Recurrent neural networks (RNNs) are powerful dynamical models for data ...
research
09/15/2021

Tied Reduced RNN-T Decoder

Previous works on the Recurrent Neural Network-Transducer (RNN-T) models...
research
07/29/2020

Theory of gating in recurrent neural networks

RNNs are popular dynamical models, used for processing sequential data. ...