Eigenvalue Normalized Recurrent Neural Networks for Short Term Memory

11/18/2019
by Kyle Helfrich et al.

Several variants of recurrent neural networks (RNNs) with orthogonal or unitary recurrent matrices have recently been developed to mitigate the vanishing/exploding gradient problem and to model long-term dependencies of sequences. However, with the eigenvalues of the recurrent matrix on the unit circle, the recurrent state retains all input information, which may unnecessarily consume model capacity. In this paper, we address this issue by proposing an architecture that expands upon an orthogonal/unitary RNN with a state that is generated by a recurrent matrix with eigenvalues in the unit disc. Any input to this state dissipates in time and is replaced with new inputs, simulating short-term memory. A gradient descent algorithm is derived for learning such a recurrent matrix. The resulting method, called the Eigenvalue Normalized RNN (ENRNN), is shown to be highly competitive in several experiments.
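The paper's exact parameterization and the derivation of its gradient update are in the full text; as a rough, non-authoritative sketch of the core idea, the NumPy snippet below rescales a recurrent matrix by its spectral radius so that every eigenvalue lies strictly inside the unit disc. The function name eigenvalue_normalize and the target radius rho are illustrative assumptions, not the authors' algorithm or API.

import numpy as np

def eigenvalue_normalize(W, rho=0.9):
    # Illustrative sketch (not the ENRNN training rule): scale W so its
    # spectral radius equals rho < 1, placing all eigenvalues strictly
    # inside the unit disc.
    spectral_radius = np.max(np.abs(np.linalg.eigvals(W)))
    return (rho / spectral_radius) * W

rng = np.random.default_rng(0)
W = eigenvalue_normalize(rng.standard_normal((64, 64)))

# With all eigenvalues inside the unit disc, repeated application of W
# contracts the hidden state, so earlier inputs decay geometrically:
h = rng.standard_normal(64)
for _ in range(100):
    h = W @ h
print(np.linalg.norm(h))  # near zero: earlier information has dissipated

This contraction is the short-term-memory behavior the abstract describes: information injected at earlier time steps fades and is replaced by newer inputs, rather than being preserved indefinitely as with orthogonal/unitary recurrent matrices, whose eigenvalues sit on the unit circle.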


