Dynamic Analysis and an Eigen Initializer for Recurrent Neural Networks

07/28/2023
by Ran Dou, et al.

In recurrent neural networks, learning long-term dependencies is the main difficulty due to the vanishing and exploding gradient problem. Many researchers have worked on this issue and proposed a variety of algorithms. Although these algorithms have achieved great success, how information decays in the hidden state remains an open question. In this paper, we study the dynamics of the hidden state in recurrent neural networks and propose a new perspective for analyzing the hidden state space based on an eigen decomposition of the recurrent weight matrix. We begin the analysis with a linear state space model and explain how activation functions help preserve information. Based on the eigen analysis, we provide an explanation for long-term dependency and point out that the eigenvalues behave differently in regression tasks and classification tasks. From observations on well-trained recurrent neural networks, we propose a new initialization method that consistently improves performance and can be applied to vanilla RNNs, LSTMs, and GRUs. We evaluate it on multiple datasets, including the Tomita grammars, pixel-by-pixel MNIST, and the Multi30k machine translation dataset, where it outperforms the Xavier and Kaiming initializers as well as RNN-specific initializers such as IRNN and sp-RNN on several tasks.
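The paper's exact initializer is not reproduced in this abstract. As a rough, assumption-labeled sketch in NumPy, the snippet below only illustrates the two ideas mentioned above: (1) in a linear recurrence h_t = W h_{t-1}, information decays along the eigendirections of W at a rate set by the eigenvalue magnitudes, and (2) an eigen-based initializer could place the eigenvalues of the recurrent weight near the unit circle so that no direction vanishes or explodes quickly. The helper names (`decay_along_eigendirections`, `eigen_init`) and the chosen eigenvalue range are illustrative assumptions, not the authors' method.

```python
# Sketch only: illustrates eigen analysis of a linear recurrence and a
# hypothetical eigenvalue-controlled initializer (not the paper's algorithm).
import numpy as np

def decay_along_eigendirections(W, h0, steps):
    """Project h0 onto the eigenbasis of W and track how each component
    scales as |lambda|^t after `steps` linear updates h_t = W h_{t-1}."""
    eigvals, eigvecs = np.linalg.eig(W)
    coeffs = np.linalg.solve(eigvecs, h0)        # h0 expressed in the eigenbasis
    return np.abs(coeffs) * np.abs(eigvals) ** steps

def eigen_init(hidden_size, spectrum_radius=1.0, seed=0):
    """Hypothetical initializer: real symmetric W whose eigenvalues lie close
    to +/- spectrum_radius, keeping long-term information from decaying fast."""
    rng = np.random.default_rng(seed)
    # Random orthogonal eigenbasis via QR decomposition.
    Q, _ = np.linalg.qr(rng.standard_normal((hidden_size, hidden_size)))
    # Eigenvalue magnitudes drawn near the chosen spectral radius.
    lam = spectrum_radius * np.sign(rng.standard_normal(hidden_size)) * \
          rng.uniform(0.9, 1.0, hidden_size)
    return (Q * lam) @ Q.T                        # W = Q diag(lam) Q^T

if __name__ == "__main__":
    n = 64
    W = eigen_init(n)
    h0 = np.random.default_rng(1).standard_normal(n)
    # Components with |lambda| near 1 retain information after many steps.
    print(decay_along_eigendirections(W, h0, steps=100)[:5])
```

A symmetric construction is used here purely so the spectrum is easy to control; the paper analyzes general trained recurrent weights, and its initializer may differ.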


