Complex Unitary Recurrent Neural Networks using Scaled Cayley Transform

Recurrent neural networks (RNNs) have been successfully used on a wide range of sequential data problems. A well-known difficulty in using RNNs is the vanishing or exploding gradient problem. Recently, several RNN architectures have tried to mitigate this issue by maintaining an orthogonal or unitary recurrent weight matrix. One such architecture is the scaled Cayley orthogonal recurrent neural network (scoRNN), which parameterizes the orthogonal recurrent weight matrix through a scaled Cayley transform. This parametrization contains a diagonal scaling matrix whose entries are positive or negative one and therefore cannot be optimized by gradient descent. Thus, the scaling matrix is fixed before training, and a hyperparameter is introduced to tune the matrix for each particular task. In this paper, we develop a unitary RNN architecture based on a complex scaled Cayley transform. Unlike the real orthogonal case, the transformation uses a diagonal scaling matrix whose entries lie on the complex unit circle; these entries can be optimized using gradient descent, so the hyperparameter no longer needs to be tuned. We also provide an analysis of a potential issue with the modReLU activation function, which is used in our work and in several other unitary RNNs. In the experiments conducted, the scaled Cayley unitary recurrent neural network (scuRNN) achieves comparable or better results than scoRNN and other unitary RNNs without fixing the scaling matrix.
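
The abstract does not spell out the transform, so the following is only a minimal NumPy sketch of what a complex scaled Cayley parametrization might look like. It assumes the form W = (I + A)^{-1}(I - A)D used in the scoRNN formulation, with A skew-Hermitian and D = diag(e^{i theta_j}), and the usual modReLU definition (z/|z|) ReLU(|z| + b). The function names `scaled_cayley` and `modrelu`, the matrix size, and the zero-input guard are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def scaled_cayley(A, theta):
    """Return W = (I + A)^{-1} (I - A) D, where D = diag(exp(i * theta)).

    A is assumed to be skew-Hermitian (A^* = -A), so I + A is invertible
    and the resulting W is unitary.
    """
    n = A.shape[0]
    identity = np.eye(n, dtype=complex)
    # Entries of D lie on the complex unit circle and vary smoothly with theta.
    D = np.diag(np.exp(1j * theta))
    # Solve (I + A) X = (I - A) rather than forming an explicit inverse.
    return np.linalg.solve(identity + A, identity - A) @ D


def modrelu(z, b):
    """modReLU: shrink or grow the modulus by the bias b, keep the phase."""
    mag = np.abs(z)
    # Guard against division by zero at z = 0 (a convention assumed here).
    safe_mag = np.where(mag > 0, mag, 1.0)
    return np.maximum(mag + b, 0.0) * z / safe_mag


rng = np.random.default_rng(0)
n = 4  # hypothetical hidden size, chosen only for illustration
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = M - M.conj().T                      # skew-Hermitian by construction
theta = rng.uniform(0.0, 2.0 * np.pi, n)

W = scaled_cayley(A, theta)
# Distance of W^* W from the identity; should be at rounding-error level.
unitarity_error = np.linalg.norm(W.conj().T @ W - np.eye(n))
```

Because each diagonal entry e^{i theta_j} is a differentiable function of a real angle, theta can in principle be trained alongside A, which is the contrast the abstract draws with the fixed ±1 scaling matrix of the real orthogonal case.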

research
07/29/2017

Orthogonal Recurrent Neural Networks with Scaled Cayley Transform

Recurrent Neural Networks (RNNs) are designed to handle sequential data ...
research
11/18/2019

Eigenvalue Normalized Recurrent Neural Networks for Short Term Memory

Several variants of recurrent neural networks (RNNs) with orthogonal or ...
research
04/18/2017

Diagonal RNNs in Symbolic Music Modeling

In this paper, we propose a new Recurrent Neural Network (RNN) architect...
research
11/11/2019

Constructing Gradient Controllable Recurrent Neural Networks Using Hamiltonian Dynamics

Recurrent neural networks (RNNs) have gained a great deal of attention i...
research
10/08/2020

A Fully Tensorized Recurrent Neural Network

Recurrent neural networks (RNNs) are powerful tools for sequential model...
research
11/04/2015

adaQN: An Adaptive Quasi-Newton Algorithm for Training RNNs

Recurrent Neural Networks (RNNs) are powerful models that achieve except...
research
02/14/2023

Convolutional unitary or orthogonal recurrent neural networks

Recurrent neural networks are extremely powerful yet hard to train. One ...
