Orthogonal Recurrent Neural Networks with Scaled Cayley Transform

07/29/2017
by Kyle Helfrich, et al.

Recurrent Neural Networks (RNNs) are designed to handle sequential data but suffer from vanishing or exploding gradients. Unitary Recurrent Neural Networks (uRNNs) have recently been used to address this issue and, in some cases, exceed the capabilities of Long Short-Term Memory networks (LSTMs). We propose a simpler and novel update scheme that maintains orthogonal recurrent weight matrices without using complex-valued matrices. This is done by parametrizing the recurrent weight matrix with a skew-symmetric matrix through the Cayley transform. Such a parametrization cannot represent matrices that have negative one as an eigenvalue, but this limitation is overcome by scaling the recurrent weight matrix by a diagonal matrix whose entries are ones and negative ones. The proposed training scheme involves a straightforward gradient calculation and update step. In several experiments, the proposed scaled Cayley orthogonal recurrent neural network (scoRNN) achieves superior results with fewer trainable parameters than other unitary RNNs.
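
The parametrization described in the abstract can be illustrated with a small sketch. It assumes the scaled Cayley form W = (I + A)^{-1} (I - A) D, with A skew-symmetric and D diagonal with entries of ±1. The helper names (`matmul`, `transpose`, `solve`, `scaled_cayley`) and the sample values are hypothetical and chosen for illustration; this is not the authors' implementation, and it covers only the construction of W, not the gradient or update step.

```python
# Minimal sketch of a scaled Cayley parametrization, standard library only.
# Assumed form: W = (I + A)^{-1} (I - A) D, A skew-symmetric, D = diag(+-1).

def matmul(X, Y):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def solve(M, B):
    """Solve M X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        # Pick the row with the largest pivot candidate for stability.
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def scaled_cayley(A, d):
    """Return W = (I + A)^{-1} (I - A) D for skew-symmetric A and D = diag(d)."""
    n = len(A)
    eye = lambda i, j: 1.0 if i == j else 0.0
    i_plus_a = [[eye(i, j) + A[i][j] for j in range(n)] for i in range(n)]
    i_minus_a = [[eye(i, j) - A[i][j] for j in range(n)] for i in range(n)]
    Q = solve(i_plus_a, i_minus_a)  # unscaled Cayley transform of A
    # Right-multiplying by D scales column j by d[j].
    return [[Q[i][j] * d[j] for j in range(n)] for i in range(n)]

# Hypothetical sample values: a 3x3 skew-symmetric A and one negative entry in D.
A = [[0.0, 0.5, -0.2],
     [-0.5, 0.0, 0.3],
     [0.2, -0.3, 0.0]]
d = [1.0, 1.0, -1.0]

W = scaled_cayley(A, d)
WtW = matmul(transpose(W), W)  # should be close to the identity matrix
```

Because A is skew-symmetric, I + A is always invertible, so the solve step is well defined. Setting A to the zero matrix gives W = D, which shows how the diagonal scaling reaches matrices with negative one as an eigenvalue that the unscaled transform cannot represent.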

research
11/09/2018

Complex Unitary Recurrent Neural Networks using Scaled Cayley Transform

Recurrent neural networks (RNNs) have been successfully used on a wide r...
research
08/12/2022

Orthogonal Gated Recurrent Unit with Neumann-Cayley Transformation

In recent years, using orthogonal matrices has been shown to be a promis...
research
02/14/2023

Convolutional unitary or orthogonal recurrent neural networks

Recurrent neural networks are extremely powerful yet hard to train. One ...
research
12/31/2021

Training and Generating Neural Networks in Compressed Weight Space

The inputs and/or outputs of some neural nets are weight matrices of oth...
research
10/31/2016

Full-Capacity Unitary Recurrent Neural Networks

Recurrent neural networks are powerful models for processing sequential ...
research
04/18/2020

CWY Parametrization for Scalable Learning of Orthogonal and Stiefel Matrices

In this paper we propose a new approach for optimization over orthogonal...
research
11/20/2015

Unitary Evolution Recurrent Neural Networks

Recurrent neural networks (RNNs) are notoriously difficult to train. Whe...
