Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs

12/15/2016
by Li Jing, et al.

Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data. This approach appears particularly promising for Recurrent Neural Networks (RNNs). In this work, we present a new architecture for implementing an Efficient Unitary Neural Network (EUNN); its main advantages can be summarized as follows. First, the representation capacity of the unitary space in an EUNN is fully tunable, ranging from a subspace of SU(N) to the entire unitary space. Second, the computational complexity for training an EUNN is merely O(1) per parameter. Finally, we test the performance of EUNNs on the standard copying task, the pixel-permuted MNIST digit recognition benchmark, and the Speech Prediction Test (TIMIT). We find that our architecture significantly outperforms both other state-of-the-art unitary RNNs and the LSTM architecture, in terms of final performance and/or wall-clock training speed. EUNNs are thus promising alternatives to RNNs and LSTMs for a wide variety of applications.
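
For intuition about how such a tunable unitary parameterization can work, the sketch below builds an N-dimensional unitary map from L layers of 2-by-2 Givens rotations followed by a diagonal phase layer: each angle is an independent parameter touched in O(1) work, and increasing L enlarges the reachable subspace of the unitary group. This is a minimal illustrative sketch of the general idea, not the paper's exact EUNN parameterization; the function names and the alternating pairing scheme are assumptions made for this example.

```python
import numpy as np

def rotation_layer(x, thetas, offset):
    """Apply one layer of independent 2x2 Givens rotations to x.

    Pairs (offset, offset+1), (offset+2, offset+3), ... are rotated;
    each rotation has a single angle parameter, so updating one
    parameter costs O(1) work.
    """
    y = x.copy()
    n = len(x)
    for k, theta in enumerate(thetas):
        i = offset + 2 * k
        j = i + 1
        if j >= n:
            break
        c, s = np.cos(theta), np.sin(theta)
        y[i], y[j] = c * x[i] - s * x[j], s * x[i] + c * x[j]
    return y

def eunn_apply(x, layers, phases):
    """Apply a tunable-depth unitary map: L rotation layers + final phases.

    More layers span more of the unitary group; this alternating pairing
    is one simple (assumed) arrangement, not the paper's exact scheme.
    """
    for depth, thetas in enumerate(layers):
        x = rotation_layer(x, thetas, offset=depth % 2)
    return np.exp(1j * phases) * x  # diagonal phase layer

# Example: N = 8, capacity tuned by the number of layers L.
N, L = 8, 4
rng = np.random.default_rng(0)
layers = [rng.uniform(0.0, 2 * np.pi, size=N // 2) for _ in range(L)]
phases = rng.uniform(0.0, 2 * np.pi, size=N)

x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
y = eunn_apply(x, layers, phases)
print(np.allclose(np.linalg.norm(x), np.linalg.norm(y)))  # True: norm preserved
```

Because every factor in the product is unitary, the composed map preserves vector norms exactly, which is the property that prevents gradients from exploding or vanishing through time.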

