Towards Non-saturating Recurrent Units for Modelling Long-term Dependencies

01/22/2019
by   Sarath Chandar, et al.

Modelling long-term dependencies is a central challenge for recurrent neural networks, primarily because gradients vanish during training as the sequence length increases. Gradients can be attenuated by transition operators, and are attenuated or dropped by activation functions. Canonical architectures such as the LSTM alleviate this issue by skipping information through a memory mechanism. We propose a new recurrent architecture, the Non-saturating Recurrent Unit (NRU), which relies on a memory mechanism but forgoes both saturating activation functions and saturating gates, in order to further alleviate vanishing gradients. Across a series of synthetic and real-world tasks, with and without long-term dependencies, the NRU is the only model that consistently ranks among the top two performers when compared against a range of other architectures.
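The core idea of a non-saturating unit can be illustrated with a minimal sketch: the hidden state and a flat memory vector are updated with ReLU (non-saturating) transformations and purely additive writes, with no sigmoid or tanh gates anywhere in the update path. This is a hypothetical simplification for illustration only; the parameter shapes and update rules below are assumptions, not the paper's exact NRU equations.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(x):
    # Non-saturating activation: identity for positive inputs, so
    # gradients are not squashed toward zero as with tanh/sigmoid.
    return np.maximum(0.0, x)


class SimpleNRUCell:
    """Illustrative, simplified non-saturating recurrent cell.

    Keeps a hidden state h and a flat memory vector m. All updates use
    ReLU and additive memory writes; there are no saturating gates.
    (Hypothetical sketch, not the published NRU formulation.)
    """

    def __init__(self, input_size, hidden_size, memory_size):
        s = 1.0 / np.sqrt(hidden_size)
        self.W_x = rng.uniform(-s, s, (hidden_size, input_size))
        self.W_h = rng.uniform(-s, s, (hidden_size, hidden_size))
        self.W_m = rng.uniform(-s, s, (hidden_size, memory_size))
        # Linear maps producing additive write and erase vectors.
        self.W_write = rng.uniform(-s, s, (memory_size, hidden_size))
        self.W_erase = rng.uniform(-s, s, (memory_size, hidden_size))

    def step(self, x, h, m):
        # Hidden update: ReLU instead of tanh.
        h_new = relu(self.W_x @ x + self.W_h @ h + self.W_m @ m)
        # Memory update: additive write/erase, no gating nonlinearity.
        m_new = m + self.W_write @ h_new - self.W_erase @ h_new
        return h_new, m_new


cell = SimpleNRUCell(input_size=4, hidden_size=8, memory_size=16)
h, m = np.zeros(8), np.zeros(16)
for t in range(5):
    h, m = cell.step(rng.standard_normal(4), h, m)
print(h.shape, m.shape)  # (8,) (16,)
```

Because the memory update is additive rather than gated, gradient paths through `m` avoid repeated multiplication by saturating gate activations, which is the intuition the abstract describes.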

Related research

- High Order Recurrent Neural Networks for Acoustic Modelling (02/22/2018): "Vanishing long-term gradients are a major issue in training standard rec..."
- Learning Long Term Dependencies via Fourier Recurrent Units (03/17/2018): "It is a known fact that training recurrent neural networks for tasks tha..."
- Sampling-based Gradient Regularization for Capturing Long-Term Dependencies in Recurrent Neural Networks (06/24/2016): "Vanishing (and exploding) gradients effect is a common problem for recur..."
- PredRNN++: Towards A Resolution of the Deep-in-Time Dilemma in Spatiotemporal Predictive Learning (04/17/2018): "We present PredRNN++, an improved recurrent network for video predictive..."
- Oscillatory Fourier Neural Network: A Compact and Efficient Architecture for Sequential Processing (09/14/2021): "Tremendous progress has been made in sequential processing with the rece..."
- Deep Recurrent Neural Networks for Sequential Phenotype Prediction in Genomics (11/09/2015): "In analyzing of modern biological data, we are often dealing with ill-po..."
- Explain My Surprise: Learning Efficient Long-Term Memory by Predicting Uncertain Outcomes (07/27/2022): "In many sequential tasks, a model needs to remember relevant events from..."
