Sampling-based Gradient Regularization for Capturing Long-Term Dependencies in Recurrent Neural Networks

06/24/2016
by Artem Chernodub, et al.

The vanishing (and exploding) gradient effect is a common problem for recurrent neural networks with nonlinear activation functions that rely on backpropagation to compute derivatives; deep feedforward networks with many hidden layers suffer from it as well. In this paper we propose a novel, universal technique that keeps the norm of the gradient within a suitable range. We construct a way to estimate each training example's contribution to the norm of the long-term components of the target function's gradient. Using this subroutine, we build mini-batches for stochastic gradient descent (SGD) training that lead to high performance and accuracy of the trained network even on very complex tasks. We provide a straightforward mathematical estimate of a mini-batch's impact on the gradient norm and prove its correctness theoretically. To evaluate the framework experimentally, we use synthetic benchmarks designed to test the ability of RNNs to capture long-term dependencies. Our network can detect links between events in a (temporal) sequence at ranges of approximately 100 time steps and longer.
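To make the idea concrete, below is a minimal PyTorch-style sketch (not the authors' code) of the sampling scheme described in the abstract: each training sequence is scored by the norm of the long-term part of its backpropagated gradient, and mini-batches are then drawn with probability proportional to that score. The helper names `long_term_grad_norm` and `sample_minibatch`, the vanilla tanh RNN, and the truncation horizon `k` are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

def long_term_grad_norm(rnn, readout, x, y, k):
    """Norm of the loss gradient w.r.t. hidden states that lie more than
    k steps before the end of the sequence (the 'long-term' components)."""
    h = torch.zeros(1, rnn.hidden_size)
    states = []
    for t in range(x.size(0)):                        # manual unroll of a vanilla tanh RNN
        h = torch.tanh(x[t].unsqueeze(0) @ rnn.weight_ih_l0.T
                       + h @ rnn.weight_hh_l0.T
                       + rnn.bias_ih_l0 + rnn.bias_hh_l0)
        states.append(h)
    loss = nn.functional.cross_entropy(readout(states[-1]), y.unsqueeze(0))
    early = states[: max(0, len(states) - k)]         # hidden states far from the target
    if not early:
        return 0.0
    grads = torch.autograd.grad(loss, early)          # gradient of the loss w.r.t. early states
    return torch.sqrt(sum(g.norm() ** 2 for g in grads)).item()

def sample_minibatch(scores, batch_size):
    """Sample example indices with probability proportional to their
    long-term gradient contribution (plus a small floor for coverage)."""
    p = torch.tensor(scores) + 1e-6
    p = p / p.sum()
    return torch.multinomial(p, batch_size, replacement=False)

# Usage sketch on toy data: score each example, then draw an SGD mini-batch
# biased toward examples whose long-term gradient components are still alive.
rnn = nn.RNN(input_size=3, hidden_size=8)
readout = nn.Linear(8, 2)
data = [(torch.randn(120, 3), torch.tensor(1)) for _ in range(32)]
scores = [long_term_grad_norm(rnn, readout, x, y, k=100) for x, y in data]
batch_idx = sample_minibatch(scores, batch_size=8)
```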


Related research

01/22/2019 - Towards Non-saturating Recurrent Units for Modelling Long-term Dependencies
01/31/2017 - On orthogonality and learning recurrent networks with long term dependencies
03/17/2018 - Learning Long Term Dependencies via Fourier Recurrent Units
12/28/2017 - Gradient Regularization Improves Accuracy of Discriminative Models
03/01/2018 - Learning Longer-term Dependencies in RNNs with Auxiliary Losses
07/29/2019 - RNNbow: Visualizing Learning via Backpropagation Gradients in Recurrent Neural Networks
08/22/2019 - RNNs Evolving in Equilibrium: A Solution to the Vanishing and Exploding Gradients
