A Simple Way to Initialize Recurrent Networks of Rectified Linear Units

04/03/2015
by Quoc V. Le, et al.

Learning long-term dependencies in recurrent networks is difficult due to vanishing and exploding gradients. To overcome this difficulty, researchers have developed sophisticated optimization techniques and network architectures. In this paper, we propose a simpler solution that uses recurrent neural networks composed of rectified linear units. Key to our solution is the use of the identity matrix, or a scaled version of it, to initialize the recurrent weight matrix. We find that our solution is comparable to LSTM on our four benchmarks: two toy problems involving long-range temporal structure, a large language modeling problem, and a benchmark speech recognition problem.
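The recipe described in the abstract (the "IRNN") is straightforward to reproduce. Below is a minimal sketch using PyTorch as an illustrative framework; the framework choice, the layer sizes, and the small-Gaussian scale for the input-to-hidden weights are assumptions for illustration and are not specified on this page. Only the identity (or scaled identity) recurrent weights and zero biases come from the abstract.

    import torch
    import torch.nn as nn

    # A single-layer RNN with ReLU units, as described in the abstract.
    rnn = nn.RNN(input_size=32, hidden_size=128, nonlinearity='relu', batch_first=True)

    with torch.no_grad():
        # Key idea: initialize the recurrent (hidden-to-hidden) weight matrix to the
        # identity (or a scaled version of it) and the biases to zero.
        nn.init.eye_(rnn.weight_hh_l0)   # or: rnn.weight_hh_l0.copy_(0.5 * torch.eye(128))
        nn.init.zeros_(rnn.bias_hh_l0)
        nn.init.zeros_(rnn.bias_ih_l0)
        # Input-to-hidden weights get a small random initialization (assumed here;
        # the abstract only specifies the recurrent matrix).
        nn.init.normal_(rnn.weight_ih_l0, mean=0.0, std=0.001)

    # Sanity check: with identity recurrent weights, zero biases, and zero input,
    # a non-negative hidden state is carried forward unchanged through the ReLU.
    x = torch.zeros(1, 10, 32)           # (batch, time, features)
    h0 = torch.ones(1, 1, 128)           # (layers, batch, hidden)
    out, hn = rnn(x, h0)
    print(torch.allclose(hn, h0))        # True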


Related research

11/12/2015: Improving performance of recurrent neural network with relu nonlinearity
In recent years significant progress has been made in successfully train...

12/24/2014: Learning Longer Memory in Recurrent Neural Networks
Recurrent neural network is a powerful model that learns temporal patter...

01/25/2022: Do Neural Networks for Segmentation Understand Insideness?
The insideness problem is an aspect of image segmentation that consists ...

03/17/2018: Learning Long Term Dependencies via Fourier Recurrent Units
It is a known fact that training recurrent neural networks for tasks tha...

09/08/2017: Training RNNs as Fast as CNNs
Common recurrent neural network architectures scale poorly due to the in...

05/30/2019: A Lightweight Recurrent Network for Sequence Modeling
Recurrent networks have achieved great success on various sequential tas...

11/20/2015: Unitary Evolution Recurrent Neural Networks
Recurrent neural networks (RNNs) are notoriously difficult to train. Whe...
