A Simple Way to Initialize Recurrent Networks of Rectified Linear Units

04/03/2015
by Quoc V. Le, et al.

Learning long-term dependencies in recurrent networks is difficult due to vanishing and exploding gradients. To overcome this difficulty, researchers have developed sophisticated optimization techniques and network architectures. In this paper, we propose a simpler solution that uses recurrent neural networks composed of rectified linear units. Key to our solution is the use of the identity matrix or its scaled version to initialize the recurrent weight matrix. We find that our solution is comparable to LSTM on our four benchmarks: two toy problems involving long-range temporal structures, a large language modeling problem, and a benchmark speech recognition problem.
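As a concrete illustration of the initialization described in the abstract, here is a minimal sketch in PyTorch (not taken from the paper): a vanilla ReLU RNN whose recurrent weight matrix starts as the identity, or a scaled identity, with biases set to zero. The layer sizes and the scale factor below are illustrative assumptions, not values reported by the authors.

import torch
import torch.nn as nn

def make_relu_rnn(input_size, hidden_size, scale=1.0):
    """Vanilla ReLU RNN with the recurrent weights initialized to a
    (scaled) identity matrix and the biases initialized to zero."""
    rnn = nn.RNN(input_size, hidden_size, nonlinearity="relu", batch_first=True)
    with torch.no_grad():
        # Recurrent weight matrix: identity, optionally scaled down when the
        # plain identity is too strong a prior for the task (scale assumed here).
        rnn.weight_hh_l0.copy_(scale * torch.eye(hidden_size))
        # Recurrent and input-to-hidden biases: zero.
        rnn.bias_hh_l0.zero_()
        rnn.bias_ih_l0.zero_()
    return rnn

# Usage: a batch of 32 sequences, each 100 steps of 8 features.
rnn = make_relu_rnn(input_size=8, hidden_size=64, scale=1.0)
x = torch.randn(32, 100, 8)
outputs, h_last = rnn(x)   # outputs: (32, 100, 64); h_last: (1, 32, 64)

At initialization, the recurrence simply copies the hidden state forward (ReLU composed with the identity map leaves non-negative activations unchanged), which is what makes long-range information easier to preserve early in training.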

