Improving performance of recurrent neural network with ReLU nonlinearity

11/12/2015
by Sachin S. Talathi, et al.

In recent years, significant progress has been made in successfully training recurrent neural networks (RNNs) on sequence learning problems involving long-range temporal dependencies. The progress has been made on three fronts: (a) algorithmic improvements involving sophisticated optimization techniques, (b) network design involving complex hidden layer nodes and specialized recurrent layer connections, and (c) weight initialization methods. In this paper, we focus on the recently proposed identity-matrix initialization for the recurrent weights in an RNN. This initialization is specifically proposed for hidden nodes with Rectified Linear Unit (ReLU) nonlinearity. We offer a simple dynamical systems perspective on the weight initialization process, which allows us to propose a modified weight initialization strategy. We show that this initialization technique leads to successfully training RNNs composed of ReLUs. We demonstrate that our proposal produces comparable or better solutions for three toy problems involving long-range temporal structure: the addition problem, the multiplication problem, and the MNIST classification problem using a sequence of pixels. In addition, we present results for a benchmark action recognition problem.
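The abstract contrasts the plain identity initialization for the recurrent weights with the paper's modified strategy derived from a dynamical systems view. The sketch below is only an illustration of both ideas: the identity initializer matches what the abstract describes, while the second initializer assumes the modification amounts to a positive semidefinite recurrent matrix rescaled so its largest eigenvalue is 1; the function names and the exact construction are hypothetical, not the paper's verbatim recipe.

```python
import numpy as np

def identity_recurrent_init(n_hidden):
    """Identity initialization for the recurrent weights (the idea the
    abstract refers to): W_hh = I, recurrent bias = 0."""
    return np.eye(n_hidden)

def normalized_psd_recurrent_init(n_hidden, seed=0):
    """Hypothetical sketch of a modified initialization: build a positive
    semidefinite matrix from a random Gaussian matrix and rescale it so
    its largest eigenvalue is 1, keeping the linearized ReLU dynamics from
    exploding or collapsing. The paper's exact recipe may differ."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n_hidden, n_hidden)) / np.sqrt(n_hidden)
    W = A @ A.T / n_hidden                 # positive semidefinite
    W /= np.linalg.eigvalsh(W).max()       # largest eigenvalue -> 1
    return W

# One ReLU-RNN step: h_t = max(0, W_hh h_{t-1} + W_xh x_t + b)
n_hidden, n_in = 100, 1
W_hh = normalized_psd_recurrent_init(n_hidden)
W_xh = np.random.default_rng(1).standard_normal((n_hidden, n_in)) * 0.01
b = np.zeros(n_hidden)
h, x = np.zeros(n_hidden), np.ones(n_in)
h = np.maximum(0.0, W_hh @ h + W_xh @ x + b)
```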


Related research

- A Simple Way to Initialize Recurrent Networks of Rectified Linear Units (04/03/2015): Learning long term dependencies in recurrent networks is difficult due t...
- Resurrecting Recurrent Neural Networks for Long Sequences (03/11/2023): Recurrent Neural Networks (RNNs) offer fast inference on long sequences ...
- Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations (05/23/2016): We investigate the parameter-space geometry of recurrent neural networks...
- Inferring Dynamical Systems with Long-Range Dependencies through Line Attractor Regularization (10/08/2019): Vanilla RNN with ReLU activation have a simple structure that is amenabl...
- Supervised level-wise pretraining for recurrent neural network initialization in multi-class classification (11/04/2019): Recurrent Neural Networks (RNNs) can be seriously impacted by the initia...
- Multi-Zone Unit for Recurrent Neural Networks (11/17/2019): Recurrent neural networks (RNNs) have been widely used to deal with sequ...
- Preventing RNN from Using Sequence Length as a Feature (12/16/2022): Recurrent neural networks are deep learning topologies that can be train...
