Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN

03/13/2018
by   Shuai Li, et al.

Recurrent neural networks (RNNs) have been widely used for processing sequential data. However, RNNs are commonly difficult to train due to the well-known gradient vanishing and exploding problems, and they struggle to learn long-term patterns. Long short-term memory (LSTM) and gated recurrent unit (GRU) were developed to address these problems, but the use of hyperbolic tangent and sigmoid activation functions results in gradient decay over layers. Consequently, constructing an efficiently trainable deep network is challenging. In addition, all the neurons in an RNN layer are entangled together and their behaviour is hard to interpret. To address these problems, a new type of RNN, referred to as independently recurrent neural network (IndRNN), is proposed in this paper, where neurons in the same layer are independent of each other and are connected across layers. We have shown that an IndRNN can be easily regulated to prevent the gradient exploding and vanishing problems while allowing the network to learn long-term dependencies. Moreover, an IndRNN can work with non-saturated activation functions such as relu (rectified linear unit) and still be trained robustly. Multiple IndRNNs can be stacked to construct a network that is deeper than existing RNNs. Experimental results have shown that the proposed IndRNN is able to process very long sequences (over 5000 time steps), can be used to construct very deep networks (21 layers used in the experiments) and can still be trained robustly. Better performance has been achieved on various tasks by using IndRNNs compared with the traditional RNN and LSTM.
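The key idea described in the abstract is that each neuron keeps only a scalar recurrent weight on its own previous state, i.e. the recurrence is element-wise, h_t = σ(W x_t + u ⊙ h_{t−1} + b), rather than a full hidden-to-hidden matrix. The sketch below is a minimal illustrative implementation of one such step in NumPy; the function name `indrnn_cell`, the toy sizes, and the random initialization are assumptions for demonstration, not the authors' released code.

```python
import numpy as np

def indrnn_cell(x_t, h_prev, W, u, b):
    """One IndRNN step (illustrative): each hidden unit has its own scalar
    recurrent weight u[n], so the recurrence u * h_prev is element-wise and
    neurons in the same layer do not interact through the recurrence."""
    return np.maximum(0.0, W @ x_t + u * h_prev + b)  # relu activation

# Toy usage (sizes chosen for illustration): 4 inputs, 8 hidden units, 5 steps.
rng = np.random.default_rng(0)
input_size, hidden_size, T = 4, 8, 5
W = rng.normal(scale=0.1, size=(hidden_size, input_size))  # input-to-hidden weights
u = rng.uniform(-1.0, 1.0, size=hidden_size)               # per-neuron recurrent weights
b = np.zeros(hidden_size)

h = np.zeros(hidden_size)
for t in range(T):
    x_t = rng.normal(size=input_size)
    h = indrnn_cell(x_t, h, W, u, b)
print(h.shape)  # (8,)
```

Because the recurrence is a per-neuron scalar, the gradient through time of each unit depends only on powers of its own |u[n]|, which is what makes the regulation mentioned in the abstract possible (for example, clipping each |u[n]| to a bound chosen from the desired memory length); the exact constraint used in the paper is given in the full text.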


