Highway State Gating for Recurrent Highway Networks: improving information flow through time

05/23/2018
by Ron Shoham, et al.

Recurrent Neural Networks (RNNs) play a major role in the field of sequential learning, and have outperformed traditional algorithms on many benchmarks. Training deep RNNs remains a challenge, however, and most state-of-the-art models use a transition depth of only 2-4 layers. Recurrent Highway Networks (RHNs) were introduced to tackle this issue and have achieved state-of-the-art performance on several benchmarks with a depth of 10 layers. However, this architecture suffers from a bottleneck: its performance ceases to improve when more layers are added. In this work, we analyze the causes of this behavior and postulate that the main source is the way information flows through time. We introduce a simple, novel variant of the RHN cell, called Highway State Gating (HSG), which allows more layers to be added while performance continues to improve. A gating mechanism on the state lets the network "choose" whether to pass information directly through time or to gate it. This mechanism also allows the gradient to back-propagate directly through time, and therefore yields slightly faster convergence. We use the Penn Treebank (PTB) dataset as a platform for an empirical proof of concept. Results show that the improvement due to Highway State Gating holds at all depths and grows as the depth increases.
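The abstract does not spell out the HSG equations, so the following PyTorch sketch only illustrates the kind of convex state gate described above: a learned gate that mixes the previous hidden state with the candidate state produced by the deep RHN transition. The module name HighwayStateGate, the gate parameterization, and the bias initialization are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn as nn

class HighwayStateGate(nn.Module):
    """Hedged sketch of a Highway State Gating (HSG) step.

    Mixes the previous hidden state s_prev with the candidate state
    s_cand produced by a deep RHN transition, so information (and the
    gradient) can also pass directly through time.
    """

    def __init__(self, hidden_size: int):
        super().__init__()
        # Gate is computed from both the previous and the candidate state.
        self.gate = nn.Linear(2 * hidden_size, hidden_size)
        # Assumed initialization: bias the gate toward copying the
        # previous state early in training, easing gradient flow.
        nn.init.constant_(self.gate.bias, -2.0)

    def forward(self, s_prev: torch.Tensor, s_cand: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(torch.cat([s_prev, s_cand], dim=-1)))
        # g near 0: pass the previous state straight through time;
        # g near 1: take the candidate produced by the RHN layers.
        return g * s_cand + (1.0 - g) * s_prev

# Minimal usage: gate a batch of 32 hidden states of size 256.
hsg = HighwayStateGate(hidden_size=256)
s_prev = torch.zeros(32, 256)
s_cand = torch.randn(32, 256)
s_next = hsg(s_prev, s_cand)
```

Because the output is a convex combination, the identity path through s_prev is always available, which is the property the abstract credits for the direct back-propagation through time and the slightly faster convergence at large depths.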


Related research

07/22/2015 · Training Very Deep Networks
Theoretical and empirical evidence indicates that the depth of neural ne...

05/03/2015 · Highway Networks
There is plenty of theoretical and empirical evidence that depth of neur...

12/12/2017 · Deep Echo State Network (DeepESN): A Brief Survey
The study of deep recurrent neural networks (RNNs) and, in particular, o...

11/03/2022 · An Improved Time Feedforward Connections Recurrent Neural Networks
Recurrent Neural Networks (RNNs) have been widely applied to deal with t...

10/06/2017 · Lattice Recurrent Unit: Improving Convergence and Statistical Efficiency for Sequence Modeling
Recurrent neural networks have shown remarkable success in modeling sequ...

01/03/2017 · Shortcut Sequence Tagging
Deep stacked RNNs are usually hard to train. Adding shortcut connections...

12/09/2020 · exploRNN: Understanding Recurrent Neural Networks through Visual Exploration
Due to the success of deep learning and its growing job market, students...
