Gradual Learning of Deep Recurrent Neural Networks

08/29/2017
by Ziv Aharoni, et al.

Deep Recurrent Neural Networks (RNNs) achieve state-of-the-art results in many sequence-to-sequence tasks. However, deep RNNs are difficult to train and prone to overfitting. We introduce a training method that grows the network gradually and treats each layer individually, yielding improved results on language modelling tasks. Training a deep LSTM with Gradual Learning (GL) obtains a perplexity of 61.7 on the word-level Penn Treebank (PTB) corpus. To the best of our knowledge (as of 20.05.2017), GL improves on the best state-of-the-art performance achieved by a single LSTM/RHN model on the word-level PTB dataset.
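The abstract only outlines the idea of growing the network layer by layer and treating each layer individually; the sketch below is a hypothetical illustration of such a schedule in PyTorch, not the authors' implementation. The class names, the per-phase optimizer factory, and all hyperparameters are assumptions for illustration only.

```python
# Hypothetical sketch of gradual, layer-wise training of a deep LSTM language model.
# This is NOT the authors' code; the module names, growth schedule, and the idea of a
# per-phase optimizer are illustrative assumptions only.
import torch
import torch.nn as nn


class StackedLSTMLM(nn.Module):
    def __init__(self, vocab_size, emb_dim=200, hidden_dim=200, max_layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # One single-layer LSTM per depth level, so layers can be activated one at a time.
        self.layers = nn.ModuleList(
            [nn.LSTM(emb_dim if i == 0 else hidden_dim, hidden_dim, batch_first=True)
             for i in range(max_layers)]
        )
        self.decoder = nn.Linear(hidden_dim, vocab_size)
        self.active_layers = 1  # start shallow, grow gradually

    def grow(self):
        """Activate the next LSTM layer (the 'gradual' step)."""
        self.active_layers = min(self.active_layers + 1, len(self.layers))

    def forward(self, tokens):
        x = self.embed(tokens)                     # [batch, seq, emb_dim]
        for i in range(self.active_layers):        # only the active part of the stack
            x, _ = self.layers[i](x)
        return self.decoder(x)                     # [batch, seq, vocab_size]


def gradual_training(model, phase_epochs, make_optimizer, train_one_epoch):
    """Train phase by phase: run the current stack for some epochs, then add a layer."""
    for phase, epochs in enumerate(phase_epochs):
        # Treating each layer individually could mean giving the newly added layer its
        # own optimizer or learning rate (an assumption about the per-layer treatment).
        optimizer = make_optimizer(model, phase)
        for _ in range(epochs):
            train_one_epoch(model, optimizer)
        if phase < len(phase_epochs) - 1:
            model.grow()
```

Under this sketch, a three-layer model could be trained with `phase_epochs = [10, 10, 20]`, growing the stack after each of the first two phases.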
