Learning Longer Memory in Recurrent Neural Networks

12/24/2014
by Tomas Mikolov, et al.

Recurrent neural networks are powerful models that learn temporal patterns in sequential data. For a long time, it was believed that recurrent networks are difficult to train with simple optimizers, such as stochastic gradient descent, due to the so-called vanishing gradient problem. In this paper, we show that learning longer-term patterns in real data, such as natural language, is perfectly possible using gradient descent. This is achieved with a slight structural modification of the simple recurrent neural network architecture: we encourage some of the hidden units to change their state slowly by constraining part of the recurrent weight matrix to be close to identity, thus forming a kind of longer-term memory. We evaluate our model in language modeling experiments, where we obtain performance similar to the much more complex Long Short-Term Memory (LSTM) networks (Hochreiter & Schmidhuber, 1997).
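The slowly-changing hidden units described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the variable names (`B`, `A`, `R`, `P`, `alpha`) and dimensions are assumptions chosen for clarity. The key idea is that one group of units ("context" units) has its recurrent weights fixed to a scaled identity `alpha * I` with `alpha` close to 1, so its state decays slowly and integrates information over long spans, while the ordinary hidden units evolve freely.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper).
n_in, n_hid, n_ctx = 10, 20, 5
alpha = 0.95  # recurrent weight of the slow units; close to 1 => slow decay

# Randomly initialized parameters (names are hypothetical).
B = rng.normal(scale=0.1, size=(n_ctx, n_in))   # input -> context units
A = rng.normal(scale=0.1, size=(n_hid, n_in))   # input -> hidden units
R = rng.normal(scale=0.1, size=(n_hid, n_hid))  # hidden-to-hidden recurrence
P = rng.normal(scale=0.1, size=(n_hid, n_ctx))  # context -> hidden

def step(x, h_prev, s_prev):
    """One recurrent step with a slow context state s and a fast hidden state h."""
    # Context units: recurrent matrix constrained to alpha * I, so the state
    # is an exponentially decaying average of the projected inputs.
    s = (1.0 - alpha) * (B @ x) + alpha * s_prev
    # Ordinary hidden units with a sigmoid nonlinearity, fed by the input,
    # their own previous state, and the slow context state.
    h = 1.0 / (1.0 + np.exp(-(A @ x + R @ h_prev + P @ s)))
    return h, s

# Run the recurrence over a random input sequence.
h, s = np.zeros(n_hid), np.zeros(n_ctx)
for t in range(50):
    x = rng.normal(size=n_in)
    h, s = step(x, h, s)
```

Because the slow units' recurrence is fixed rather than learned, no gradient has to flow through a long chain of learned weights for long-range information to survive, which is what sidesteps the vanishing-gradient difficulty for those units.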


