Recurrent Neural Networks for Learning Long-term Temporal Dependencies with Reanalysis of Time Scale Representation

11/05/2021
by Kentaro Ohno, et al.

Recurrent neural networks with a gating mechanism, such as an LSTM or GRU, are powerful tools for modeling sequential data. Within this mechanism, the forget gate, originally introduced to control information flow in the hidden state of the RNN, has recently been re-interpreted as representing the time scale of the state, i.e., a measure of how long the RNN retains information on past inputs. On the basis of this interpretation, several parameter initialization methods that exploit prior knowledge of temporal dependencies in the data have been proposed to improve learnability. However, the interpretation relies on unrealistic assumptions, such as that no inputs arrive after a certain time point. In this work, we reconsider the interpretation of the forget gate in a more realistic setting. We first generalize the existing theory on gated RNNs to cover the case where inputs are given successively. We then argue that the interpretation of the forget gate as a temporal representation is valid when the gradient of the loss with respect to the state decreases exponentially as time goes back. We empirically demonstrate that existing RNNs satisfy this gradient condition in the initial training phase on several tasks, in good agreement with previous initialization methods. On the basis of this finding, we propose an approach to constructing new RNNs that can represent longer time scales than conventional models, thereby improving learnability on long-term sequential data. We verify the effectiveness of our method through experiments on real-world datasets.
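To make the time-scale reading of the forget gate concrete, the following minimal sketch (ours, not taken from the paper) treats the gated state as a leaky integrator: with a constant forget gate value f, stored information decays geometrically, giving a characteristic retention time of roughly 1/(1-f). The chrono-style bias initialization at the end follows Tallec and Ollivier (2018), one of the initialization methods the abstract alludes to; the variable names are our own.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Leaky-integrator view of a gated state (illustrative, not the paper's
# model): h_t = f * h_{t-1} + i_t with a constant forget gate f in (0, 1).
# With no further inputs, an input stored at time 0 survives as f**t,
# so the characteristic retention time is roughly T = 1 / (1 - f).
f = 0.99
T = 1.0 / (1.0 - f)              # ~100 steps
decay = f ** np.arange(200)      # geometric decay of stored information
print(f"time scale ~ {T:.0f}, value after T steps: {decay[int(T)]:.3f}")

# The same factor governs the backward pass: each backpropagation step
# through h_t multiplies the gradient by f, so d(loss)/d(h_{t-k}) shrinks
# like f**k as time goes back -- the exponential gradient decay condition
# discussed in the abstract.

# Chrono-style bias initialization (Tallec & Ollivier, 2018): to target a
# time scale T, set the forget-gate bias b_f so that sigmoid(b_f) = 1 - 1/T.
target_T = 100.0
b_f = np.log(target_T - 1.0)
assert np.isclose(sigmoid(b_f), 1.0 - 1.0 / target_T)
```

Under this reading, a gate saturated near 1 encodes a long time scale, which is why initializing the forget-gate bias to a large positive value biases the model toward long-term dependencies.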


Related research

- Recent Advances in Recurrent Neural Networks (12/29/2017)
  Recurrent neural networks (RNNs) are capable of learning features and lo...
- ReLU and Addition-based Gated RNN (08/10/2023)
  We replace the multiplication and sigmoid function of the conventional r...
- Reducing state updates via Gaussian-gated LSTMs (01/22/2019)
  Recurrent neural networks can be difficult to train on long sequence dat...
- Learning Longer-term Dependencies via Grouped Distributor Unit (04/29/2019)
  Learning long-term dependencies still remains difficult for recurrent ne...
- Improving the Gating Mechanism of Recurrent Neural Networks (10/22/2019)
  Gating mechanisms are widely used in neural network models, where they a...
- Recurrent Orthogonal Networks and Long-Memory Tasks (02/22/2016)
  Although RNNs have been shown to be powerful tools for processing sequen...
- Fast Saturating Gate for Learning Long Time Scales with Recurrent Neural Networks (10/04/2022)
  Gate functions in recurrent models, such as an LSTM and GRU, play a cent...
