Fast Saturating Gate for Learning Long Time Scales with Recurrent Neural Networks

10/04/2022
by Kentaro Ohno, et al.

Gate functions in recurrent models such as LSTMs and GRUs use bounded activation functions and play a central role in learning the various time scales present in time series data. However, it is difficult to train gates to capture extremely long time scales because the gradient of the bounded function vanishes for large inputs, a phenomenon known as the saturation problem. We closely analyze the relation between saturation of the gate function and training efficiency. We prove that the gradient vanishing of the gate function can be mitigated by accelerating the convergence of the saturating function, i.e., by making its output converge to 0 or 1 faster. Based on this analysis, we propose a gate function, called the fast gate, that achieves a doubly exponential convergence rate with respect to its input through simple function composition. We empirically show that our method outperforms previous methods in accuracy and computational efficiency on benchmark tasks involving extremely long time scales.
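The core idea of the abstract can be illustrated with a short sketch: composing the sigmoid with an exponentially growing function makes the gate saturate doubly exponentially fast. The choice of `sinh` as the inner function below is an assumption for illustration, not necessarily the paper's exact definition of the fast gate.

```python
import numpy as np

def sigmoid(x):
    # Standard bounded gate activation; its output approaches 0 or 1
    # only exponentially fast, so gradients vanish for large |x|.
    return 1.0 / (1.0 + np.exp(-x))

def fast_gate(x):
    # Illustrative "fast gate" by function composition: sinh grows
    # exponentially, so 1 - sigmoid(sinh(x)) ~ exp(-exp(x)/2) for large x,
    # i.e. the gate saturates doubly exponentially. This is a sketch of
    # the idea; the paper's exact composition may differ.
    return sigmoid(np.sinh(x))

if __name__ == "__main__":
    x = 3.0
    # The fast gate is far closer to 1 at the same input.
    print("sigmoid gap:  ", 1.0 - sigmoid(x))
    print("fast gate gap:", 1.0 - fast_gate(x))
```

Because `sinh` is odd and `sinh(0) = 0`, the composed gate still outputs 0.5 at zero input and remains monotonic, so it is a drop-in replacement for the sigmoid in a gated recurrent cell.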

Related research

05/25/2019 - Bivariate Beta LSTM
Long Short-Term Memory (LSTM) infers the long term dependency through a ...

07/11/2018 - Recurrent Neural Networks with Flexible Gates using Kernel Activation Functions
Gated recurrent neural networks have achieved remarkable results in the ...

10/06/2017 - Lattice Recurrent Unit: Improving Convergence and Statistical Efficiency for Sequence Modeling
Recurrent neural networks have shown remarkable success in modeling sequ...

01/22/2019 - Reducing state updates via Gaussian-gated LSTMs
Recurrent neural networks can be difficult to train on long sequence dat...

11/05/2021 - Recurrent Neural Networks for Learning Long-term Temporal Dependencies with Reanalysis of Time Scale Representation
Recurrent neural networks with a gating mechanism such as an LSTM or GRU...

10/06/2018 - h-detach: Modifying the LSTM Gradient Towards Better Optimization
Recurrent neural networks are known for their notorious exploding and va...

06/18/2018 - Where to Go Next: A Spatio-temporal LSTM model for Next POI Recommendation
Next Point-of-Interest (POI) recommendation is of great value for both l...
