Improving the Gating Mechanism of Recurrent Neural Networks

10/22/2019
by Albert Gu, et al.

Gating mechanisms are widely used in neural network models, where they allow gradients to backpropagate more easily through depth or time. However, their saturation property introduces problems of its own. For example, in recurrent models these gates need to have outputs near 1 to propagate information over long time-delays, which requires them to operate in their saturation regime and hinders gradient-based learning of the gate mechanism. We address this problem by deriving two synergistic modifications to the standard gating mechanism that are easy to implement, introduce no additional hyperparameters, and improve learnability of the gates when they are close to saturation. We show how these changes are related to and improve on alternative recently proposed gating mechanisms such as chrono-initialization and Ordered Neurons. Empirically, our simple gating mechanisms robustly improve the performance of recurrent models on a range of applications, including synthetic memorization tasks, sequential image classification, language modeling, and reinforcement learning, particularly when long-term dependencies are involved.
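The saturation problem described above can be made concrete with a short numerical sketch (not the paper's proposed fix, just an illustration of the issue it targets): a sigmoid forget gate must output values near 1 to retain state over long delays, but the sigmoid's derivative, σ'(x) = σ(x)(1 − σ(x)), vanishes exactly in that regime, so gradient-based learning of the gate stalls.

```python
import numpy as np

def sigmoid(x):
    """Standard logistic gate activation."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Derivative of the sigmoid: s * (1 - s); largest at x = 0, tiny when saturated."""
    s = sigmoid(x)
    return s * (1.0 - s)

# As the pre-activation grows, the gate approaches 1 (good for long-range
# memory) but its gradient collapses (bad for learning the gate itself).
for pre_activation in [0.0, 2.0, 6.0, 10.0]:
    g = sigmoid(pre_activation)
    dg = sigmoid_grad(pre_activation)
    print(f"pre-act {pre_activation:5.1f}: gate {g:.6f}, gradient {dg:.2e}")
```

At a pre-activation of 0 the gradient is at its maximum of 0.25, while at 10 the gate is within 5e-5 of 1 but the gradient has shrunk below 1e-4, which is the learnability trade-off the paper's modifications aim to relieve.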


Related research:

03/23/2018 · Can recurrent neural networks warp time?
Successful recurrent models such as long short-term memories (LSTMs) and...

05/12/2021 · Slower is Better: Revisiting the Forgetting Mechanism in LSTM for Slower Information Decay
Sequential information contains short- to long-range dependencies; howev...

02/26/2020 · Refined Gate: A Simple and Effective Gating Mechanism for Recurrent Units
Recurrent neural network (RNN) has been widely studied in sequence learn...

11/05/2021 · Recurrent Neural Networks for Learning Long-term Temporal Dependencies with Reanalysis of Time Scale Representation
Recurrent neural networks with a gating mechanism such as an LSTM or GRU...

07/11/2018 · Recurrent Neural Networks with Flexible Gates using Kernel Activation Functions
Gated recurrent neural networks have achieved remarkable results in the ...

06/04/2020 · A Novel Update Mechanism for Q-Networks Based On Extreme Learning Machines
Reinforcement learning is a popular machine learning paradigm which can ...

08/10/2023 · ReLU and Addition-based Gated RNN
We replace the multiplication and sigmoid function of the conventional r...
