Theory of gating in recurrent neural networks

07/29/2020
by Kamesh Krishnamurthy et al.

Recurrent neural networks (RNNs) are popular dynamical models used for processing sequential data. Prior theoretical work on the properties of RNNs has focused on models with additive interactions, where the input to a unit is a weighted sum of the outputs of the remaining units in the network. However, there is ample evidence that neurons can have gating, i.e. multiplicative, interactions, and such gating interactions have significant effects on the collective dynamics of the network. Furthermore, the best-performing RNNs in machine learning have gating interactions, suggesting that gating is beneficial for information processing and learning tasks. We develop a dynamical mean-field theory (DMFT) to understand the dynamical regimes produced by gating. Our gated RNN reduces to the classical RNN in certain limits and is closely related to popular gated models in machine learning. We use random matrix theory (RMT) to analytically characterize the spectrum of the Jacobian and show how gating produces slow modes and marginal stability; gating is thus a potential mechanism for implementing computations involving line-attractor dynamics. The long-time behavior of the gated network is studied using its Lyapunov spectrum, and the DMFT is used to provide an analytical prediction for the maximum Lyapunov exponent. We also show that gating gives rise to a novel, discontinuous transition to chaos, in which the proliferation of critical points is decoupled from the appearance of chaotic dynamics; the nature of this chaotic state is characterized in detail. Using the DMFT and RMT, we produce phase diagrams for the gated RNN. Finally, we study the gradients in the gated network by leveraging the adjoint sensitivity framework to develop a DMFT for the gradients. The theory developed here sheds light on the rich dynamical behavior produced by gating interactions and has implications for architectural choices and learning dynamics.
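To make the model class concrete, below is a minimal numerical sketch of a randomly coupled gated RNN in the GRU-like spirit the abstract describes: an update gate that rescales each unit's effective timescale and an output gate that multiplicatively modulates the recurrent drive. The specific gating form, sigmoid nonlinearities, gain value, and variable names are illustrative assumptions, not the paper's exact equations.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 512      # network size
g = 2.0      # coupling gain; g > 1 drives the classical additive RNN chaotic
dt = 0.05    # Euler step

# Gaussian random couplings with variance g^2 / N, as in classical RNN theory
J_h = rng.normal(0.0, g / np.sqrt(N), (N, N))   # recurrent weights
J_z = rng.normal(0.0, g / np.sqrt(N), (N, N))   # update-gate weights
J_r = rng.normal(0.0, g / np.sqrt(N), (N, N))   # output-gate weights

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def step(h):
    """One Euler step of the gated dynamics. The update gate z rescales each
    unit's effective timescale (slow modes as z -> 0); the output gate r
    modulates the recurrent input. Fixing z = r = 1 recovers the classical
    additive RNN, mirroring the limits mentioned in the abstract."""
    phi = np.tanh(h)
    z = sigmoid(J_z @ phi)   # update gate in (0, 1)
    r = sigmoid(J_r @ phi)   # output gate in (0, 1)
    return h + dt * z * (-h + J_h @ (r * phi))

h = rng.normal(0.0, 1.0, N)
for _ in range(2000):        # relax onto the attractor
    h = step(h)
```

Since the abstract highlights the maximum Lyapunov exponent as the diagnostic of long-time behavior, a standard numerical counterpart to the DMFT prediction is the Benettin two-trajectory estimate, sketched here under the same illustrative assumptions:

```python
def max_lyapunov(h0, n_steps=5000, eps=1e-7):
    """Benettin-style estimate of the maximum Lyapunov exponent: evolve a
    reference and a perturbed trajectory, accumulate the log growth of
    their separation, and rescale the separation back to eps each step."""
    v = rng.standard_normal(h0.size)
    h_a = h0.copy()
    h_b = h0 + eps * v / np.linalg.norm(v)
    log_growth = 0.0
    for _ in range(n_steps):
        h_a, h_b = step(h_a), step(h_b)
        d = np.linalg.norm(h_b - h_a)
        log_growth += np.log(d / eps)
        h_b = h_a + (eps / d) * (h_b - h_a)   # renormalize the separation
    return log_growth / (n_steps * dt)        # exponent per unit time

print(max_lyapunov(h))   # > 0 indicates chaos; near 0 suggests marginal stability
```

A positive estimate signals the chaotic regime; sweeping the gain and gate parameters in such a sketch is one way to trace out, numerically, the kind of phase diagram the DMFT and RMT deliver analytically.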

Related research

06/14/2018 · Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks
Recurrent neural networks have gained widespread use in modeling sequenc...

01/31/2020 · Gating creates slow modes and controls phase-space complexity in GRUs and LSTMs
Recurrent neural networks (RNNs) are powerful dynamical models for data ...

10/14/2021 · How to train RNNs on chaotic data?
Recurrent neural networks (RNNs) are wide-spread machine learning tools ...

06/25/2020 · On Lyapunov Exponents for RNNs: Understanding Information Propagation Using Dynamical Systems Tools
Recurrent neural networks (RNNs) have been successfully applied to a var...

08/23/2023 · Characterising representation dynamics in recurrent neural networks for object recognition
Recurrent neural networks (RNNs) have yielded promising results for both...

10/25/2018 · Learning with Interpretable Structure from RNN
In structure learning, the output is generally a structure that is used ...

05/21/2017 · Recurrent Additive Networks
We introduce recurrent additive networks (RANs), a new gated RNN which i...
