Recurrent Neural Networks in the Eye of Differential Equations

04/29/2019
by Murphy Yuezhen Niu, et al.

To understand the fundamental trade-offs between training stability, temporal dynamics, and architectural complexity in recurrent neural networks (RNNs), we analyze RNN architectures directly through the lens of numerical methods for ordinary differential equations (ODEs). We define a general family of RNNs, the ODERNNs, by relating the composition rules of RNNs to integration methods of ODEs at discrete time steps. We show that the degree of an RNN's functional nonlinearity, n, and the range of its temporal memory, t, map respectively to the stage of the Runge-Kutta recursion and to the order of the time derivatives in the underlying ODE. We prove that popular RNN architectures, such as the LSTM and the URNN, fit into different orders of n-t-ODERNNs. This exact correspondence between RNNs and ODEs lets us establish sufficient conditions for RNN training stability and enables more flexible, top-down design of new RNN architectures using the large toolbox of numerical ODE integration methods. We provide one such example: the Quantum-inspired Universal computing Neural Network (QUNN), which reduces the number of required training parameters from polynomial in both the data length and the temporal memory length to linear in the temporal memory length alone.
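The stage-to-nonlinearity side of this correspondence is easiest to see in a toy discretization. The sketch below is not the paper's ODERNN construction; it simply treats a hidden state governed by dh/dt = f(h, x) and compares a one-stage forward-Euler step (a residual-style vanilla RNN update) with a two-stage Runge-Kutta (Heun) step, in which the extra stage nests a second application of f and so raises the degree of nonlinearity n per time step. The names (`f`, `euler_step`, `rk2_step`), the dimensions, and the tanh dynamics are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions and weights, for illustration only.
d_h, d_x = 8, 4
W_h = rng.normal(scale=0.3, size=(d_h, d_h))
W_x = rng.normal(scale=0.3, size=(d_h, d_x))
b = np.zeros(d_h)

def f(h, x):
    """Continuous-time dynamics dh/dt = f(h, x): a standard RNN vector field."""
    return np.tanh(W_h @ h + W_x @ x + b)

def euler_step(h, x, dt=1.0):
    """Forward Euler (1-stage): one application of f per step,
    i.e. a residual-style vanilla RNN update."""
    return h + dt * f(h, x)

def rk2_step(h, x, dt=1.0):
    """Heun's method (2-stage Runge-Kutta): two nested applications of f,
    raising the degree of functional nonlinearity per time step."""
    k1 = f(h, x)
    k2 = f(h + dt * k1, x)
    return h + dt * 0.5 * (k1 + k2)

# Unroll both discretizations over a short input sequence.
xs = rng.normal(size=(5, d_x))
h_euler = np.zeros(d_h)
h_rk2 = np.zeros(d_h)
for x in xs:
    h_euler = euler_step(h_euler, x)
    h_rk2 = rk2_step(h_rk2, x)

print("Euler hidden state:", np.round(h_euler, 3))
print("RK2   hidden state:", np.round(h_rk2, 3))
```

In this picture, adding Runge-Kutta stages deepens the per-step composition of f (larger n), while the memory side of the correspondence would come from higher-order ODEs, whose discretizations reach back to states earlier than the immediately preceding one (larger t).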

