
HiPPO: Recurrent Memory with Optimal Polynomial Projections

by Albert Gu, et al.

A central problem in learning from sequential data is representing cumulative history in an incremental fashion as more data is processed. We introduce a general framework (HiPPO) for the online compression of continuous signals and discrete time series by projection onto polynomial bases. Given a measure that specifies the importance of each time step in the past, HiPPO produces an optimal solution to a natural online function approximation problem. As special cases, our framework yields a short derivation of the recent Legendre Memory Unit (LMU) from first principles, and generalizes the ubiquitous gating mechanism of recurrent neural networks such as GRUs. This formal framework yields a new memory update mechanism (HiPPO-LegS) that scales through time to remember all history, avoiding priors on the timescale. HiPPO-LegS enjoys the theoretical benefits of timescale robustness, fast updates, and bounded gradients. By incorporating the memory dynamics into recurrent neural networks, HiPPO RNNs can empirically capture complex temporal dependencies. On the benchmark permuted MNIST dataset, HiPPO-LegS sets a new state-of-the-art accuracy of 98.3%. Finally, on a novel trajectory classification task testing robustness to out-of-distribution timescales and missing data, HiPPO-LegS outperforms RNN and neural ODE baselines by 25-40% accuracy.
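
The HiPPO-LegS memory update mentioned in the abstract can be illustrated with a small sketch. It is not taken from this page: the matrices `A` and `B` follow the LegS definition as recalled from the HiPPO paper (dynamics c'(t) = -(1/t) A c(t) + (1/t) B f(t)), and the backward-Euler discretization is an assumption chosen here for simplicity and stability. The function names `legs_matrices`, `legs_step`, and `compress` are hypothetical.

```python
import math

def legs_matrices(n):
    """Build (A, B) for the assumed LegS ODE  c'(t) = -(1/t) A c(t) + (1/t) B f(t).

    A is lower triangular: A[i][i] = i + 1 and
    A[i][j] = sqrt((2i+1)(2j+1)) for i > j. B[i] = sqrt(2i+1).
    """
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            A[i][j] = float(i + 1) if i == j else math.sqrt((2 * i + 1) * (2 * j + 1))
    B = [math.sqrt(2 * i + 1) for i in range(n)]
    return A, B

def legs_step(c, f_next, k, A, B):
    """One backward-Euler step (an assumed discretization):

    solve (I + A/(k+1)) c_new = c + B * f_next / (k+1).
    The system is lower triangular, so forward substitution suffices.
    """
    n = len(c)
    s = 1.0 / (k + 1)
    rhs = [c[i] + s * B[i] * f_next for i in range(n)]
    c_new = [0.0] * n
    for i in range(n):
        acc = rhs[i] - s * sum(A[i][j] * c_new[j] for j in range(i))
        c_new[i] = acc / (1.0 + s * A[i][i])
    return c_new

def compress(signal, n=4):
    """Fold a whole sequence into n coefficients, one sample at a time."""
    A, B = legs_matrices(n)
    c = [0.0] * n
    for k, f in enumerate(signal):
        c = legs_step(c, f, k, A, B)
    return c
```

Under these assumptions the step size shrinks as 1/(k+1), so no timescale hyperparameter appears in the update. Calling `compress([1.0] * 2000)` on a constant signal should drive the zeroth coefficient toward 1 and the rest toward 0, since a constant is captured entirely by the degree-0 basis function.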




