
Can Gradient Descent Provably Learn Linear Dynamic Systems?

by Lifu Wang, et al.

We study the learning ability of linear recurrent neural networks trained with gradient descent. We prove the first theoretical guarantee that linear RNNs trained with gradient descent can learn any stable linear dynamic system. We show that, despite the non-convexity of the optimization loss, if the width of the RNN is large enough (and the required width of the hidden layers does not depend on the length of the input sequence), a linear RNN can provably learn any stable linear dynamic system with sample and time complexity polynomial in 1/(1-ρ_C), where ρ_C is roughly the spectral radius of the stable system. Our results provide the first theoretical guarantee for learning a linear RNN and demonstrate how the recurrent structure can help to learn a dynamic system.
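As a minimal illustration of the setup described above, the sketch below trains a linear RNN by gradient descent (backpropagation through time) to imitate a stable linear dynamic system. All matrices, dimensions, and hyperparameters here are hypothetical toy choices for illustration, not the paper's actual construction; the only property carried over is that the teacher system is stable (spectral radius below 1) and the student RNN is wider than the teacher.

```python
import numpy as np

rng = np.random.default_rng(0)

# Teacher: a stable linear dynamic system (hypothetical toy instance)
#   s_{t+1} = A s_t + B x_t,   y_t = D s_{t+1},  with spectral radius rho(A) < 1.
n, d_in = 2, 1
A = np.array([[0.5, 0.1],
              [0.0, 0.4]])          # rho(A) = 0.5 < 1, so the system is stable
B = rng.standard_normal((n, d_in))
D = rng.standard_normal((1, n))

def run_system(X):
    """Roll the teacher system forward over an input sequence X."""
    s, ys = np.zeros(n), []
    for x in X:
        s = A @ s + B @ x
        ys.append(D @ s)
    return np.array(ys)

# Student: a linear RNN  h_t = W h_{t-1} + U x_t,  yhat_t = C h_t,
# with hidden width m > n (overparameterized, echoing the large-width assumption).
m = 8
W = 0.05 * rng.standard_normal((m, m))
U = 0.10 * rng.standard_normal((m, d_in))
C = 0.10 * rng.standard_normal((1, m))

def loss_and_grads(X, Y):
    """Squared loss and its gradients via backpropagation through time."""
    T = len(X)
    H = np.zeros((T + 1, m))
    for t in range(T):
        H[t + 1] = W @ H[t] + U @ X[t]
    err = H[1:] @ C.T - Y                      # prediction error at each step
    loss = 0.5 * np.mean(err ** 2)
    gW, gU, gC = np.zeros_like(W), np.zeros_like(U), np.zeros_like(C)
    gh = np.zeros(m)                           # gradient w.r.t. hidden state
    for t in range(T - 1, -1, -1):
        gh = gh + (err[t] @ C).ravel() / T     # direct contribution of y_t
        gC += np.outer(err[t], H[t + 1]) / T
        gW += np.outer(gh, H[t])
        gU += np.outer(gh, X[t])
        gh = W.T @ gh                          # propagate one step back in time
    return loss, gW, gU, gC

# Generate a trajectory from the teacher and fit the student by gradient descent.
T = 50
X = rng.standard_normal((T, d_in))
Y = run_system(X)

lr, losses = 0.02, []
for step in range(2000):
    loss, gW, gU, gC = loss_and_grads(X, Y)
    W -= lr * gW
    U -= lr * gU
    C -= lr * gC
    losses.append(loss)

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Stability of the teacher matters here: because rho(A) < 1, the hidden state and the backpropagated gradients stay bounded over the sequence, which is the regime in which the guarantee above applies.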

