Can SGD Learn Recurrent Neural Networks with Provable Generalization?

02/04/2019
by   Zeyuan Allen-Zhu, et al.

Recurrent Neural Networks (RNNs) are among the most popular models in sequential data analysis. However, due to the complexity arising from their recurrent structure, they remain among the least theoretically understood neural-network models. In particular, existing generalization bounds for RNNs mostly scale exponentially with the length of the input sequence, which limits their practical implications. In this paper, we show that if the input labels are (approximately) realizable by certain classes of (non-linear) functions of the input sequences, then vanilla stochastic gradient descent (SGD) can actually learn them efficiently with RNNs, meaning that both the training time and the sample complexity scale only polynomially with the input length (or almost polynomially, depending on the class of non-linearities).
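To make the setting concrete, here is a minimal sketch (not the paper's construction) of the learning setup the abstract describes: an Elman-style RNN with ReLU activations trained by plain online SGD, where the labels are generated by a fixed non-linear function of the input sequence (so they are realizable). All dimensions, the target function, and the step size below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, L = 4, 32, 6  # input dim, hidden width, sequence length (all assumed)

# Realizable labels: a fixed non-linear function of the whole input sequence.
w_star = rng.standard_normal(d) / np.sqrt(d)
def label(X):                       # X has shape (L, d)
    return np.tanh(X.sum(axis=0) @ w_star)

# Elman RNN: h_t = relu(W h_{t-1} + A x_t), prediction y_hat = b . h_L
W = rng.standard_normal((m, m)) / (2 * np.sqrt(m))
A = rng.standard_normal((m, d)) / np.sqrt(m)
b = rng.standard_normal(m) / np.sqrt(m)

def forward(X):
    hs = [np.zeros(m)]
    for t in range(L):
        hs.append(np.maximum(W @ hs[-1] + A @ X[t], 0.0))
    return hs, b @ hs[-1]

def sgd_step(X, y, lr=5e-3):
    """One vanilla-SGD step on the squared loss, via backprop through time."""
    global W, A, b
    hs, pred = forward(X)
    g = pred - y                    # d(0.5 * (pred - y)^2) / d pred
    gb = g * hs[-1]
    gW, gA = np.zeros_like(W), np.zeros_like(A)
    gh = g * b                      # gradient w.r.t. the last hidden state
    for t in range(L - 1, -1, -1):
        gz = gh * (hs[t + 1] > 0)   # through the ReLU
        gW += np.outer(gz, hs[t])
        gA += np.outer(gz, X[t])
        gh = W.T @ gz
    W -= lr * gW
    A -= lr * gA
    b -= lr * gb
    return 0.5 * g * g

# Online SGD on fresh samples; the loss shrinks on this toy realizable task.
losses = []
for step in range(3000):
    X = rng.standard_normal((L, d))
    losses.append(sgd_step(X, label(X)))
```

The point of the sketch is only to fix notation: "vanilla SGD" here means single-sample gradient steps with a constant step size and no regularization, matching the training procedure the paper analyzes; the paper's guarantee is that on such realizable data, the number of steps and samples needed grows only polynomially in L.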


Related research

09/29/2021  On the Provable Generalization of Recurrent Neural Networks
Recurrent Neural Network (RNN) is a fundamental structure in deep learni...

01/29/2019  Sample Complexity Bounds for Recurrent Neural Networks with Application to Combinatorial Graph Problems
Learning to predict solutions to real-valued combinatorial graph problem...

05/23/2016  Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations
We investigate the parameter-space geometry of recurrent neural networks...

06/22/2020  Understanding Recurrent Neural Networks Using Nonequilibrium Response Theory
Recurrent neural networks (RNNs) are brain-inspired models widely used i...

05/17/2019  Adaptively Truncating Backpropagation Through Time to Control Gradient Bias
Truncated backpropagation through time (TBPTT) is a popular method for l...

09/09/2011  Learning Sequence Neighbourhood Metrics
Recurrent neural networks (RNNs) in combination with a pooling operator ...

10/28/2019  On Generalization Bounds of a Family of Recurrent Neural Networks
Recurrent Neural Networks (RNNs) have been widely applied to sequential ...
