Uniform Generalization Bounds for Overparameterized Neural Networks

by Sattar Vakili, et al.

Artificial neural networks often generalize well despite being heavily overparameterized. It is well known that classical statistical learning methods often yield vacuous generalization bounds for overparameterized neural networks. Adopting the recently developed Neural Tangent (NT) kernel theory, we prove uniform generalization bounds for overparameterized neural networks in kernel regimes, when the true data generating model belongs to the reproducing kernel Hilbert space (RKHS) corresponding to the NT kernel. Importantly, our bounds capture the exact error rates depending on the differentiability of the activation functions. To establish these bounds, we propose the information gain of the NT kernel as a measure of complexity of the learning problem. Our analysis uses a Mercer decomposition of the NT kernel in the basis of spherical harmonics and the decay rate of the corresponding eigenvalues. As a byproduct of our results, we show the equivalence between the RKHS corresponding to the NT kernel and its counterpart corresponding to the Matérn family of kernels, which induces a very general class of models. We further discuss the implications of our analysis for some recent results on the regret bounds for reinforcement learning algorithms that use overparameterized neural networks.
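To make the complexity measure mentioned in the abstract concrete, the following is a minimal sketch of how an information-gain quantity can be evaluated for a fixed set of inputs. It is not the paper's method. It assumes the definition commonly used in the kernel bandit literature, 0.5 * log det(I + K / noise_var), and it assumes the arc-cosine closed form often quoted for the NT kernel of a two-layer ReLU network on the unit sphere (normalization conventions vary between papers). The function names `relu_ntk` and `information_gain`, the noise variance, and the random inputs are all hypothetical choices for illustration. The paper's quantity is a maximum over input sets, which this sketch does not compute.

```python
import numpy as np

def relu_ntk(U):
    """Assumed closed form of a two-layer ReLU NT kernel on the unit sphere.

    U holds inner products u = <x, x'> of unit-norm inputs. The kernel is
    u * k0(u) + k1(u), with k0 and k1 the arc-cosine kernels of order 0 and 1.
    """
    U = np.clip(U, -1.0, 1.0)  # guard arccos against round-off
    theta = np.arccos(U)
    k0 = (np.pi - theta) / np.pi
    k1 = (U * (np.pi - theta) + np.sqrt(1.0 - U**2)) / np.pi
    return U * k0 + k1

def information_gain(K, noise_var):
    """0.5 * log det(I + K / noise_var) for a kernel matrix K (assumed definition)."""
    n = K.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(n) + K / noise_var)
    return 0.5 * logdet

# Hypothetical inputs: 50 random points on the unit sphere in R^3.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)

K = relu_ntk(X @ X.T)

# Information gain of the first n points, for growing n.
gains = [information_gain(K[:n, :n], noise_var=0.1) for n in (10, 25, 50)]
print(gains)
```

For a fixed sequence of inputs this quantity cannot decrease as points are added. How quickly its maximum grows with the number of observations is the kind of rate that depends on the eigenvalue decay of the kernel.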
