Neural Tangent Kernel Eigenvalues Accurately Predict Generalization
Finding a quantitative theory of neural network generalization has long been a central goal of deep learning research. We extend recent results to demonstrate that, by examining the eigensystem of a neural network's "neural tangent kernel", one can predict its generalization performance when learning arbitrary functions. Our theory accurately predicts not only test mean-squared-error but all first- and second-order statistics of the network's learned function. Furthermore, using a measure quantifying the "learnability" of a given target function, we prove a new "no-free-lunch" theorem characterizing a fundamental tradeoff in the inductive bias of wide neural networks: improving a network's generalization for a given target function must worsen its generalization for orthogonal functions. We further demonstrate the utility of our theory by analytically predicting two surprising phenomena - worse-than-chance generalization on hard-to-learn functions and nonmonotonic error curves in the small data regime - which we subsequently observe in experiments. Though our theory is derived for infinite-width architectures, we find it agrees with networks as narrow as width 20, suggesting it is predictive of generalization in practical neural networks. Code replicating our results is available at https://github.com/james-simon/eigenlearning.
Deep neural networks have proven extraordinarily useful for a wide array of learning problems, but theoretical understanding of their high performance is notably lacking. One longstanding mystery is the fact that the functions learned by neural networks typically generalize quite well to new data. In light of neural networks’ extreme overparameterization and high expressivity (Zhang et al., 2017), conventional statistical intuition incorrectly predicts poor performance (Belkin et al., 2019; James et al., 2013), leading to a lack of useful generalization bounds and a need for new concepts to augment classical wisdom.
As is common when facing great mysteries, the first challenge lies in framing a precise question. To clarify our aims, we pose the following problem:
Can one efficiently predict, from first principles, how well a given network architecture will generalize when learning a given function, provided a given number of training examples?
Here we show that one can. To do so, we build on two recent developments in deep learning theory. The first is the theory of infinite-width networks, which has shown that, as hidden-layer widths tend to infinity, commonly-used neural networks take remarkably simple analytical forms (Daniely et al., 2016; Lee et al., 2018; Novak et al., 2019a; Arora et al., 2019). In particular, a wide network trained by gradient descent with mean-squared-error (MSE) loss is equivalent to a classical model called kernel regression, where the kernel is the network’s so-called “neural tangent kernel” (NTK) (Jacot et al., 2018; Lee et al., 2019). This line of work suggests that, by studying these simple infinite-width expressions, we might gain insight into the behavior of real, finite networks.
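As a concrete, purely illustrative sketch of this equivalence's endpoint, the kernel-regression predictor f̂(x) = K(x, D) K(D, D)^{-1} f(D) can be written in a few lines of NumPy. Here an RBF kernel stands in for an NTK, and the function names are our own, not those of the paper's codebase:

```python
import numpy as np

def kernel_regression(kernel, X_train, y_train, X_test):
    """Kernel regression: f_hat(x) = K(x, D) K(D, D)^{-1} y."""
    K_dd = np.array([[kernel(a, b) for b in X_train] for a in X_train])
    K_td = np.array([[kernel(a, b) for b in X_train] for a in X_test])
    return K_td @ np.linalg.solve(K_dd, y_train)

# Stand-in kernel (an RBF, not an NTK) on scalar inputs
rbf = lambda a, b: np.exp(-0.5 * (a - b) ** 2)

X_train = np.array([0.0, 1.0, 2.0])
y_train = np.sin(X_train)

# On the training points themselves, kernel regression interpolates exactly
preds = kernel_regression(rbf, X_train, y_train, X_train)
assert np.allclose(preds, y_train)
```

Generalization is then a question of how this interpolant behaves off the training set, which is what the eigenmode analysis quantifies.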
The second development takes this approach, studying the generalization performance of kernel regression and showing agreement with finite networks. In a pioneering study, Bordelon et al. (2020)
derived an approximation for the generalization MSE of kernel regression and showed that, using the network NTK as the kernel, their expressions accurately predict the MSE of neural networks learning arbitrary functions. Their results reveal a simple picture of neural network generalization: as samples are added to the training set, the network generalizes well on a larger and larger subspace of input functions. The natural basis for this subspace of learnable functions is the eigenbasis of the NTK, and its eigenfunctions are learned in descending order of their eigenvalues. We note a substantial body of related work studying the inductive bias of neural networks and kernel regression (Valle-Perez et al., 2018; Yang and Salman, 2019; Bietti and Mairal, 2019; Spigler et al., 2020; Ortiz-Jiménez et al., 2021; Cohen et al., 2021; Canatar et al., 2021).
In this paper, we extend these results significantly. We begin in Section 2.2 by formulating a figure of merit we call the “learnability” of a target function, a measure which we prove obeys several desirable properties that MSE does not. In Section 2.4, we use learnability to prove a strong new “no-free-lunch” theorem describing a fundamental tradeoff in the inductive bias of a kernel towards all functions in an orthogonal basis. This shows that not only are higher NTK eigenmodes easier to learn, but these eigenmodes are in a zero-sum competition with one another for the ability to be learned at a given training set size. We further prove that, for any kernel or wide network, this tradeoff necessarily leaves some functions with worse-than-chance generalization.
In Sections 2.5 and 2.6, we derive expressions for not just expected test MSE (a particular second-order statistic of the learned function) but for all first- and second-order statistics of the learned function. Our expression for learnability (a first-order statistic of the learned function) is extremely simple, involving just one dataset-size-dependent parameter that acts as an eigenvalue threshold above which eigenmodes are well-learned. From the form of this expression, we conclude that for any supervised neural network learning problem, eigenmodes' eigenvalues and learnabilities lie on the same universal curve. Furthermore, in Section 2.7 we expand MSE for small dataset size and make the counterintuitive prediction that, for many functions, MSE increases with training set size in this small-data regime.
In Section 3, we experimentally verify all our conclusions on three input spaces for both exact NTK regression and trained finite networks. We find that our analytical expressions give excellent agreement with the true generalization behavior of real deep ReLU networks. We verify our no-free-lunch result with finite networks, and we experimentally observe our theoretical predictions of both worse-than-chance generalization and nonmonotonic MSE. Finally, we compare our theory to finite nets with widths as low as 20 and find that our theory holds with remarkable accuracy even at these narrow widths, suggesting that it is not merely applicable in the canonical NTK regime but in fact correctly predicts generalization performance in practical neural networks.
Consider the task of learning a function f given a set of n unique training points D = {x_1, ..., x_n} and their corresponding function values f(x_1), ..., f(x_n). To simplify our analysis, we will let the domain X be discrete with size M and assume the training points are uniformly sampled from X. We later note how, taking M → ∞, our results extend to settings where the data are sampled nonuniformly from a continuous domain.
We will use f̂ to denote the function learned by a neural network trained on this dataset. Remarkably, for an infinite-width neural network optimized via gradient descent to zero training MSE loss, this learned function is given by

    f̂(x) = K(x, D) K(D, D)^{-1} f(D),    (1)

where K is the network's kernel, K(D, D) is the n × n matrix of kernel evaluations between training points, f(D) is the n-vector of training targets, and K(x, D) is a row vector with n components. (Naively, Equation 1 gives only the expected learned function, and the true learned function will include a fluctuation term reflecting the random initialization. However, by storing a copy of the network's output at initialization and subtracting it from the output throughout optimization and at test time, this term becomes zero, and so we neglect it in our theory and use this trick in our experiments.) We give a brief introduction to the NTK in Appendix C. Due to its similarity to the normal equation of linear regression, Equation 1 is often called "kernel regression." (Interestingly, exact Bayesian inference for infinite-width neural networks yields predictions of the same form as Equation 1, with K being the "neural network Gaussian process" (NNGP) kernel instead of the NTK (Lee et al., 2018).) We will proceed treating K as a network's NTK, but our theory and exact results (including our "no-free-lunch" theorem) apply equally well to any incarnation of kernel regression.
Equation 1 holds exactly in the infinite-width limit of fully-connected networks (Lee et al., 2019), convolutional networks (Arora et al., 2019), transformers (Hron et al., 2020), and more (Yang, 2019). Moreover, several empirical studies have shown it to be a good approximation for networks of even modest width (Lee et al., 2019, 2020). Our approach will be to study the generalization behavior of Equation 1, conjecture that our results also apply to finite networks, and finally provide strong support for our conjecture with experiments.
Examining Equation 1, one finds that the elements of a vector-valued output are predicted independently: the learned f̂ is equivalent to simply vectorizing the results of kernel regression on each output element. For simplicity, we hereafter assume the target function is scalar; to apply our results to a multivariate target function, one just considers each element of the target function independently.
We will study three measures of the quality of the learned function f̂. All three will be defined in terms of the inner product over X: for two functions g and h, their inner product is

    ⟨g, h⟩ ≡ (1/M) Σ_{x ∈ X} g(x) h(x).

The first measure of quality is mean-squared error (MSE). For a particular dataset D, MSE is given by MSE^(D)(f) = ⟨f − f̂, f − f̂⟩. Of more interest will be the expected MSE over all datasets of size n, given by MSE(f) = E_D[MSE^(D)(f)]. We note that the inner product is taken over all x ∈ X, including x ∈ D, even though f̂(x) = f(x) for x ∈ D for kernel regression.
In maximizing the similarity of f̂ to f, we typically wish to minimize its similarity to all functions orthogonal to f. The second measure examines the coefficient in f̂ of one such orthogonal function. Letting g be a function such that ⟨f, g⟩ = 0, we consider the mean and variance of the quantity ⟨g, f̂⟩. We will derive accurate predictions for this metric of generalization.
Lastly, we introduce a figure of merit quantifying the alignment of f and f̂, which we call "learnability." It is given by

    L^(D)(f) ≡ ⟨f, f̂⟩ / ⟨f, f⟩,    L(f) ≡ E_D[L^(D)(f)],

where L^(D)(f) is the dataset-dependent learnability of the function ("D-learnability") and L(f) is its expectation over random datasets ("learnability"). Though at first glance these two seem like odd figures of merit, we will soon show that they have many desirable properties when f̂ is given by Equation 1: unlike MSE, both are bounded in [0, 1], always change monotonically as new data points are added, are invariant to rescalings of f, and obey a simple conservation law. Furthermore, expanding the inner product in the definition of MSE and noting that ⟨f̂, f̂⟩ ≥ 0, one can see that low MSE is impossible without high learnability. We will ultimately derive an accurate approximation for learnability that is substantially simpler than any known approximation for MSE.
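For intuition, learnability can be estimated directly by Monte Carlo: draw random training sets, run kernel regression, and average the normalized inner product. Here is a hypothetical NumPy sketch on a toy discrete domain (our own kernel and names, assuming the uniform inner product over X):

```python
import numpy as np

rng = np.random.default_rng(0)
M, n = 20, 5
thetas = 2 * np.pi * np.arange(M) / M                    # discretized unit circle
K = np.exp(np.cos(thetas[:, None] - thetas[None, :]))    # toy positive-definite kernel
f = np.cos(thetas)                                        # target: a kernel eigenfunction

def d_learnability(train_idx):
    """D-learnability <f, f_hat> / <f, f> for one training set."""
    K_dd = K[np.ix_(train_idx, train_idx)]
    f_hat = K[:, train_idx] @ np.linalg.solve(K_dd, f[train_idx])
    return (f @ f_hat) / (f @ f)                          # 1/M factors cancel

samples = [d_learnability(rng.choice(M, n, replace=False)) for _ in range(200)]
L_est = float(np.mean(samples))
assert 0.0 <= L_est <= 1.0          # learnability is bounded in [0, 1]
```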
By definition, any kernel function is symmetric and positive-semidefinite. This implies that we can find a set of M orthonormal eigenfunctions φ_i and nonnegative eigenvalues λ_i that satisfy

    (1/M) Σ_{x' ∈ X} K(x, x') φ_i(x') = λ_i φ_i(x).

For simplicity (and to ensure that K(D, D) is invertible), we will assume that K is in fact positive definite and λ_i > 0 for all i, an assumption that will hold in most cases of interest. (By Mercer's Theorem, one can also find the eigensystem of any kernel on a continuous input space. We do not use the reproducing kernel Hilbert space (RKHS) formalism in this work, but we note that, by the Moore–Aronszajn theorem, the kernel K defines a unique RKHS.)
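On a discrete domain, the eigensystem is just a matrix diagonalization. A small NumPy sketch (a toy kernel and our own normalization convention, with the 1/M factor of the uniform inner product folded in):

```python
import numpy as np

M = 16
thetas = 2 * np.pi * np.arange(M) / M
K = np.exp(np.cos(thetas[:, None] - thetas[None, :]))    # toy PD kernel on the circle

# Solve (1/M) sum_{x'} K(x, x') phi_i(x') = lambda_i phi_i(x) by diagonalizing K/M
evals, evecs = np.linalg.eigh(K / M)
evals, evecs = evals[::-1], evecs[:, ::-1]               # sort eigenvalues descending
phis = evecs * np.sqrt(M)                                # <phi_i, phi_i> = 1 under (1/M) sum

assert np.all(evals > 0)                                 # kernel is positive definite
gram = phis.T @ phis / M
assert np.allclose(gram, np.eye(M))                      # orthonormal eigenfunctions
```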
We will now translate Equation 1 to this eigenbasis. First we decompose f and f̂ into weighted sums of the eigenfunctions as

    f = Σ_i v_i φ_i,    f̂ = Σ_i v̂_i φ_i,

where v and v̂ are vectors of coefficients. Using this notation, MSE^(D)(f) is |v − v̂|² and D-learnability is (v · v̂) / |v|².
Noting that K(x, x') = Σ_i λ_i φ_i(x) φ_i(x'), we can decompose the kernel matrix as K(D, D) = M Φ^T Λ Φ, where Φ is the M × n "design matrix" with Φ_ia = M^{-1/2} φ_i(x_a) (so that Φ has orthonormal columns) and Λ is a diagonal matrix of eigenvalues. The learned coefficients are then given by v̂ = Λ Φ (Φ^T Λ Φ)^{-1} Φ^T v. Stacking these coefficients into a matrix equation, we find that

    v̂ = T(D) v,    where    T(D) ≡ Λ Φ (Φ^T Λ Φ)^{-1} Φ^T

is an M × M matrix, independent of f, that fully describes the model's learning behavior on a training set D. We call this fundamental quantity the "learning transfer matrix." If we can determine the statistical properties of this matrix, we will understand the learning behavior of our model.
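The learning transfer matrix is easy to exhibit numerically. The following hypothetical NumPy sketch (our own toy kernel and normalization) builds T(D) = Λ Φ (Φ^T Λ Φ)^{-1} Φ^T and checks that it maps the target's eigencoefficients to kernel regression's learned coefficients and that it is idempotent:

```python
import numpy as np

rng = np.random.default_rng(1)
M, n = 16, 6
thetas = 2 * np.pi * np.arange(M) / M
K = np.exp(np.cos(thetas[:, None] - thetas[None, :]))    # toy positive-definite kernel

evals, evecs = np.linalg.eigh(K / M)                     # (1/M) sum K phi = lambda phi
Lam = np.diag(evals)
phis = evecs * np.sqrt(M)                                # phis[x, i] = phi_i(x)

idx = rng.choice(M, n, replace=False)                    # random training set D
Phi = phis[idx, :].T / np.sqrt(M)                        # M x n design matrix, Phi^T Phi = I

T = Lam @ Phi @ np.linalg.inv(Phi.T @ Lam @ Phi) @ Phi.T # learning transfer matrix

f = np.cos(thetas)                                       # a target function
v = phis.T @ f / M                                       # its eigenbasis coefficients
f_hat = K[:, idx] @ np.linalg.solve(K[np.ix_(idx, idx)], f[idx])
v_hat = phis.T @ f_hat / M

assert np.allclose(T @ v, v_hat, atol=1e-6)              # v_hat = T(D) v
assert np.allclose(T @ T, T, atol=1e-6)                  # T(D) is idempotent
```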
We also define the mean learning transfer matrix T̄ ≡ E_D[T(D)]. D-learnability and learnability are then respectively given by L^(D)(f) = (v^T T(D) v) / (v^T v) and L(f) = (v^T T̄ v) / (v^T v).
The following lemma gives basic properties of the quantities defined above.
Lemma 1. The following properties of T(D), T̄, L^(D), and L hold:
(a) L^(D)(φ_i) = [T(D)]_ii, and L(φ_i) = T̄_ii.
(b) When n = 0, T(D) = 0 and L^(D)(f) = L(f) = 0.
(c) When D = X, T(D) = I and L^(D)(f) = L(f) = 1.
(d) All eigenvalues of T(D) are in {0, 1}, all eigenvalues of T̄ are in [0, 1], and so L^(D)(f), L(f) ∈ [0, 1].
(e) Let D' be D ∪ {x'}, where x' is a new data point. Then L^(D')(f) ≥ L^(D)(f).
(f) For any i, ∂L(φ_i)/∂λ_i ≥ 0, hence increasing an eigenvalue can only improve the learnability of the corresponding eigenfunction.
(g) For any i and any j ≠ i, ∂L(φ_i)/∂λ_j ≤ 0, hence increasing any other eigenvalue can only worsen it.
Property (a) in Lemma 1 formalizes the relationship between the transfer matrix and learnability. Properties (b-e) together give an intuitive picture of the learning process: the learning transfer matrix monotonically interpolates between zero and the identity as the training set grows — adding data never harms the learnability of any function. Properties (f-g) show that the kernel eigenmodes are in competition: increasing one eigenvalue improves the learnability of the corresponding eigenfunction, but decreases the learnabilities of all others. We prove Lemma 1 in Appendix D.
We now present our first major result.
For any complete basis of orthogonal functions {φ_i}_{i=1}^M,

    Σ_{i=1}^M L(φ_i) = n.
The proof, which hinges on the intermediate result that Tr[T(D)] = n, is given in Appendix E. We note that this result is stronger than the ordinary "no-free-lunch" theorem for learning algorithms, which requires averaging over all target functions instead of merely an orthonormal basis (Wolpert, 1996). To understand the significance of this result, consider that one might naively hope to design a neural kernel that achieves generally high performance for all target functions. Theorem 1 states that this is impossible: averaged over a complete basis of functions, all kernels achieve the same learnability. This exact result implies that, because there exist no universally high-performing kernels, we must instead aim to choose a kernel whose high-eigenvalue modes align well with the function to be learned. To our knowledge, this is the first exact result quantifying such a tradeoff in kernel regression or deep learning. We note that Theorem 1 also applies to linear regression, a special case of kernel regression with the linear kernel K(x, x') = x · x'.
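The conservation law is easy to check numerically: for a fixed training set, the D-learnabilities of any complete orthonormal basis sum to exactly n, whatever the kernel. A hypothetical NumPy sketch (toy kernels and a random orthonormal basis, our own construction):

```python
import numpy as np

rng = np.random.default_rng(2)
M, n = 12, 5
thetas = 2 * np.pi * np.arange(M) / M
dtheta = thetas[:, None] - thetas[None, :]
idx = rng.choice(M, n, replace=False)                     # one fixed training set

def total_learnability(K, basis):
    """Sum of D-learnabilities over an orthonormal basis (columns of `basis`)."""
    coef = np.linalg.solve(K[np.ix_(idx, idx)], basis[idx, :])
    basis_hat = K[:, idx] @ coef                          # kernel-regression predictions
    return float(np.sum(np.sum(basis * basis_hat, axis=0) / np.sum(basis**2, axis=0)))

# Two quite different kernels, each with an arbitrary orthonormal basis
for bandwidth in (1.0, 3.0):
    K = np.exp(bandwidth * np.cos(dtheta))
    basis, _ = np.linalg.qr(rng.standard_normal((M, M)))  # random orthonormal basis
    assert np.isclose(total_learnability(K, basis), n)    # total is always exactly n
```

The sum equals n because it is the trace of the prediction operator, which is invariant to the choice of basis and, by the memorization property of kernel regression, equals the number of training points for any kernel.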
Illustrating a consequence of this tradeoff, the following theorem states that, for any kernel, there will always be functions poorly aligned with the kernel's inductive bias, on which the model generalizes as badly as or worse than it would by simply predicting zero on all unseen test points.
There is always at least one eigenfunction φ_i for which L(φ_i) ≤ L_0 and MSE(φ_i) ≥ MSE_0, where L_0 and MSE_0 are the learnability and mean MSE given by a naive, non-generalizing model with predictions given by f̂_0(x) = f(x) for x ∈ D and f̂_0(x) = 0 otherwise.
We will now derive an approximation for T̄ and the second moments of T(D), ultimately yielding simple yet accurate expressions for learnability, MSE, and the other first- and second-order statistics of f̂. We sketch our method and state our results here and provide a full derivation in Appendix F.
We begin by noting that the expectation in T̄ = E_D[T(D)] is essentially an expectation over a combinatorially large set of possible design matrices Φ, each of which has orthonormal columns. We replace this with an average over all M × n matrices Φ satisfying Φ^T Φ = I_n, which we write

    T̄ ≈ ⟨Λ Φ (Φ^T Λ Φ)^{-1} Φ^T⟩_Φ.    (8)
We can then readily prove by symmetry that the off-diagonal elements of T̄ vanish, leaving the question of how each diagonal element T̄_ii depends on the set of eigenvalues. To probe this, we consider adding an (M+1)-th "test eigenmode" to the problem, augmenting the principal quantities as

    Λ → [[Λ, 0], [0, λ_{M+1}]],    Φ → [[Φ], [φ_{M+1}^T]],    (9)

where λ_{M+1} is the new mode's eigenvalue and φ_{M+1} is the n-vector of the new mode's values at the training points.
Using the Sherman-Morrison formula, we find that the new mode's learnability is given by

    L(φ_{M+1}) = E_D[ λ_{M+1} / (λ_{M+1} + C(D)) ],

where C(D) is a nonnegative dataset-dependent constant. For large n and realistic eigenspectra, C(D) is distributed tightly around its mean, and so we can replace it with a constant, which we call C. We emphasize that C is the same for every eigenmode. Using Theorem 1 to obtain a constraint on C, we reach our ultimate approximation that

    L(φ_i) = λ_i / (λ_i + C),    where C ≥ 0 satisfies Σ_i λ_i / (λ_i + C) = n.    (11)
From Lemma 1, we immediately obtain the following major result:
Under the above approximations, the learnability of a function f = Σ_i v_i φ_i is given by

    L(f) = (1 / Σ_i v_i²) Σ_i (λ_i / (λ_i + C)) v_i²,    (12)

with C given by Equation 11.
A function is thus more learnable the more weight it places in high eigenvalue modes. We show in Section 3 that Equation 12 is in excellent agreement with experiments using both kernel regression and trained finite networks.
The learnability of each eigenmode depends critically on the value of C. We now present several properties characterizing how C depends on n.
For C satisfying the constraint in Equation 11, with the eigenvalues λ_i ordered from greatest to least, the following properties hold:
(a) C → ∞ when n → 0, and C → 0 when n → M.
(b) C is strictly decreasing with n.
(c) C ≤ (1/n) Σ_i λ_i for all n.
(d) C ≥ (1/n) Σ_{i > n} λ_i for all n.
Properties (a-b), paired with Equation 11, paint a surprisingly simple picture of the learning process: as the training set grows, C gradually decreases, and eigenmodes are learned high-to-low as C passes each eigenvalue λ_i. Properties (c-d) provide bounds on C, which, in addition to then yielding bounds on learnability, can be used to furnish an initial guess when numerically solving for C.
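Since Σ_i λ_i/(λ_i + C) is strictly decreasing in C, the constraint can be solved by simple bisection, with (1/n) Σ_i λ_i as an upper bracket. A hypothetical NumPy sketch (our own solver, not the paper's code):

```python
import numpy as np

def solve_C(evals, n, iters=200):
    """Bisect for the C > 0 satisfying sum_i evals_i / (evals_i + C) = n."""
    g = lambda C: np.sum(evals / (evals + C)) - n
    lo, hi = 0.0, np.sum(evals) / n      # C is bracketed by 0 and (1/n) sum_i lambda_i
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

evals = 1.0 / np.arange(1, 101) ** 2     # toy power-law spectrum, M = 100
Cs = [solve_C(evals, n) for n in (5, 20, 50)]
assert Cs[0] > Cs[1] > Cs[2] > 0         # C decreases as the training set grows
for n, C in zip((5, 20, 50), Cs):
    L = evals / (evals + C)              # predicted eigenmode learnabilities
    assert np.isclose(L.sum(), n)        # the defining constraint / no free lunch
    assert np.all(np.diff(L) <= 0)       # higher eigenvalue, higher learnability
```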
We now have an expression for the mean of v̂, but many quantities of interest, including MSE, depend on second-order fluctuations about that mean. In Appendix G, we show how, by taking a derivative with respect to λ_i, we can obtain expressions for the second-order statistics of v̂ and, as a result, for MSE. Our main second-order result is that

    Cov_D[v̂_i, v̂_j] ≈ δ_ij (L(φ_i)² / n) MSE(f).    (13)

Using Equation 13, we can study the admixtures of particular spurious modes in the learned function f̂. For example, if f = φ_j and i ≠ j, then we find that

    E_D[v̂_i²] ≈ (L(φ_i)² / n) MSE(φ_j).

We note that E_D[v̂_i²] increases with λ_i, reinforcing our broad conclusion that the model is biased towards high-eigenvalue modes. Finally, by noting that MSE(f) = Σ_i E_D[(v_i − v̂_i)²], we can recover the expression of Bordelon et al. (2020) for MSE:

    MSE(f) ≈ E_0 Σ_i (1 − L(φ_i))² v_i²,    where    E_0 ≡ n / (n − Σ_i L(φ_i)²).

Our experiments corroborate existing evidence that this is an excellent approximation for MSE.
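Putting the pieces together, an MSE estimate can be computed from the spectrum and target coefficients alone. The sketch below assumes the form MSE(f) ≈ E_0 Σ_i (1 − L_i)² v_i² with E_0 = n/(n − Σ_i L_i²), which is our restatement of the Bordelon et al. (2020) expression in the notation above; treat the exact form as an assumption of this illustration:

```python
import numpy as np

def solve_C(evals, n, iters=200):
    """Bisect for C satisfying sum_i evals_i / (evals_i + C) = n."""
    lo, hi = 0.0, np.sum(evals) / n
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if np.sum(evals / (evals + mid)) > n else (lo, mid)
    return 0.5 * (lo + hi)

def predicted_mse(evals, v, n):
    """Assumed form: E0 * sum_i (1 - L_i)^2 v_i^2, E0 = n / (n - sum_i L_i^2)."""
    C = solve_C(evals, n)
    L = evals / (evals + C)
    E0 = n / (n - np.sum(L**2))          # overfitting factor; denominator is positive
    return E0 * np.sum((1 - L) ** 2 * v**2)

evals = 1.0 / np.arange(1, 51) ** 2      # toy power-law spectrum
v = np.zeros(50)
v[2] = 1.0                               # target is the third eigenmode
mses = [predicted_mse(evals, v, n) for n in (2, 10, 40)]
assert mses[0] > mses[1] > mses[2] > 0   # predicted MSE falls as the mode is learned
```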
Expanding MSE(φ_i) for small n, we find that

    C ≈ (1/n) Σ_k λ_k,    MSE(φ_i) ≈ 1 + n ( Σ_k λ_k² / (Σ_k λ_k)² − 2 λ_i / Σ_k λ_k ) + O(n²).    (16)

The second equation implies that MSE(φ_i) initially grows with n for all modes such that λ_i < Σ_k λ_k² / (2 Σ_k λ_k), which suggests that, for such low-eigenvalue modes, MSE increases as the training set size grows from zero. In practice, such low modes are commonplace; in fact, for a bounded kernel on a continuous input space (for which M → ∞), arbitrarily low modes are inevitable due to the constraint that Σ_i λ_i is bounded. (This constraint follows from the fact that Σ_i λ_i equals the mean of K(x, x) over the domain.) We therefore expect that there quite often exist functions for which a given neural network yields an MSE nonmonotonic with n. Figure A1 shows that experiments confirm this surprising first-principles prediction. We emphasize that this is a different, more generic phenomenon than that noted by Canatar et al. (2021), which required that the target was either noisy or placed weight in unlearnable zero-eigenvalue modes.
Here we describe experiments confirming all our theoretical predictions for both finite networks and exact NTK regression. Unless otherwise stated, all experiments used a fully-connected (FC) four-hidden-layer (4L) ReLU architecture with width 500. Because this model is FC, it has a rotation-invariant NTK (Lee et al., 2019). These experiments used three distinct input spaces X. For each, the eigenmodes of the NTK can be grouped into degenerate subsets indexed by an integer k, with higher k corresponding to faster variation in input space. In all cases, we find that as k increases, eigenvalues decrease, in concordance with the widespread belief that neural nets have a "spectral bias" towards slowly-varying functions (Rahaman et al., 2019; Canatar et al., 2021; Cao et al., 2019). We now describe these three input spaces. For full experimental details, see Appendix H.
Discretized Unit Circle. The simplest input space we consider is the discretization of the unit circle into M points, X = {(cos θ_j, sin θ_j) : θ_j = 2πj/M}. The eigenfunctions on this domain are the constant function, cos(kθ), and sin(kθ), for k = 1, 2, ..., M/2.
Hypercube. The next input space we consider is the set of vertices of the d-dimensional hypercube, X = {−1, +1}^d, giving M = 2^d. The eigenfunctions on this domain are the subset-parity functions φ_S(x) = Π_{i∈S} x_i, where S ⊆ {1, ..., d} indicates the elements of x to which the output is sensitive (Yang and Salman, 2019). Here we define k = |S|.
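As a quick check of this picture, the subset-parity functions are easy to generate and are exactly orthonormal under the uniform inner product on {−1, +1}^d; a small NumPy sketch (our own code, not the paper's):

```python
import numpy as np
from itertools import product, combinations

d = 4
X = np.array(list(product([-1, 1], repeat=d)))              # all 2^d hypercube vertices

def parity(S):
    """Subset-parity function phi_S(x) = prod_{i in S} x_i on the hypercube."""
    return np.prod(X[:, list(S)], axis=1) if S else np.ones(len(X))

subsets = [S for k in range(d + 1) for S in combinations(range(d), k)]
Phi = np.array([parity(S) for S in subsets], dtype=float)   # 2^d parity functions

# Subset parities are orthonormal under the uniform inner product on the hypercube
gram = Phi @ Phi.T / len(X)
assert np.allclose(gram, np.eye(2 ** d))
```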
Hypersphere. To illustrate that our results extend easily to continuous input spaces, we consider the unit hypersphere. The eigenfunctions on this domain are the hyperspherical harmonics (see, e.g., Frye and Efthimiou (2012); Bordelon et al. (2020)), which group into degenerate sets indexed by k. The corresponding eigenvalues decrease exponentially with k, and so when summing over all eigenmodes to compute C and MSE, we simply truncate the sum at a modest maximum k.
Figure I5 shows the 4L ReLU NTK eigenvalues on each domain. We now describe our experiments.
Warmup on the unit circle. As an example problem to illustrate our theory, we consider learning two eigenmodes on the discretized unit circle. We use trained networks to estimate learnability, MSE, and spurious-mode coefficients for both modes as n varies and compare to our theoretical predictions. The results show that our theoretical expressions give excellent agreement with experiment on all counts (Figure 1).
Predicting learnability. We next use both finite nets and exact NTK regression to learn several eigenmodes on all three domains and compare true learnabilities with our theoretical predictions. Our theory again predicts true generalization behavior quite well (Figure 2). We further show that a low-eigenvalue mode on the hypercube has worse-than-chance generalization, an inevitable consequence of its low eigenvalue and Theorem 2. We note that finite networks and NTK regression give very similar results, supporting the validity of approximating networks with the NTK.
Universal form for learnability. We next fix n and plot learnability vs. eigenvalue for eight modes over each domain. In each case we numerically solve for C, and by rescaling the eigenvalues by C (a symmetry which leaves Equation 1 invariant), we see that for any neural network learning problem with any training set size n, eigenmode learnability always lies on one universal curve, L = λ/(λ + C) (Figure 3).
No free lunch for neural networks. We then experimentally confirm that our no-free-lunch result applies to finite networks as well as kernel regression (Figure 4). We use both models to learn a function on the discretized unit circle and sum the resulting D-learnabilities of each eigenmode. Unlike in our other experiments, we sample only one dataset D for each n; even without the benefit of averaging over datasets, we find that total D-learnability is always conserved.
Nonmonotonic MSE curves. We next confirm our first-principles prediction of nonmonotonic MSE curves at small n. We plot MSE curves for four eigenmodes on each domain, three of which Equation 16 predicts will have MSE initially increasing with n. MSE is indeed increasing as predicted (Figure A1).
Agreement with narrow networks. Finally, we repeat the experiment of Figure 2B for varying network width. We find that our theory gives accurate predictions of learnability and MSE even for networks of width as small as 20 (Figures B2 and B3). This surprising agreement suggests our theory will faithfully predict generalization in practical deep learning systems.
We have presented a first-principles theory of neural network generalization that efficiently and accurately predicts many measures of generalization performance. This theory offers new insight into neural networks’ inductive bias and provides a general framework for understanding their learning behavior, opening the door to the principled study of many other deep learning mysteries.
The authors would like to thank Zack Weinstein for useful discussions and Sajant Anand, Jesse Livezey, Roy Rinberg, Jascha Sohl-Dickstein, and Liu Ziyin for helpful comments on the manuscript. This research was supported in part by the U.S. Army Research Laboratory and the U.S. Army Research Office under contract W911NF-20-1-0151. JS gratefully acknowledges support from the National Science Foundation Graduate Fellow Research Program (NSF-GRFP) under grant DGE 1752814.
Belkin, M., Hsu, D., Ma, S., and Mandal, S. (2019). Reconciling modern machine-learning practice and the classical bias–variance trade-off. Proceedings of the National Academy of Sciences 116 (32), pp. 15849–15854.
In the main text, we assume prior familiarity with the NTK, using Equation 1 as the starting point of our derivations. Here we provide a definition and very brief introduction to the NTK for unfamiliar readers. For derivations and full discussions, see Jacot et al. (2018) and Lee et al. (2019).
Consider a feedforward neural network representing a function f(x; θ), where θ is a parameter vector. Further consider one training example x_1 with target value y_1 and one test point x, and suppose we perform one step of gradient descent with a small learning rate η with respect to the MSE loss L = (1/2)(f(x_1; θ) − y_1)². This gives the parameter update

    Δθ = −η ∇_θ L = η (y_1 − f(x_1; θ)) ∇_θ f(x_1; θ).
We now wish to know how this parameter update changes f(x). To do so, we linearize about θ, finding that

    Δf(x) = ∇_θ f(x; θ) · Δθ + O(|Δθ|²) = η (y_1 − f(x_1; θ)) K(x, x_1) + O(η²),

where we have defined K(x, x_1) ≡ ∇_θ f(x; θ) · ∇_θ f(x_1; θ). This quantity is the NTK. Remarkably, as network width goes to infinity (the "width" parameter varies by architecture; for example, it is the minimal hidden layer width for fully-connected networks and the minimal number of channels per hidden layer for a convolutional network), the O(η²) corrections become negligible, and K is the same after any random initialization (assuming the parameters are drawn from the same distribution) and at any time during training. This dramatically simplifies the analysis of network training, allowing one to prove that after infinite time training on MSE loss for an arbitrary dataset, the network's learned function is given by Equation 1. See, for example, Equations 14-16 of Lee et al. (2019). (We note that there exists a different infinite-width kernel, called the "NNGP kernel," describing a network's random initialization; this reference uses K for the NNGP kernel and Θ for the NTK.)
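The defining dot product of gradients can be checked empirically on a tiny network. The sketch below (our own toy model, using finite-difference gradients rather than autodiff) computes K(x_1, x_2) = ∇_θ f(x_1; θ) · ∇_θ f(x_2; θ) for a one-hidden-layer ReLU net and verifies two properties any such kernel must have: symmetry and positive semidefiniteness of its Gram matrices:

```python
import numpy as np

rng = np.random.default_rng(3)
d_in, width = 2, 64                                   # tiny one-hidden-layer ReLU net
W1_size = width * d_in
theta0 = np.concatenate([rng.standard_normal(W1_size) / np.sqrt(d_in),
                         rng.standard_normal(width) / np.sqrt(width)])

def net(x, theta):
    W1 = theta[:W1_size].reshape(width, d_in)
    w2 = theta[W1_size:]
    return float(w2 @ np.maximum(W1 @ x, 0.0))

def grad(x, theta, eps=1e-5):
    """Finite-difference gradient of the network output w.r.t. all parameters."""
    g = np.empty_like(theta)
    for i in range(len(theta)):
        dt = np.zeros_like(theta)
        dt[i] = eps
        g[i] = (net(x, theta + dt) - net(x, theta - dt)) / (2 * eps)
    return g

def ntk(x1, x2, theta):
    return grad(x1, theta) @ grad(x2, theta)          # K(x1, x2) = grad . grad

x1, x2 = np.array([1.0, 0.5]), np.array([-0.3, 2.0])
K11, K22, K12 = ntk(x1, x1, theta0), ntk(x2, x2, theta0), ntk(x1, x2, theta0)
assert np.isclose(K12, ntk(x2, x1, theta0))           # the kernel is symmetric
assert K11 > 0 and K22 > 0
assert K12**2 <= K11 * K22 + 1e-9                     # every 2x2 Gram is PSD
```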
Property (a): L^(D)(φ_i) = [T(D)]_ii, and L(φ_i) = T̄_ii.
Proof. Using the fact that v = e_i when f = φ_i, we see that L^(D)(φ_i) = e_i^T T(D) e_i = [T(D)]_ii, where e_i is a one-hot M-vector with the one at index i. The second clause of the property follows by averaging.
Property (b): When n = 0, T(D) = 0 and L^(D)(f) = L(f) = 0.
Proof. When n = 0, Φ has no columns, and thus T(D) = 0. The other clauses follow from Property (a) and averaging.
Property (c): When D = X, T(D) = I and L^(D)(f) = L(f) = 1.
Proof. When D = X, Φ is a full-rank M × M matrix. Inspection of the formula for T(D) shows that T(D) = Λ Φ (Φ^T Λ Φ)^{-1} Φ^T = I. The other clauses follow from Property (a) and averaging.
Property (d): All eigenvalues of T(D) are in {0, 1}, all eigenvalues of T̄ are in [0, 1], and so L^(D)(f), L(f) ∈ [0, 1].
Proof. From the definition of T(D), it is easy to see that T(D)² = T(D). T(D) is thus idempotent, with all eigenvalues in {0, 1}. The fact that all eigenvalues of T̄ are in [0, 1] follows by averaging. The stated properties of L^(D) and L follow from the fact that, for any compatible vector v and matrix A, the quotient v^T A v / v^T v is bounded by the maximum and minimum eigenvalues of A.
Property (e): Let D' be D ∪ {x'}, where x' is a new data point. Then L^(D')(f) ≥ L^(D)(f).
Proof. To begin, we use the Moore-Penrose pseudoinverse, which we denote by ⁺, to cast T into a more transparent form:

    T = Λ^{1/2} (Λ^{1/2} Φ Φ^T Λ^{1/2}) (Λ^{1/2} Φ Φ^T Λ^{1/2})⁺ Λ^{-1/2},

where we have suppressed the D in T(D). This follows from the property of pseudoinverses that A (A^T A)^{-1} A^T = (A A^T)(A A^T)⁺ for any matrix A. We now augment our system with one extra data point, getting

    Φ' = [Φ, φ'],

where φ' is an M-element column vector orthonormal to the others of Φ'. We now convert the pseudoinverse into an inverse with a limit, getting

    T' = lim_{ε→0⁺} Λ^{1/2} B' (B' + ε I)^{-1} Λ^{-1/2},    where B' ≡ Λ^{1/2} Φ' Φ'^T Λ^{1/2}.    (22)

We now use the Sherman-Morrison matrix inversion formula to expand (B + u u^T + ε I)^{-1}, where B ≡ Λ^{1/2} Φ Φ^T Λ^{1/2} and u ≡ Λ^{1/2} φ'. Because u u^T and the inverted matrix in Equation 22 are positive semidefinite, we conclude that, for any M-vector v, it will hold that v^T T' v ≥ v^T T v. The desired property follows.
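The Sherman-Morrison formula used above, (A + u u^T)^{-1} = A^{-1} − (A^{-1} u)(u^T A^{-1}) / (1 + u^T A^{-1} u), is easy to verify numerically; a minimal NumPy check (with a positive-definite A, so the denominator exceeds 1):

```python
import numpy as np

rng = np.random.default_rng(4)
m = 6
A = rng.standard_normal((m, m))
A = A @ A.T + m * np.eye(m)               # a well-conditioned positive-definite matrix
u = rng.standard_normal(m)

Ainv = np.linalg.inv(A)
denom = 1.0 + u @ Ainv @ u                # > 1 since A is positive definite
sm = Ainv - np.outer(Ainv @ u, u @ Ainv) / denom

assert np.allclose(sm, np.linalg.inv(A + np.outer(u, u)))   # Sherman-Morrison holds
```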
Property (f): For any i, ∂[T(D)]_ii/∂λ_i ≥ 0, hence ∂L(φ_i)/∂λ_i ≥ 0.
Proof. Differentiating [T(D)]_ii = λ_i φ_i^T G^{-1} φ_i with respect to a particular λ_j, we find that

    ∂[T(D)]_ii/∂λ_j = δ_ij φ_i^T G^{-1} φ_i − λ_i (φ_i^T G^{-1} φ_j)²,

where φ_i is the i-th row of Φ and G ≡ Φ^T Λ Φ. Specializing to the case j = i, we note that φ_i^T G^{-1} φ_i ≥ 0 because G^{-1} is positive semidefinite, and λ_i φ_i^T G^{-1} φ_i ≤ 1 because λ_i φ_i φ_i^T is one of the positive semidefinite summands in G. The desired property follows.
Property (g): For any i and any j ≠ i, ∂[T(D)]_ii/∂λ_j ≤ 0, hence ∂L(φ_i)/∂λ_j ≤ 0.
Proof. Differentiating as in the proof of Property (f) and using the fact that δ_ij = 0, we see that

    ∂[T(D)]_ii/∂λ_j = −λ_i (φ_i^T G^{-1} φ_j)²,

which is manifestly nonpositive because λ_i ≥ 0. The desired property follows.
Proof of Theorem 1. First, we note that, for any complete orthogonal basis {f_i} on X,

    Σ_i L^(D)(f_i) = Σ_i (w_i^T T(D) w_i) / (w_i^T w_i),

where {w_i} is an orthogonal set of vectors spanning R^M. This is equivalent to Tr[T(D)]. This trace is given by

    Tr[T(D)] = Tr[(Φ^T Λ Φ)^{-1} Φ^T Λ Φ] = Tr[I_n] = n,

which, upon averaging over datasets, proves the desired theorem. ∎
Proof of Theorem 2. First, we note that the naive, nongeneralizing model described in the theorem will always have a learnability of n/M. As per Theorem 1, this is exactly the mean learnability of the M eigenfunctions, so either all have precisely this learnability or else at least one has lower learnability. This proves the clause of the theorem specific to learnability.
To prove the clause specific to MSE, we now note that, as defined by Equation 1, kernel regression will always perfectly memorize the training data, so for both the naive model and kernel regression, MSE is given by

    MSE^(D)(f) = (1/M) Σ_{x ∉ D} (f(x) − f̂(x))² = (1/M) Σ_{x ∉ D} f(x)² + (1/M) Σ_{x ∉ D} f̂(x)² − (2/M) Σ_{x ∉ D} f(x) f̂(x).

The first term is the same for any model, the second term is nonnegative but zero for the naive model, and the third term is equivalent to −2(L^(D)(f) − n/M)⟨f, f⟩ in expectation over eigenfunctions, which must be nonnegative for at least one eigenfunction. Kernel regression therefore gives worse MSE than the naive model unless, as for the naive model, f̂(x) = 0 for x ∉ D. ∎
Here we provide details of the derivations of our approximation for T̄ and of Lemma 3. Whenever possible, we leave it implicit that all expressions for T̄ in this appendix are approximations and use the symbol = for equality for simplicity.
We begin by taking the approximation of Equation 8 that

    T̄ = ⟨Λ Φ (Φ^T Λ Φ)^{-1} Φ^T⟩_Φ,

where the expectation is taken over all M × n matrices Φ satisfying Φ^T Φ = I_n with the uniform measure. It turns out that we can equivalently average over all matrices Φ with a mean-zero i.i.d. Gaussian measure over each element, without changing T̄. To see this, note that this equation is symmetric under right-multiplication of Φ by arbitrary invertible matrices R:

    Λ (ΦR) ((ΦR)^T Λ (ΦR))^{-1} (ΦR)^T = Λ Φ (Φ^T Λ Φ)^{-1} Φ^T.    (30)

Letting R be a Φ-dependent matrix which orthonormalizes its columns, it is straightforward to see that we can convert from an average over Gaussian Φ to an average over only orthonormal Φ, and vice-versa, as desired. Moving forward, we will exploit this equivalence and assume that each element of Φ is sampled i.i.d. from a standard Gaussian. This assumption will largely remain in the background, but will be useful later.
We now evaluate T̄, starting with the off-diagonal elements. We observe that the Gaussian measure over Φ is invariant under Φ → OΦ for any orthogonal matrix O. Defining O^(i) as the diagonal matrix such that O^(i)_kk = 1 − 2δ_ik (a sign flip of the i-th row), noting that O^(i)^T Λ O^(i) = Λ, and plugging in Φ → O^(i)Φ as in Equation 30, we find that

    T̄ = O^(i) T̄ O^(i).

By choosing each i in turn, we conclude that T̄_ij = −T̄_ij = 0 if i ≠ j.
To evaluate the diagonal elements of T̄, we consider augmenting our eigensystem with a new "test eigenmode" with eigenvalue λ_{M+1}, as described in the main text and formalized in Equation 9. To proceed, we make the core assumption that the addition of a single test mode negligibly perturbs T̄ when M is large. (Empirically, we see that this approximation holds already for realistic values of M and n used in neural net experiments.) This assumption, combined with the symmetry T̄_ii = T̄_jj whenever λ_i = λ_j, implies that T̄_ii equals the test mode's learnability evaluated at λ_{M+1} = λ_i. Therefore, to evaluate T̄_ii, it suffices to evaluate L(φ_{M+1}) as a function of λ_{M+1}.
We now manipulate the expression for T̄_{M+1,M+1} to isolate λ_{M+1}. Using the Sherman-Morrison matrix inversion formula, we obtain