Finite Versus Infinite Neural Networks: an Empirical Study

07/31/2020
by Jaehoon Lee et al.

We perform a careful, thorough, and large-scale empirical study of the correspondence between wide neural networks and kernel methods. In doing so, we resolve a variety of open questions related to the study of infinitely wide neural networks. Our experimental results include the following:

- Kernel methods outperform fully-connected finite-width networks, but underperform convolutional finite-width networks.
- Neural network Gaussian process (NNGP) kernels frequently outperform neural tangent (NT) kernels (see the kernel-prediction sketch below).
- Centered and ensembled finite networks have reduced posterior variance and behave more similarly to infinite networks (see the centering sketch below).
- Weight decay and the use of a large learning rate break the correspondence between finite and infinite networks.
- The NTK parameterization outperforms the standard parameterization for finite-width networks.
- Diagonal regularization of kernels acts similarly to early stopping.
- Floating-point precision limits kernel performance beyond a critical dataset size.
- Regularized ZCA whitening improves accuracy (see the whitening sketch below).
- Finite-network performance depends non-monotonically on width in ways not captured by double-descent phenomena.
- Equivariance of CNNs is only beneficial for narrow networks far from the kernel regime.

Our experiments additionally motivate an improved layer-wise scaling for weight decay, which improves generalization in finite-width networks. Finally, we develop improved best practices for using NNGP and NT kernels for prediction, including a novel ensembling technique. Using these best practices, we achieve state-of-the-art results on CIFAR-10 classification for kernels corresponding to each architecture class we consider.
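Several of these findings (NNGP versus NT kernel quality, and diagonal regularization behaving like early stopping) concern kernel regression with infinite-width kernels. A minimal sketch follows, using the open-source Neural Tangents library; the fully-connected architecture, widths, and diag_reg value are illustrative placeholders rather than the paper's exact experimental configuration.

```python
# Kernel-prediction sketch: NNGP / NTK regression with diagonal
# regularization, via Neural Tangents (pip install neural-tangents).
# Architecture and hyperparameters are illustrative placeholders.
import neural_tangents as nt
from neural_tangents import stax

# A simple fully-connected network; kernel_fn evaluates its
# infinite-width NNGP and NT kernels in closed form.
_, _, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(10),
)

def kernel_predict(x_train, y_train, x_test, diag_reg=1e-4):
    # diag_reg adds a multiple of the identity to the train-train
    # kernel; the paper finds this acts similarly to early stopping.
    predict_fn = nt.predict.gradient_descent_mse_ensemble(
        kernel_fn, x_train, y_train, diag_reg=diag_reg)
    # t=None (the default) gives fully-converged, infinite-time
    # predictions for both the NNGP and NTK predictors.
    nngp_mean, ntk_mean = predict_fn(x_test=x_test, get=('nngp', 'ntk'))
    return nngp_mean, ntk_mean
```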
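The "centered and ensembled" finite networks above admit a simple description: centering subtracts the network's output at initialization, and ensembling averages centered predictions over independent random initializations. The sketch below assumes that reading; init_fn, train_fn, and apply_fn are hypothetical placeholders for whatever training stack is in use.

```python
# Centering sketch: subtract f(x; params_0) from the trained network's
# output, then average over random initializations. The three callables
# are hypothetical placeholders, not a specific library API.
import numpy as np

def centered_ensemble_predict(init_fn, train_fn, apply_fn, x_test, n_members=8):
    # init_fn(seed) -> params, train_fn(params) -> trained params,
    # apply_fn(params, x) -> predictions.
    preds = []
    for seed in range(n_members):
        params_0 = init_fn(seed)           # fresh random initialization
        params_t = train_fn(params_0)      # train this ensemble member
        # Centering: remove the initial function f(x; params_0).
        preds.append(apply_fn(params_t, x_test) - apply_fn(params_0, x_test))
    # Ensembling: averaging reduces initialization-dependent variance,
    # moving finite networks closer to infinite-width behavior.
    return np.mean(np.stack(preds), axis=0)
```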
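Regularized ZCA whitening, which the abstract reports improves kernel accuracy, is sketched below under a common formulation: rotate inputs into the eigenbasis of the training covariance, rescale by regularized inverse square-root eigenvalues, and rotate back. The relative regularizer eps is an assumed placeholder; the paper's exact regularization scheme may differ in detail.

```python
# Regularized ZCA whitening sketch (common formulation; eps is an
# assumed placeholder, not the paper's exact setting).
import numpy as np

def zca_whiten(x_train, x_test, eps=0.1):
    # Flatten inputs and center with training-set statistics.
    xtr = x_train.reshape(len(x_train), -1).astype(np.float64)
    xte = x_test.reshape(len(x_test), -1).astype(np.float64)
    mean = xtr.mean(axis=0)
    xtr, xte = xtr - mean, xte - mean

    # Eigendecompose the (d x d) training covariance.
    cov = xtr.T @ xtr / len(xtr)
    eigvals, eigvecs = np.linalg.eigh(cov)

    # ZCA transform: U diag(1/sqrt(lambda + reg)) U^T, with the
    # regularizer scaled relative to the mean eigenvalue.
    scale = 1.0 / np.sqrt(eigvals + eps * eigvals.mean())
    w = (eigvecs * scale) @ eigvecs.T
    return xtr @ w, xte @ w
```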

