Limitations of the NTK for Understanding Generalization in Deep Learning

06/20/2022
by   Nikhil Vyas, et al.
0

The “Neural Tangent Kernel” (NTK) (Jacot et al 2018), and its empirical variants have been proposed as a proxy to capture certain behaviors of real neural networks. In this work, we study NTKs through the lens of scaling laws, and demonstrate that they fall short of explaining important aspects of neural network generalization. In particular, we demonstrate realistic settings where finite-width neural networks have significantly better data scaling exponents as compared to their corresponding empirical and infinite NTKs at initialization. This reveals a more fundamental difference between the real networks and NTKs, beyond just a few percentage points of test accuracy. Further, we show that even if the empirical NTK is allowed to be pre-trained on a constant number of samples, the kernel scaling does not catch up to the neural network scaling. Finally, we show that the empirical NTK continues to evolve throughout most of the training, in contrast with prior work which suggests that it stabilizes after a few epochs of training. Altogether, our work establishes concrete limitations of the NTK approach in understanding generalization of real networks on natural datasets.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/22/2022

Learning Deep Neural Networks by Iterative Linearisation

The excellent real-world performance of deep neural networks has receive...
research
06/12/2021

What can linearized neural networks actually say about generalization?

For certain infinitely-wide neural networks, the neural tangent kernel (...
research
10/21/2022

Evolution of Neural Tangent Kernels under Benign and Adversarial Training

Two key challenges facing modern deep learning are mitigating deep netwo...
research
02/12/2021

Explaining Neural Scaling Laws

The test loss of well-trained neural networks often follows precise powe...
research
10/02/2020

On the linearity of large non-linear models: when and why the tangent kernel is constant

The goal of this work is to shed light on the remarkable phenomenon of t...
research
08/29/2022

Neural Tangent Kernel: A Survey

A seminal work [Jacot et al., 2018] demonstrated that training a neural ...
research
11/20/2019

Information in Infinite Ensembles of Infinitely-Wide Neural Networks

In this preliminary work, we study the generalization properties of infi...

Please sign up or login with your details

Forgot password? Click here to reset