Fast Finite Width Neural Tangent Kernel

by Roman Novak, et al.

The Neural Tangent Kernel (NTK), defined as Θ_θ^f(x_1, x_2) = [∂ f(θ, x_1)/∂θ] [∂ f(θ, x_2)/∂θ]^T where [∂ f(θ, ·)/∂θ] is a neural network (NN) Jacobian, has emerged as a central object of study in deep learning. In the infinite width limit, the NTK can sometimes be computed analytically and is useful for understanding training and generalization of NN architectures. At finite widths, the NTK is also used to better initialize NNs, compare the conditioning across models, perform architecture search, and do meta-learning. Unfortunately, the finite width NTK is notoriously expensive to compute, which severely limits its practical utility. We perform the first in-depth analysis of the compute and memory requirements for NTK computation in finite width networks. Leveraging the structure of neural networks, we further propose two novel algorithms that change the exponent of the compute and memory requirements of the finite width NTK, dramatically improving efficiency. Our algorithms can be applied in a black box fashion to any differentiable function, including those implementing neural networks. We open-source our implementations within the Neural Tangents package (arXiv:1912.02803) at
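To make the definition above concrete, the following is a minimal sketch of the naive way to evaluate Θ_θ^f(x_1, x_2) for a single pair of inputs: materialize both Jacobians and contract them over the parameter axes. It illustrates the expensive baseline only. It is not one of the fast algorithms the abstract proposes, and it does not use the Neural Tangents API. The toy two-layer network and the names `init_params`, `f`, and `naive_ntk` are assumptions made for this example.

```python
import jax
import jax.numpy as jnp


def init_params(key, d_in=3, width=16, d_out=2):
    """Random weights for a toy two-layer MLP (hypothetical example model)."""
    k1, k2 = jax.random.split(key)
    w1 = jax.random.normal(k1, (d_in, width)) / jnp.sqrt(d_in)
    w2 = jax.random.normal(k2, (width, d_out)) / jnp.sqrt(width)
    return (w1, w2)


def f(params, x):
    """Network output for one input x of shape (d_in,); returns shape (d_out,)."""
    w1, w2 = params
    return jnp.tanh(x @ w1) @ w2


def naive_ntk(f, params, x1, x2):
    """Theta(x1, x2) = J(x1) J(x2)^T, formed by materializing both Jacobians."""
    # Each Jacobian is a pytree mirroring params; a leaf has shape
    # (d_out, *param_leaf.shape).
    j1 = jax.jacobian(f)(params, x1)
    j2 = jax.jacobian(f)(params, x2)

    def contract(a, b):
        # Flatten all parameter axes, then contract over them.
        a = a.reshape(a.shape[0], -1)
        b = b.reshape(b.shape[0], -1)
        return a @ b.T

    per_leaf = jax.tree_util.tree_map(contract, j1, j2)
    # Summing over parameter tensors gives the full (d_out, d_out) kernel block.
    return sum(jax.tree_util.tree_leaves(per_leaf))


params = init_params(jax.random.PRNGKey(0))
x1 = jnp.array([1.0, 0.5, -0.3])
x2 = jnp.array([-0.2, 0.8, 0.1])

theta_12 = naive_ntk(f, params, x1, x2)
theta_21 = naive_ntk(f, params, x2, x1)
theta_11 = naive_ntk(f, params, x1, x1)
```

For a network with O outputs and P parameters, this approach stores two O × P Jacobians per input pair. That storage and the subsequent contraction are the compute and memory costs the abstract describes as limiting practical use.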





