Properties of the After Kernel

05/21/2021
by Philip M. Long, et al.

The Neural Tangent Kernel (NTK) is the wide-network limit of a kernel defined using neural networks at initialization, whose embedding maps an input to the gradient of the network's output with respect to its parameters. We study the "after kernel", which is defined using the same embedding, except computed after training. We do this for neural networks with standard architectures, trained with SGD in a standard way on binary classification problems extracted from MNIST and CIFAR-10. For some dataset-architecture pairs, after a few epochs of neural network training, a hard-margin SVM using the network's after kernel is much more accurate than one using the network's initial kernel. For networks with an architecture similar to VGG, the after kernel is more "global", in the sense that it is less invariant to transformations of input images that disrupt the global structure of the image while leaving the local statistics largely intact. For fully connected networks, the after kernel is less global in this sense. The after kernel tends to be more invariant to small shifts, rotations, and zooms; data augmentation does not improve these invariances. The (finite approximation to the) conjugate kernel, obtained using the last layer of hidden nodes, sometimes, but not always, provides a good approximation to the NTK and the after kernel.
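
As a concrete illustration of the kernels described above, the sketch below computes the empirical gradient-embedding Gram matrix for a small fully connected network in JAX and feeds it to a kernel SVM via scikit-learn's precomputed-kernel interface. Evaluated at initialization this matrix is the finite-width NTK; evaluated at SGD-trained parameters it would be the after kernel. The architecture, the random data, the use of scikit-learn, and a very large C as a stand-in for a hard margin are illustrative assumptions, not details taken from the paper.

    # Minimal sketch (not the paper's code): empirical gradient-embedding kernel
    # of a small fully connected network, plus an (approximately) hard-margin SVM.
    import jax
    import jax.numpy as jnp
    import numpy as np
    from sklearn.svm import SVC

    def init_params(key, sizes=(784, 256, 256, 1)):
        """Random Gaussian init for a small fully connected network (illustrative sizes)."""
        params = []
        for din, dout in zip(sizes[:-1], sizes[1:]):
            key, sub = jax.random.split(key)
            w = jax.random.normal(sub, (din, dout)) / jnp.sqrt(din)
            params.append((w, jnp.zeros(dout)))
        return params

    def net(params, x):
        """Scalar output f(x; theta) for a single input x."""
        for w, b in params[:-1]:
            x = jax.nn.relu(x @ w + b)
        w, b = params[-1]
        return (x @ w + b)[0]

    def grad_features(params, xs):
        """Rows are the flattened gradients d f(x; theta) / d theta, one row per input."""
        per_example_grad = jax.vmap(jax.grad(net), in_axes=(None, 0))
        grads = per_example_grad(params, xs)  # pytree of per-example gradients
        leaves = jax.tree_util.tree_leaves(grads)
        return jnp.concatenate([g.reshape(xs.shape[0], -1) for g in leaves], axis=1)

    def kernel_matrix(params, xs_a, xs_b):
        """Gram matrix of inner products of gradient embeddings."""
        return grad_features(params, xs_a) @ grad_features(params, xs_b).T

    if __name__ == "__main__":
        key = jax.random.PRNGKey(0)
        np.random.seed(0)
        # At these randomly initialized parameters the Gram matrix is the finite-width NTK;
        # after training `params` with SGD in the usual way, the same code gives the after kernel.
        params = init_params(key)
        x_train = jax.random.normal(key, (64, 784))       # placeholder data
        y_train = np.where(np.random.randn(64) > 0, 1, -1)

        K_train = np.array(kernel_matrix(params, x_train, x_train))
        svm = SVC(kernel="precomputed", C=1e6)            # large C approximates a hard margin
        svm.fit(K_train, y_train)

For test points, the same kernel_matrix function would be evaluated between test and training inputs and passed to svm.predict.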


