Kernel Regression with Infinite-Width Neural Networks on Millions of Examples

03/09/2023
by   Ben Adlam, et al.

Neural kernels have drastically increased performance on diverse and nonstandard data modalities but require significantly more compute, which previously limited their application to smaller datasets. In this work, we address this by massively parallelizing their computation across many GPUs. We combine this with a distributed, preconditioned conjugate gradients algorithm to enable kernel regression at large scale (i.e., up to five million examples). Using this approach, we study scaling laws of several neural kernels across many orders of magnitude for the CIFAR-5m dataset. Using data augmentation to expand the original CIFAR-10 training dataset by a factor of 20, we obtain a test accuracy of 91.2% (SotA for a pure kernel method). Moreover, we explore neural kernels on other data modalities, obtaining results on protein and small-molecule prediction tasks that are competitive with SotA methods.
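To illustrate the kind of solver the abstract refers to, the sketch below shows kernel ridge regression solved with preconditioned conjugate gradients on a single machine. It is a minimal sketch under stated assumptions, not the paper's distributed implementation: the RBF kernel, the Jacobi (diagonal) preconditioner, and all function and parameter names are illustrative choices rather than details taken from the paper.

```python
# Minimal sketch (not the paper's distributed solver): kernel ridge regression
# solved with preconditioned conjugate gradients (PCG). The RBF kernel, the
# Jacobi preconditioner, and all names/parameters are illustrative assumptions.
import numpy as np

def rbf_kernel(X1, X2, length_scale=1.0):
    # Pairwise squared distances -> RBF (Gaussian) kernel matrix.
    sq = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    return np.exp(-0.5 * sq / length_scale**2)

def pcg_solve(matvec, b, precond, tol=1e-6, max_iter=1000):
    # Preconditioned conjugate gradients for an S.P.D. system A x = b, where A
    # is accessed only through `matvec`, so the full kernel matrix never has to
    # be materialized in one place (it could be computed in shards).
    x = np.zeros_like(b)
    r = b - matvec(x)
    z = precond(r)
    p = z.copy()
    rz = np.sum(r * z)
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rz / np.sum(p * Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break
        z = precond(r)
        rz_new = np.sum(r * z)
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

def kernel_regression(X_train, y_train, X_test, ridge=1e-3):
    # Solve (K + ridge * I) alpha = y with PCG, then predict K_* @ alpha.
    K = rbf_kernel(X_train, X_train)
    A = K + ridge * np.eye(len(X_train))
    diag = np.diag(A)
    alpha = pcg_solve(
        matvec=lambda v: A @ v,
        b=y_train.astype(float),
        precond=lambda r: r / diag,  # Jacobi preconditioner (assumption)
    )
    return rbf_kernel(X_test, X_train) @ alpha
```

In the setting the abstract describes, the matrix-vector product would instead apply shards of a neural kernel computed in parallel across many GPUs, and the preconditioner would likewise be chosen for the distributed setting; the structure of the PCG loop itself stays the same.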



research · 01/26/2023
A Simple Algorithm For Scaling Up Kernel Methods
The recent discovery of the equivalence between infinitely wide neural n...

research · 10/30/2020
Dataset Meta-Learning from Kernel Ridge-Regression
One of the most fundamental aspects of any machine learning algorithm is...

research · 06/25/2022
A Fast, Well-Founded Approximation to the Empirical Neural Tangent Kernel
Empirical neural tangent kernels (eNTKs) can provide a good understandin...

research · 06/15/2021
Scaling Neural Tangent Kernels via Sketching and Random Features
The Neural Tangent Kernel (NTK) characterizes the behavior of infinitely...

research · 07/31/2020
Finite Versus Infinite Neural Networks: an Empirical Study
We perform a careful, thorough, and large scale empirical study of the c...

research · 10/21/2022
Efficient Dataset Distillation Using Random Feature Approximation
Dataset distillation compresses large datasets into smaller synthetic co...

research · 04/26/2019
On Exact Computation with an Infinitely Wide Neural Net
How well does a classic deep net architecture like AlexNet or VGG19 clas...
