Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels

03/01/2021
by Eran Malach, et al.

We study the relative power of learning with gradient descent on differentiable models, such as neural networks, versus using the corresponding tangent kernels. We show that under certain conditions, gradient descent achieves small error only if a related tangent kernel method achieves a non-trivial advantage over random guessing (a.k.a. weak learning), though this advantage might be very small even when gradient descent can achieve arbitrarily high accuracy. Complementing this, we show that without these conditions, gradient descent can in fact learn with small error even when no kernel method, in particular using the tangent kernel, can achieve a non-trivial advantage over random guessing.
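
To make the abstract's objects concrete: for a differentiable model f(θ, x), the tangent kernel at parameters θ is K(x, x') = ⟨∇_θ f(θ, x), ∇_θ f(θ, x')⟩, and the corresponding kernel method fits a predictor using K at initialization, while differentiable learning runs gradient descent on θ itself. The sketch below (in JAX) contrasts the two on a toy problem; the one-hidden-layer architecture, data, and hyperparameters are illustrative assumptions, not the paper's construction.

# Minimal sketch (illustrative assumptions, not the paper's construction):
# compare gradient descent on a small differentiable model with kernel
# regression using the model's tangent kernel at initialization.
import jax
import jax.numpy as jnp

def init_params(key, d_in=2, width=64):
    k1, k2 = jax.random.split(key)
    return {
        "W1": jax.random.normal(k1, (d_in, width)) / jnp.sqrt(d_in),
        "w2": jax.random.normal(k2, (width,)) / jnp.sqrt(width),
    }

def model(params, x):
    # One-hidden-layer network with a smooth activation, scalar output.
    return jnp.tanh(x @ params["W1"]) @ params["w2"]

def tangent_kernel(params, X1, X2):
    # K(x, x') = <grad_theta f(x), grad_theta f(x')> at the given parameters.
    grad_at = lambda x: jax.grad(lambda p: model(p, x))(params)
    flatten = lambda g: jnp.concatenate(
        [v.ravel() for v in jax.tree_util.tree_leaves(g)])
    G1 = jnp.stack([flatten(grad_at(x)) for x in X1])
    G2 = jnp.stack([flatten(grad_at(x)) for x in X2])
    return G1 @ G2.T

params = init_params(jax.random.PRNGKey(0))
X = jax.random.normal(jax.random.PRNGKey(1), (32, 2))
y = jnp.sign(X[:, 0] * X[:, 1])  # toy +/-1 target

# Kernel method: ridge regression with the tangent kernel at initialization.
K = tangent_kernel(params, X, X)
alpha = jnp.linalg.solve(K + 1e-3 * jnp.eye(len(X)), y)
kernel_preds = K @ alpha

# Differentiable learning: plain gradient descent on the model parameters.
loss = lambda p: jnp.mean((jax.vmap(lambda x: model(p, x))(X) - y) ** 2)
for _ in range(200):
    grads = jax.grad(loss)(params)
    params = jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, params, grads)

gd_preds = jax.vmap(lambda x: model(params, x))(X)
print("tangent-kernel train accuracy:", jnp.mean(jnp.sign(kernel_preds) == y))
print("gradient-descent train accuracy:", jnp.mean(jnp.sign(gd_preds) == y))

On a toy problem of this size both methods typically fit the training set; the paper's results concern how far the two can diverge in general, which this sketch only illustrates structurally.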

Related research

09/26/2019 · Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks
Recent work has revealed that overparameterized networks trained by grad...

12/13/2018 · Tight Analyses for Non-Smooth Stochastic Gradient Descent
Consider the problem of minimizing functions that are Lipschitz and stro...

12/03/2017 · Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima
We consider the problem of learning a one-hidden-layer neural network wi...

02/22/2021 · Kernel quadrature by applying a point-wise gradient descent method to discrete energies
We propose a method for generating nodes for kernel quadrature by a poin...

03/30/2017 · Diving into the shallows: a computational perspective on large-scale shallow learning
In this paper we first identify a basic limitation in gradient descent-b...

11/30/2020 · Every Model Learned by Gradient Descent Is Approximately a Kernel Machine
Deep learning's successes are often attributed to its ability to automat...

04/01/2019 · Benchmarking Approximate Inference Methods for Neural Structured Prediction
Exact structured inference with neural network scoring functions is comp...