Classifying high-dimensional Gaussian mixtures: Where kernel methods fail and neural networks succeed

02/23/2021
by Maria Refinetti et al.

A recent series of theoretical works showed that the dynamics of neural networks with a certain initialisation are well captured by kernel methods. Concurrent empirical work demonstrated that kernel methods can come close to the performance of neural networks on some image classification tasks. These results raise the question of whether neural networks only learn successfully if kernels also learn successfully, despite neural networks being more expressive. Here, we show theoretically that two-layer neural networks (2LNN) with only a few hidden neurons can beat the performance of kernel learning on a simple Gaussian mixture classification task. We study the high-dimensional limit where the number of samples is proportional to the input dimension, and show that while small 2LNN achieve near-optimal performance on this task, lazy-training approaches such as random features and kernel methods do not. Our analysis is based on the derivation of a closed set of equations that track the learning dynamics of the 2LNN and thus allow us to extract the asymptotic performance of the network as a function of the signal-to-noise ratio and other hyperparameters. Finally, we illustrate how over-parametrising the neural network leads to faster convergence, but does not improve its final performance.
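
As a rough numerical companion to the abstract, here is a minimal sketch (not the authors' code) contrasting a small two-layer network trained with plain SGD against a random-features ridge baseline, one instance of the lazy-training approaches mentioned above, on an XOR-like Gaussian mixture. All hyperparameters below (d, n, the noise level, the widths K and P, the learning rate) are illustrative assumptions rather than values from the paper.

```python
# Minimal sketch: a 4-neuron two-layer network (2LNN) vs. random-features
# ridge regression on an XOR-like Gaussian mixture classification task.
# Hyperparameters are illustrative assumptions, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
d, n, n_test, noise = 200, 2000, 2000, 0.1

def sample(m):
    """Four Gaussian clusters centred at (+-1, +-1, 0, ..., 0)/sqrt(2);
    the label is the XOR (product) of the signs of the first two coords."""
    signs = rng.choice([-1.0, 1.0], size=(m, 2))
    means = np.zeros((m, d))
    means[:, :2] = signs / np.sqrt(2)
    x = means + noise * rng.standard_normal((m, d))
    return x, signs[:, 0] * signs[:, 1]

X, y = sample(n)
Xt, yt = sample(n_test)

# --- Small two-layer ReLU network trained with plain SGD on squared loss ---
K, lr, epochs = 4, 0.1, 100
W = rng.standard_normal((K, d)) / np.sqrt(d)   # trained first layer
v = rng.standard_normal(K) / np.sqrt(K)        # trained second layer

def forward(x, W, v):
    h = np.maximum(x @ W.T, 0.0)               # ReLU hidden activations
    return h, h @ v

for _ in range(epochs):
    for i in rng.permutation(n):
        h, out = forward(X[i:i+1], W, v)
        err = out[0] - y[i]                    # d/d(out) of 0.5*(out - y)^2
        grad_W = np.outer(err * v * (h[0] > 0), X[i])
        v -= lr * err * h[0]
        W -= lr * grad_W

_, out = forward(Xt, W, v)
print("2LNN test error:", np.mean(np.sign(out) != yt))

# --- Random features + ridge: first layer frozen, only the readout is fit ---
P, lam = 2000, 1e-3
F = rng.standard_normal((P, d)) / np.sqrt(d)   # fixed random projection
Phi = np.maximum(X @ F.T, 0.0)
a = np.linalg.solve(Phi.T @ Phi + lam * np.eye(P), Phi.T @ y)
pred = np.sign(np.maximum(Xt @ F.T, 0.0) @ a)
print("RF test error:  ", np.mean(pred != yt))
```

The design difference is the whole point: the random-features model keeps its first layer frozen, which is what "lazy" training amounts to here, while the few trained neurons of the 2LNN can rotate their weights onto the two informative directions of the mixture, something a fixed feature map of modest size is not guaranteed to provide.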

