Double-descent curves in neural networks: a new perspective using Gaussian processes

02/14/2021
by   Ouns El Harzli, et al.

Double-descent curves in neural networks describe the phenomenon that the generalisation error first decreases as the number of parameters grows, then rises after passing an optimal parameter count that is smaller than the number of data points, and finally decreases again in the overparameterised regime. Here we use a neural network Gaussian process (NNGP), which maps exactly to a fully connected network (FCN) in the infinite-width limit, combined with techniques from random matrix theory, to calculate this generalisation behaviour, with a particular focus on the overparameterised regime. We verify our predictions with numerical simulations of the corresponding Gaussian process regressions. An advantage of our NNGP approach is that the analytical calculations are easier to interpret. We argue that neural network generalisation performance improves in the overparameterised regime precisely because that is where networks converge to their equivalent Gaussian process.
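The NNGP correspondence underlying the abstract can be made concrete in a few lines. Below is a minimal sketch (not the authors' code) of Gaussian process regression under the NNGP prior of a single-hidden-layer ReLU FCN, using the arc-cosine kernel of degree 1 (Cho and Saul, 2009) that arises in the infinite-width limit. The hyperparameters sigma_w, sigma_b, and the noise level are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch: GP regression with the infinite-width NNGP kernel of a
# one-hidden-layer ReLU fully connected network. Hyperparameters are
# illustrative assumptions, not the paper's settings.
import numpy as np

def relu_nngp_kernel(X1, X2, sigma_w=1.0, sigma_b=0.1):
    """NNGP kernel for one ReLU hidden layer (arc-cosine kernel, degree 1)."""
    d = X1.shape[1]
    # Input-layer kernel: sigma_b^2 + sigma_w^2 <x, x'> / d
    k11 = sigma_b**2 + sigma_w**2 * np.sum(X1 * X1, axis=1) / d
    k22 = sigma_b**2 + sigma_w**2 * np.sum(X2 * X2, axis=1) / d
    k12 = sigma_b**2 + sigma_w**2 * (X1 @ X2.T) / d
    norm = np.sqrt(np.outer(k11, k22))
    theta = np.arccos(np.clip(k12 / norm, -1.0, 1.0))
    # E[relu(u) relu(v)] = sqrt(k11 k22) (sin t + (pi - t) cos t) / (2 pi)
    J = np.sin(theta) + (np.pi - theta) * np.cos(theta)
    return sigma_b**2 + (sigma_w**2 / (2.0 * np.pi)) * norm * J

def gp_predict(X_train, y_train, X_test, noise=1e-3):
    """Standard GP regression predictive mean under the NNGP prior."""
    K = relu_nngp_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_star = relu_nngp_kernel(X_test, X_train)
    return K_star @ np.linalg.solve(K, y_train)

# Toy check: fit a smooth target from a handful of random points.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 5))
y_train = np.sin(X_train.sum(axis=1))
X_test = rng.normal(size=(10, 5))
print(gp_predict(X_train, y_train, X_test))
```

Sweeping the training-set size (or a kernel regularisation scale) in a setup like this, and tracking the test error of the GP predictive mean, is the kind of Gaussian process regression simulation the abstract says was used to verify the analytical predictions.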



Related research

Random Neural Networks in the Infinite Width Limit as Gaussian Processes (07/04/2021)
This article gives a new proof that fully connected neural networks with...

Infinitely Wide Tensor Networks as Gaussian Process (01/07/2021)
Gaussian Process is a non-parametric prior which can be understood as a ...

Finite size corrections for neural network Gaussian processes (08/27/2019)
There has been a recent surge of interest in modeling neural networks (N...

Replica theory for learning curves for Gaussian processes on random graphs (02/27/2012)
Statistical physics approaches can be used to derive accurate prediction...

A self consistent theory of Gaussian Processes captures feature learning effects in finite CNNs (06/08/2021)
Deep neural networks (DNNs) in the infinite width/channel limit have rec...

Learning curves for Gaussian process regression with power-law priors and targets (10/23/2021)
We study the power-law asymptotics of learning curves for Gaussian proce...

Neural Networks and Quantum Field Theory (08/19/2020)
We propose a theoretical understanding of neural networks in terms of Wi...