Double-descent curves in neural networks: a new perspective using Gaussian processes

02/14/2021
by Ouns El Harzli, et al.

Double-descent curves in neural networks describe the phenomenon that the generalisation error initially descends with an increasing number of parameters, then grows after reaching an optimal number of parameters which is less than the number of data points, but then descends again in the overparameterised regime. Here we use a neural network Gaussian process (NNGP), which maps exactly to a fully connected network (FCN) in the infinite-width limit, combined with techniques from random matrix theory, to calculate this generalisation behaviour, with a particular focus on the overparameterised regime. We verify our predictions with numerical simulations of the corresponding Gaussian process regressions. An advantage of our NNGP approach is that the analytical calculations are easier to interpret. We argue that neural network generalisation performance improves in the overparameterised regime precisely because that is where the networks converge to their equivalent Gaussian process.
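To illustrate the kind of numerical check the abstract mentions, the sketch below runs an exact Gaussian process regression with an NNGP kernel on synthetic data. It assumes the standard one-hidden-layer ReLU arc-cosine kernel (Cho and Saul, 2009) with hypothetical weight/bias variances and toy data; the paper's exact kernel, datasets, and hyperparameters may differ.

```python
import numpy as np

def relu_nngp_kernel(X1, X2, sigma_w=1.0, sigma_b=0.1):
    """NNGP kernel of a one-hidden-layer ReLU FCN in the infinite-width limit
    (arc-cosine kernel of degree 1)."""
    d = X1.shape[1]
    # Input-layer (linear) kernel entries
    K12 = sigma_b**2 + sigma_w**2 * X1 @ X2.T / d
    K11 = sigma_b**2 + sigma_w**2 * np.sum(X1**2, axis=1) / d
    K22 = sigma_b**2 + sigma_w**2 * np.sum(X2**2, axis=1) / d
    norm = np.sqrt(np.outer(K11, K22))
    theta = np.arccos(np.clip(K12 / norm, -1.0, 1.0))
    # ReLU angular integral: E[phi(f(x)) phi(f(x'))]
    J = (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)
    return sigma_b**2 + sigma_w**2 * norm * J

def nngp_regression(X_train, y_train, X_test, noise=1e-3):
    """Posterior mean of exact GP regression under the NNGP prior."""
    K = relu_nngp_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_star = relu_nngp_kernel(X_test, X_train)
    return K_star @ np.linalg.solve(K, y_train)

# Toy estimate of the generalisation (test) error of the NNGP regression
rng = np.random.default_rng(0)
d, n_train, n_test = 10, 50, 200
X_train = rng.standard_normal((n_train, d))
X_test = rng.standard_normal((n_test, d))
w_true = rng.standard_normal(d)
y_train, y_test = X_train @ w_true, X_test @ w_true
pred = nngp_regression(X_train, y_train, X_test)
print("test MSE:", np.mean((pred - y_test) ** 2))
```

In this picture, the test error of the GP regression plays the role of the infinite-width limit of the FCN's generalisation error, which is the quantity the paper analyses with random matrix theory.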


Related research

07/04/2021
Random Neural Networks in the Infinite Width Limit as Gaussian Processes
This article gives a new proof that fully connected neural networks with...

01/07/2021
Infinitely Wide Tensor Networks as Gaussian Process
Gaussian Process is a non-parametric prior which can be understood as a ...

05/17/2022
Deep neural networks with dependent weights: Gaussian Process mixture limit, heavy tails, sparsity and compressibility
This article studies the infinite-width limit of deep feedforward neural...

08/27/2019
Finite size corrections for neural network Gaussian processes
There has been a recent surge of interest in modeling neural networks (N...

06/12/2019
Learning Curves for Deep Neural Networks: A Gaussian Field Theory Perspective
A series of recent works suggest that deep neural networks (DNNs), of fi...

03/14/2022
Quantitative Gaussian Approximation of Randomly Initialized Deep Neural Networks
Given any deep fully connected neural network, initialized with random G...

06/08/2021
A self consistent theory of Gaussian Processes captures feature learning effects in finite CNNs
Deep neural networks (DNNs) in the infinite width/channel limit have rec...
