Learning Curves for Deep Neural Networks: A Gaussian Field Theory Perspective

06/12/2019
by Omry Cohen, et al.

A series of recent works suggests that deep neural networks (DNNs) of fixed depth are equivalent to certain Gaussian Processes (NNGP/NTK) in the highly over-parameterized regime (width or number of channels going to infinity). Other works suggest that this limit is relevant for real-world DNNs. These results invite further study of the generalization properties of Gaussian Processes of the NNGP and NTK type. Here we make several contributions along this line. First, we develop a formalism, based on field-theory tools, for calculating learning curves perturbatively in one over the dataset size. For NNGPs, this formalism naturally extends to finite-width corrections. Second, in cases where the covariance function of the NNGP/NTK can be diagonalized, we provide analytic expressions for the asymptotic learning curves of any given target function; these go beyond the standard equivalence-kernel results. Last, we provide closed analytic expressions for the eigenvalues of the NNGP/NTK kernels of depth-2 fully-connected ReLU networks. For datasets on the hypersphere, the eigenfunctions of such kernels, at any depth, are hyperspherical harmonics. A simple coherent picture emerges wherein fully-connected DNNs have a strong entropic bias toward functions that are low-order polynomials of the input.
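As a rough illustration of the last two points, the sketch below builds the depth-2 ReLU NNGP kernel from the order-1 arc-cosine formula of Cho & Saul (2009), numerically estimates its spectrum on the hypersphere, and states the standard equivalent-kernel heuristic that the paper's 1/(dataset size) expansion goes beyond. It is a minimal sketch, not the paper's code: it assumes unit-norm inputs, He-initialized weights with no biases, and a uniform measure on the sphere, and the names relu_nngp_kernel and ek_bias are chosen here for illustration.

```python
import numpy as np

def relu_nngp_kernel(X, Y):
    """Depth-2 (single hidden layer) ReLU NNGP kernel for unit-norm inputs.

    Order-1 arc-cosine kernel (Cho & Saul, 2009):
        k(x, y) = (1 / pi) * (sin t + (pi - t) * cos t),  cos t = x . y,
    which arises for He-initialized weights (variance 2 / fan-in), no biases.
    """
    cos_t = np.clip(X @ Y.T, -1.0, 1.0)
    t = np.arccos(cos_t)
    return (np.sin(t) + (np.pi - t) * cos_t) / np.pi

rng = np.random.default_rng(0)
d, n = 3, 2000
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)      # uniform points on S^{d-1}

# Monte-Carlo estimate of the kernel operator's spectrum: eigenvalues of the
# Gram matrix divided by n.  On the sphere the eigenfunctions are
# hyperspherical harmonics; for d = 3 the degree-l eigenvalue appears with
# multiplicity 2l + 1, and its rapid decay with l is the "entropic bias
# toward low-order polynomials" described in the abstract.
lam = np.linalg.eigvalsh(relu_nngp_kernel(X, X))[::-1] / n
print("leading eigenvalues:", np.round(lam[:9], 4))

# Equivalent-kernel heuristic for the bias part of the learning curve of a
# target with coefficients c_k in the kernel eigenbasis and noise level
# sigma2 (the paper's perturbative expansion refines this estimate):
def ek_bias(n_train, lam, c, sigma2):
    return np.sum((sigma2 / (n_train * lam + sigma2)) ** 2 * c ** 2)
```

Under this heuristic, each eigenmode is learned once the dataset size exceeds roughly sigma2 / lambda_k, so modes with small eigenvalues (high-degree harmonics) require far more data, consistent with the low-order-polynomial bias the abstract describes.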


Related research

12/08/2020 - Analyzing Finite Neural Networks: Can We Trust Neural Tangent Kernel Theory?
Neural Tangent Kernel (NTK) theory is widely used to study the dynamics ...

07/04/2021 - Random Neural Networks in the Infinite Width Limit as Gaussian Processes
This article gives a new proof that fully connected neural networks with...

06/08/2021 - A self consistent theory of Gaussian Processes captures feature learning effects in finite CNNs
Deep neural networks (DNNs) in the infinite width/channel limit have rec...

01/28/2022 - Interplay between depth of neural networks and locality of target functions
It has been recognized that heavily overparameterized deep neural networ...

02/14/2021 - Double-descent curves in neural networks: a new perspective using Gaussian processes
Double-descent curves in neural networks describe the phenomenon that th...

07/29/2021 - Deep Networks Provably Classify Data on Curves
Data with low-dimensional nonlinear structure are ubiquitous in engineer...

07/18/2023 - Optimistic Estimate Uncovers the Potential of Nonlinear Models
We propose an optimistic estimate to evaluate the best possible fitting ...
