
Predicting the outputs of finite networks trained with noisy gradients
A recent line of studies has focused on the infinite width limit of deep...

Learning Curves for Deep Neural Networks: A Gaussian Field Theory Perspective
A series of recent works suggest that deep neural networks (DNNs), of fi...

Feature Learning in Infinite-Width Neural Networks
As its width tends to infinity, a deep neural network's behavior under g...

Double-descent curves in neural networks: a new perspective using Gaussian processes
Double-descent curves in neural networks describe the phenomenon that th...

Finite size corrections for neural network Gaussian processes
There has been a recent surge of interest in modeling neural networks (N...

Implicit Acceleration and Feature Learning in Infinitely Wide Neural Networks with Bottlenecks
We analyze the learning dynamics of infinitely wide neural networks with...

DeepKriging: Spatially Dependent Deep Neural Networks for Spatial Prediction
In spatial statistics, a common objective is to predict the values of a ...
A self consistent theory of Gaussian Processes captures feature learning effects in finite CNNs
Deep neural networks (DNNs) in the infinite width/channel limit have received much attention recently, as they provide a clear analytical window to deep learning via mappings to Gaussian Processes (GPs). Despite its theoretical appeal, this viewpoint lacks a crucial ingredient of deep learning in finite DNNs, lying at the heart of their success: feature learning. Here we consider DNNs trained with noisy gradient descent on a large training set and derive a self-consistent Gaussian Process theory accounting for strong finite-DNN and feature learning effects. Applying this to a toy model of a two-layer linear convolutional neural network (CNN) shows good agreement with experiments. We further identify, both analytically and numerically, a sharp transition between a feature learning regime and a lazy learning regime in this model. Strong finite-DNN effects are also derived for a nonlinear two-layer fully connected network. Our self-consistent theory provides a rich and versatile analytical framework for studying feature learning and other non-lazy effects in finite DNNs.
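The training protocol the abstract refers to, gradient descent with injected Gaussian noise (Langevin dynamics), is easy to sketch. Below is a minimal, hypothetical illustration on a two-layer *linear fully connected* network fit to a linear teacher; it is not the paper's CNN setup or its theory, and all sizes, step counts, and the temperature T are arbitrary choices for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data from a linear teacher (all sizes are illustrative choices)
n, d = 200, 10
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
y = X @ w_star

# Two-layer linear network f(x) = a @ (W @ x), hidden width h
h = 50
W = rng.standard_normal((h, d)) / np.sqrt(d)
a = rng.standard_normal(h) / np.sqrt(h)

eta, T, steps = 5e-3, 1e-4, 5000     # step size, noise "temperature", iterations
mse0 = np.mean((X @ W.T @ a - y) ** 2)   # loss at initialization

for _ in range(steps):
    err = X @ W.T @ a - y                 # residuals, shape (n,)
    grad_a = (err @ (X @ W.T)) / n        # gradient of (1/2)*MSE w.r.t. a
    grad_W = np.outer(a, err @ X) / n     # gradient of (1/2)*MSE w.r.t. W
    # Langevin step: gradient descent plus Gaussian noise of scale sqrt(2*eta*T)
    a -= eta * grad_a + np.sqrt(2 * eta * T) * rng.standard_normal(h)
    W -= eta * grad_W + np.sqrt(2 * eta * T) * rng.standard_normal((h, d))

mse = np.mean((X @ W.T @ a - y) ** 2)
```

At long times this update samples the parameters from a Gibbs distribution at temperature T rather than converging to a single minimizer, which is what makes the trained network's outputs amenable to a probabilistic (GP-style) description.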