
Can Shallow Neural Networks Beat the Curse of Dimensionality? A mean field training perspective
We prove that the gradient descent training of a two-layer neural networ...

Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime
We analyze the convergence of the averaged stochastic gradient descent f...

Ergodicity of the underdamped mean-field Langevin dynamics
We study the long time behavior of an underdamped mean-field Langevin (M...

Global Convergence of Sobolev Training for Overparametrized Neural Networks
Sobolev loss is used when training a network to approximate the values a...

The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models
A numerical and phenomenological study of the gradient descent (GD) algo...

A priori generalization error for two-layer ReLU neural network through minimum norm solution
We focus on estimating a priori generalization error of two-layer ReLU n...

A Note on the Global Convergence of Multilayer Neural Networks in the Mean Field Regime
In a recent work, we introduced a rigorous framework to describe the mea...
On the Convergence of Gradient Descent Training for Two-layer ReLU-networks in the Mean Field Regime
We describe a necessary and sufficient condition for convergence to the minimum Bayes risk (MBR) when training two-layer ReLU-networks by gradient descent in the mean field regime with an omni-directional initial parameter distribution. This article extends recent results of Chizat and Bach to ReLU-activated networks and to the situation in which no parameters exactly achieve MBR. The condition does not depend on the initialization of the parameters and concerns only the weak convergence of the realization of the neural network, not of its parameter distribution.
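To make the setting concrete, here is a minimal NumPy sketch of the standard mean-field parametrization the abstract refers to: a two-layer ReLU network f(x) = (1/m) Σᵢ aᵢ ReLU(⟨wᵢ, x⟩) trained by full-batch gradient descent on a toy regression target. All sizes, the target, and the learning rate are hypothetical illustration choices, not the paper's construction or its convergence condition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all sizes hypothetical): mean-field parametrization
# f(x) = (1/m) * sum_i a_i * relu(<w_i, x>)
m, d, n = 200, 2, 50                  # neurons, input dim, samples
X = rng.normal(size=(n, d))
y = np.maximum(X[:, 0], 0.0)          # target: a single ReLU ridge function

a = rng.normal(size=m)                # output weights
W = rng.normal(size=(m, d))           # inner weights, spread over all directions

def predict(a, W):
    return np.maximum(X @ W.T, 0.0) @ a / m   # note the 1/m mean-field scaling

mse0 = np.mean((predict(a, W) - y) ** 2)      # training loss at initialization

lr = 0.5
for _ in range(500):
    pre = X @ W.T                     # (n, m) pre-activations
    act = np.maximum(pre, 0.0)
    r = act @ a / m - y               # residuals of the squared loss
    # gradients of (1/2n) * sum_j r_j^2 with respect to a and W
    ga = act.T @ r / (n * m)
    gW = ((r[:, None] * (pre > 0) * a[None, :]).T @ X) / (n * m)
    # mean-field time scaling: the per-particle step is amplified by m
    a -= lr * m * ga
    W -= lr * m * gW

mse = np.mean((predict(a, W) - y) ** 2)       # training loss after GD
```

The 1/m output normalization (rather than the 1/√m NTK scaling) and the matching factor of m in the parameter updates are what place this dynamics in the mean field regime, where the empirical distribution of the neurons (aᵢ, wᵢ) evolves under a Wasserstein gradient flow as m grows.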