
Dynamical mean-field theory for stochastic gradient descent in Gaussian mixture classification
We analyze in a closed form the learning dynamics of stochastic gradient...

On the mean field limit of Random Batch Method for interacting particle systems
The Random Batch Method proposed in our previous work [Jin et al., J. Co...

SGD Distributional Dynamics of Three Layer Neural Networks
With the rise of big data analytics, multilayer neural networks have su...

Mean Field Limit of the Learning Dynamics of Multilayer Neural Networks
Can multilayer neural networks, typically constructed as highly comple...

Mean Field Analysis of Deep Neural Networks
We analyze multilayer neural networks in the asymptotic regime of simul...

A mean-field theory of lazy training in two-layer neural nets: entropic regularization and controlled McKean-Vlasov dynamics
We consider the problem of universal approximation of functions by twol...

Analysis of feature learning in weight-tied autoencoders via the mean field lens
Autoencoders are among the earliest introduced nonlinear models for unsu...
Quantitative Propagation of Chaos for SGD in Wide Neural Networks
In this paper, we investigate the limiting behavior of a continuous-time counterpart of the Stochastic Gradient Descent (SGD) algorithm applied to two-layer overparameterized neural networks, as the number of neurons (i.e., the size of the hidden layer) N → +∞. Following a probabilistic approach, we show 'propagation of chaos' for the particle system defined by this continuous-time dynamics under different scenarios, indicating that the statistical interaction between the particles asymptotically vanishes. In particular, we establish quantitative convergence with respect to N of any particle to a solution of a mean-field McKean-Vlasov equation in the metric space endowed with the Wasserstein distance. In comparison to previous works on the subject, we consider settings in which the sequence of step-sizes in SGD can potentially depend on the number of neurons and the iterations. We then identify two regimes under which different mean-field limits are obtained, one of them corresponding to an implicitly regularized version of the minimization problem at hand. We perform various experiments on real datasets to validate our theoretical results, assessing the existence of these two regimes on classification problems and illustrating our convergence results.