Towards a General Theory of Infinite-Width Limits of Neural Classifiers

03/12/2020
by Eugene A. Golikov

Obtaining theoretical guarantees for neural network training appears to be a hard problem in the general case. Recent research has focused on studying this problem in the limit of infinite width, and two distinct theories have been developed: the mean-field (MF) and kernel limit theories. We propose a general framework that links these seemingly distinct theories. Out of the box, our framework gives rise to a discrete-time MF limit that was not previously explored in the literature. We prove a convergence theorem for it and show that it approximates finite-width nets more faithfully than the NTK limit when learning rates are not very small. Our analysis also suggests that all infinite-width limits of a network with a single hidden layer are covered by either the mean-field or the kernel limit theory. For networks with more than two hidden layers, we show that RMSProp training has a non-trivial MF limit, while GD training does not. Overall, our framework demonstrates that both the MF and NTK limits have considerable limitations in approximating finite-sized neural nets, indicating the need for more accurate infinite-width approximations. Source code to reproduce all reported results is available on GitHub.
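For orientation (a standard sketch of the two limits as they appear in the literature, not notation taken from this paper): both theories start from a one-hidden-layer network of width \(n\),

\[
f(x) = \frac{1}{n^{\alpha}} \sum_{i=1}^{n} a_i \, \sigma(w_i^\top x),
\]

and differ in how the output scaling and learning rate behave as \(n \to \infty\). The kernel (NTK) limit takes \(\alpha = 1/2\) with \(O(1)\) learning rates, under which training dynamics linearize around initialization and are governed by a fixed kernel. The mean-field limit takes \(\alpha = 1\) with learning rates scaled up by a factor of \(n\), under which the hidden units behave as samples from an evolving measure \(\mu_t\):

\[
f(x) \;\to\; \int a \, \sigma(w^\top x) \, d\mu_t(a, w).
\]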


Related research

Dynamically Stable Infinite-Width Limits of Neural Classifiers (06/11/2020)
Recent research has been focused on two different approaches to studying...

A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks (10/28/2022)
To understand the training dynamics of neural networks (NNs), prior stud...

Global convergence of ResNets: From finite to infinite width using linear parameterization (12/10/2021)
Overparametrization is a key factor in the absence of convexity to expla...

Limiting fluctuation and trajectorial stability of multilayer neural networks with mean field training (10/29/2021)
The mean field (MF) theory of multilayer neural networks centers around ...

Meta-Principled Family of Hyperparameter Scaling Strategies (10/10/2022)
In this note, we first derive a one-parameter family of hyperparameter s...

Unified Field Theory for Deep and Recurrent Neural Networks (12/10/2021)
Understanding capabilities and limitations of different network architec...
