Towards a General Theory of Infinite-Width Limits of Neural Classifiers

03/12/2020
by Eugene A. Golikov, et al.

Obtaining theoretical guarantees for neural network training appears to be a hard problem in the general case. Recent research has focused on studying this problem in the limit of infinite width, and two different theories have been developed: mean-field (MF) and kernel limit theories. We propose a general framework that provides a link between these seemingly distinct theories. Out of the box, our framework gives rise to a discrete-time MF limit that was not previously explored in the literature. We prove a convergence theorem for it and show that it provides a more reasonable approximation of finite-width nets than the NTK limit when learning rates are not very small. Our analysis also suggests that all infinite-width limits of a network with a single hidden layer are covered by either the mean-field or the kernel limit theory. We show that for networks with more than two hidden layers RMSProp training has a non-trivial MF limit, but GD training does not have one. Overall, our framework demonstrates that both the MF and NTK limits have considerable limitations in approximating finite-sized neural nets, indicating the need for more accurate infinite-width approximations of them. Source code to reproduce all the reported results is available on GitHub.
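For context, a minimal sketch of how the two regimes are commonly distinguished (this is standard background, not the paper's own notation): for a one-hidden-layer network of width m, the kernel (NTK) and mean-field limits are usually associated with different output scalings,

\[
  f_{\mathrm{NTK}}(x) = \frac{1}{\sqrt{m}} \sum_{i=1}^{m} a_i\, \sigma(w_i^\top x),
  \qquad
  f_{\mathrm{MF}}(x) = \frac{1}{m} \sum_{i=1}^{m} a_i\, \sigma(w_i^\top x).
\]

Roughly speaking, under the 1/\sqrt{m} scaling with sufficiently small learning rates the trained network stays close to its linearization around initialization (the kernel regime), whereas the 1/m scaling yields a limit described by the evolution of the empirical distribution of the hidden-unit weights (the mean-field regime).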
