Analyzing Finite Neural Networks: Can We Trust Neural Tangent Kernel Theory?

12/08/2020
by Mariia Seleznova, et al.

Neural Tangent Kernel (NTK) theory is widely used to study the dynamics of infinitely-wide deep neural networks (DNNs) under gradient descent. But do the results for infinitely-wide networks give us hints about the behaviour of real finite-width ones? In this paper we study empirically when NTK theory is valid in practice for fully-connected ReLU and sigmoid networks. We find that whether a network is in the NTK regime depends on the hyperparameters of random initialization and on the network's depth. In particular, NTK theory does not explain the behaviour of sufficiently deep networks initialized so that their gradients explode: the kernel is random at initialization and changes significantly during training, contrary to NTK theory. On the other hand, in the case of vanishing gradients, DNNs are in the NTK regime but rapidly become untrainable with depth. We also describe a framework for studying the generalization properties of DNNs by means of NTK theory and discuss its limits.
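
As a concrete illustration of the kind of check the abstract describes, the sketch below (not the authors' code; the architecture, toy data and learning rate are illustrative assumptions) computes the empirical NTK of a finite-width fully-connected ReLU network, Theta(x, x') = <grad_theta f(x), grad_theta f(x')>, at initialization and again after training. In the NTK regime the relative change of the kernel should be small.

import torch
import torch.nn as nn

torch.manual_seed(0)

def make_mlp(width=256, depth=5, sigma_w=2.0):
    # Fully-connected ReLU net. sigma_w is the weight-variance gain:
    # Var(W) = sigma_w / fan_in. For ReLU, sigma_w = 2 (He init) sits at
    # the edge of chaos; larger values give exploding gradients, smaller
    # ones vanishing gradients.
    layers, d_in = [], 1
    for _ in range(depth - 1):
        lin = nn.Linear(d_in, width)
        nn.init.normal_(lin.weight, std=(sigma_w / d_in) ** 0.5)
        nn.init.zeros_(lin.bias)
        layers += [lin, nn.ReLU()]
        d_in = width
    head = nn.Linear(d_in, 1)
    nn.init.normal_(head.weight, std=(sigma_w / d_in) ** 0.5)
    nn.init.zeros_(head.bias)
    return nn.Sequential(*layers, head)

def empirical_ntk(model, xs):
    # Gram matrix of parameter gradients: Theta[i, j] = <J_i, J_j>,
    # where J_i = grad_theta f(x_i).
    rows = []
    for x in xs:
        model.zero_grad()
        model(x.unsqueeze(0)).sum().backward()
        rows.append(torch.cat([p.grad.flatten() for p in model.parameters()]))
    J = torch.stack(rows)
    return J @ J.T

xs = torch.linspace(-1.0, 1.0, 8).unsqueeze(1)  # toy 1-d inputs
ys = torch.sin(3.0 * xs)                        # toy regression targets

model = make_mlp()
ntk_init = empirical_ntk(model, xs)

opt = torch.optim.SGD(model.parameters(), lr=1e-3)
for _ in range(500):
    opt.zero_grad()
    nn.functional.mse_loss(model(xs), ys).backward()
    opt.step()

ntk_final = empirical_ntk(model, xs)
change = (ntk_final - ntk_init).norm() / ntk_init.norm()
print(f"relative kernel change during training: {change:.3f}")

With the shallow, edge-of-chaos initialization above, one would expect the relative change to shrink as the width grows; re-running the same script with, say, sigma_w=4.0 and depth=20 (the exploding-gradient phase studied in the paper) should produce a much larger change, consistent with the paper's finding that the kernel is random at initialization and moves significantly during training in that regime.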


