
Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? – A Neural Tangent Kernel Perspective

02/14/2020
by Kaixuan Huang et al.

Deep residual networks (ResNets) have demonstrated better generalization performance than deep feedforward networks (FFNets). However, the theory behind this phenomenon remains largely unknown. This paper studies the fundamental problem from a so-called "neural tangent kernel" (NTK) perspective. Specifically, we first show that, under proper conditions, as the width goes to infinity, training a deep ResNet can be viewed as learning a function in the reproducing kernel Hilbert space (RKHS) induced by its neural tangent kernel. We then compare the kernel of deep ResNets with that of deep FFNets and discover that the class of functions induced by the FFNet kernel becomes asymptotically unlearnable as the depth goes to infinity, whereas the class induced by the ResNet kernel exhibits no such degeneracy. This discovery partially justifies the generalization advantage of deep ResNets over deep FFNets. Numerical results are provided to support our claims.
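To make the degeneracy concrete, here is a minimal numerical sketch of the infinite-width kernel recursions, assuming unit-norm inputs and He-scaled ReLU layers. The feedforward part follows the standard NTK recursion of Jacot et al. (2018); the residual part uses a simplified block x -> x + alpha * g(x) with freshly initialized weights and tracks only the covariance (NNGP) correlation, so it illustrates the degenerate-versus-non-degenerate contrast rather than reproducing the paper's exact parameterization. All function names here are illustrative.

```python
import numpy as np

def relu_corr(rho):
    """Post-activation correlation of a He-scaled ReLU layer: for standard
    Gaussian (u, v) with correlation rho and t = arccos(rho),
    E[2*relu(u)*relu(v)] = (sin t + (pi - t) cos t) / pi."""
    t = np.arccos(np.clip(rho, -1.0, 1.0))
    return (np.sin(t) + (np.pi - t) * np.cos(t)) / np.pi

def relu_deriv_corr(rho):
    """Correlation of the ReLU derivatives: E[2*1{u>0}*1{v>0}] = (pi - t)/pi."""
    t = np.arccos(np.clip(rho, -1.0, 1.0))
    return (np.pi - t) / np.pi

def ffnet_ntk(rho0, depth):
    """Normalized NTK Theta^L(x, x') / Theta^L(x, x) of a depth-L ReLU
    feedforward net on unit-norm inputs with <x, x'> = rho0, via the
    recursion Theta^h = Sigma^h + Theta^{h-1} * Sigma_dot^h
    (Jacot et al., 2018)."""
    sigma, theta = rho0, rho0            # Sigma^0 and Theta^0
    for _ in range(depth):
        theta = relu_corr(sigma) + theta * relu_deriv_corr(sigma)
        sigma = relu_corr(sigma)
    return theta / (depth + 1)           # the diagonal equals depth + 1

def resnet_corr(rho0, depth, alpha):
    """Correlation after `depth` simplified residual blocks x -> x + alpha*g(x),
    with g a freshly initialized ReLU layer (an illustrative assumption, not
    the paper's exact architecture). Diagonal and cross terms both grow by a
    factor (1 + alpha^2), which gives the normalized update below."""
    rho = rho0
    for _ in range(depth):
        rho = (rho + alpha**2 * relu_corr(rho)) / (1 + alpha**2)
    return rho

for depth in (5, 50, 500):
    angles = (-0.5, 0.0, 0.5)
    ff = [ffnet_ntk(r, depth) for r in angles]
    res = [resnet_corr(r, depth, alpha=1.0 / np.sqrt(depth)) for r in angles]
    print(f"L={depth:3d}  FFNet NTK: {np.round(ff, 3)}  ResNet corr: {np.round(res, 3)}")
```

As depth grows, the FFNet column collapses: the normalized NTK between any two distinct inputs drifts toward one common constant (roughly 1/4 in this setting), so the kernel matrix approaches a constant-plus-identity structure that carries no information about the inputs. With the residual scaling alpha = 1/sqrt(L), the correlations stay clearly separated across input angles, mirroring the non-degeneracy claimed for ResNet kernels.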


Related research

- 08/29/2022 · Neural Tangent Kernel: A Survey
- 09/21/2020 · Kernel-Based Smoothness Analysis of Residual Networks
- 01/28/2020 · Residual Tangent Kernels
- 03/09/2021 · On the Generalization Power of Overfitted Two-Layer Neural Tangent Kernel Models
- 09/22/2022 · Vanilla feedforward neural networks as a discretization of dynamic systems
- 10/08/2021 · Neural Tangent Kernel Eigenvalues Accurately Predict Generalization
- 10/17/2019 · Why bigger is not always better: on finite and infinite neural networks