Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? – A Neural Tangent Kernel Perspective

02/14/2020
by Kaixuan Huang et al.

Deep residual networks (ResNets) have demonstrated better generalization performance than deep feedforward networks (FFNets). However, the theory behind this phenomenon is still largely unknown. This paper studies this fundamental problem in deep learning from a so-called "neural tangent kernel" (NTK) perspective. Specifically, we first show that under proper conditions, as the width goes to infinity, training deep ResNets can be viewed as learning a function in the reproducing kernel Hilbert space induced by a certain kernel function. We then compare the kernel of deep ResNets with that of deep FFNets and discover that the class of functions induced by the kernel of FFNets is asymptotically not learnable as the depth goes to infinity. In contrast, the class of functions induced by the kernel of ResNets does not exhibit such degeneracy. Our discovery partially justifies the advantage of deep ResNets over deep FFNets in generalization ability. Numerical results are provided to support our claim.
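The degeneracy of the FFNet kernel can be seen numerically with a short sketch of the standard NTK recursion for a ReLU feedforward network (the arc-cosine kernel recursion of Jacot et al., 2018; this is an illustration of the general NTK framework, not the exact construction used in this paper). For unit-norm inputs, the normalized NTK at large depth converges to nearly the same value regardless of the angle between the inputs, so the kernel can no longer distinguish distinct inputs:

```python
import numpy as np

def ntk_relu(rho0, depth):
    """Normalized NTK Theta(x, x') / Theta(x, x) for a depth-`depth` ReLU
    feedforward net on unit-norm inputs with inner product rho0.
    Uses the infinite-width arc-cosine kernel recursion (Jacot et al., 2018)."""
    sigma = rho0      # NNGP kernel Sigma^(h)(x, x'); the diagonal stays 1
    theta = rho0      # NTK Theta^(h)(x, x'); the diagonal equals h
    for _ in range(1, depth):
        angle = np.arccos(np.clip(sigma, -1.0, 1.0))
        sigma_dot = (np.pi - angle) / np.pi              # 2 * E[relu'(u) relu'(v)]
        sigma = (np.sin(angle) + (np.pi - angle) * sigma) / np.pi
        theta = sigma + theta * sigma_dot
    return theta / depth  # diagonal Theta^(depth)(x, x) = depth

# Two input pairs with very different angles...
for L in (3, 500):
    print(L, ntk_relu(0.2, L), ntk_relu(-0.5, L))
# ...give clearly different normalized kernel values at depth 3,
# but nearly identical ones at depth 500: the FFNet kernel degenerates.
```

At depth 3 the two normalized kernel values differ substantially; by depth 500 they agree to within about a percent, which is the flavor of degeneracy the paper establishes for FFNet kernels (and rules out for ResNet kernels).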


Related research

- Neural Tangent Kernel: A Survey (08/29/2022)
- Generalization Ability of Wide Residual Networks (05/29/2023)
- Kernel-Based Smoothness Analysis of Residual Networks (09/21/2020)
- Statistical Optimality of Deep Wide Neural Networks (05/04/2023)
- On the Generalization Power of Overfitted Two-Layer Neural Tangent Kernel Models (03/09/2021)
- Vanilla feedforward neural networks as a discretization of dynamic systems (09/22/2022)
- Why bigger is not always better: on finite and infinite neural networks (10/17/2019)
