Kernel-Based Smoothness Analysis of Residual Networks

09/21/2020
by Tom Tirer, et al.

A major factor in the success of deep neural networks is the use of sophisticated architectures rather than the classical multilayer perceptron (MLP). Residual networks (ResNets) stand out among these powerful modern architectures. Previous work has focused on the optimization advantages of deep ResNets over deep MLPs. In this paper, we show another distinction between the two models: ResNets tend to promote smoother interpolations than MLPs. We analyze this phenomenon via the neural tangent kernel (NTK) approach. First, we compute the NTK for the ResNet model we consider and prove its stability during gradient descent training. Then, we show through several evaluation methodologies that the NTK of the ResNet, and its kernel regression results, are smoother than those of the MLP. The better smoothness observed in our analysis may explain the better generalization ability of ResNets and the practice of moderately attenuating the residual blocks.
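
To make the terms "NTK" and "kernel regression" concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's method: it uses a two-layer ReLU MLP, f(x) = a · relu(W x) / sqrt(m), rather than the ResNet model analyzed in the paper, and it computes the empirical (finite-width) NTK at random initialization. The helper names (`empirical_ntk`, `kernel_regression`, `with_bias`), the width, the toy data, and the small ridge term are all hypothetical choices made for this example.

```python
import numpy as np

def empirical_ntk(X1, X2, W, a):
    """Empirical NTK of f(x) = a @ relu(W @ x) / sqrt(m), taken w.r.t. W and a."""
    m = W.shape[0]
    pre1, pre2 = X1 @ W.T, X2 @ W.T  # pre-activations, shape (n, m)
    # Output-weight part: (1/m) * sum_j relu(w_j.x) * relu(w_j.x')
    k_a = np.maximum(pre1, 0) @ np.maximum(pre2, 0).T / m
    # Hidden-weight part: (1/m) * (x.x') * sum_j a_j^2 * 1[w_j.x > 0] * 1[w_j.x' > 0]
    d1, d2 = (pre1 > 0) * a, (pre2 > 0) * a
    k_w = (d1 @ d2.T) * (X1 @ X2.T) / m
    return k_a + k_w

def kernel_regression(K_train, K_test_train, y, ridge=1e-8):
    """Predict with K(x*, X) (K(X, X) + ridge * I)^{-1} y."""
    # The tiny ridge is only a numerical safeguard (an assumption of this sketch).
    alpha = np.linalg.solve(K_train + ridge * np.eye(len(y)), y)
    return K_test_train @ alpha

def with_bias(x):
    """Append a constant 1 to each scalar input so the hidden layer has a bias."""
    return np.stack([x, np.ones_like(x)], axis=1)

# Hypothetical toy setup: random Gaussian initialization and a 1-D target.
rng = np.random.default_rng(0)
m = 4096                                  # hidden width (arbitrary choice)
W = rng.standard_normal((m, 2))
a = rng.standard_normal(m)

x_train = np.linspace(-1.0, 1.0, 6)
y_train = np.sin(3.0 * x_train)
x_test = np.linspace(-1.0, 1.0, 50)

X_train, X_test = with_bias(x_train), with_bias(x_test)
K = empirical_ntk(X_train, X_train, W, a)
fit_train = kernel_regression(K, K, y_train)  # predictions at the training inputs
y_test = kernel_regression(K, empirical_ntk(X_test, X_train, W, a), y_train)
```

Here `y_test` is the kernel-regression interpolant evaluated on a dense grid. Comparing the smoothness of such interpolants across kernels is the kind of evaluation the abstract refers to, although the kernel in this sketch is the MLP one only.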

research
05/04/2023

Statistical Optimality of Deep Wide Neural Networks

In this paper, we consider the generalization ability of deep wide feedf...
research
05/29/2023

Generalization Ability of Wide Residual Networks

In this paper, we study the generalization ability of the wide residual ...
research
09/27/2018

Smooth Inter-layer Propagation of Stabilized Neural Networks for Classification

Recent work has studied the reasons for the remarkable performance of de...
research
04/02/2019

Why ResNet Works? Residuals Generalize

Residual connections significantly boost the performance of deep neural ...
research
02/14/2020

Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? – A Neural Tangent Kernel Perspective

Deep residual networks (ResNets) have demonstrated better generalization...
research
01/07/2020

Kinetic Theory for Residual Neural Networks

Deep residual neural networks (ResNet) are performing very well for many...
research
12/22/2016

Highway and Residual Networks learn Unrolled Iterative Estimation

The past year saw the introduction of new architectures such as Highway ...