Generalization Ability of Wide Residual Networks

05/29/2023 ∙ by Jianfa Lai, et al.
In this paper, we study the generalization ability of the wide residual network on 𝕊^{d-1} with the ReLU activation function. We first show that as the width m→∞, the residual network kernel (RNK) uniformly converges to the residual neural tangent kernel (RNTK). This uniform convergence further guarantees that the generalization error of the residual network converges to that of kernel regression with respect to the RNTK. As direct corollaries, we then show that i) the wide residual network with an early stopping strategy can achieve the minimax rate, provided that the target regression function falls in the reproducing kernel Hilbert space (RKHS) associated with the RNTK; and ii) the wide residual network cannot generalize well if it is trained until it overfits the data. We finally present some experiments to reconcile the contradiction between our theoretical results and the widely observed “benign overfitting phenomenon”.
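The abstract's two corollaries can be illustrated numerically with generic kernel regression: run kernel gradient descent with an NTK-style kernel and compare the early-stopped iterate with the near-interpolating one. The sketch below is an assumption-laden stand-in, not the paper's method: it uses the closed-form NTK of a two-layer ReLU network on the sphere (rather than the RNTK, whose closed form involves the residual blocks), and `f_star` is a hypothetical smooth target.

```python
import numpy as np

def ntk_two_layer_relu(X, Y):
    """NTK of a two-layer ReLU network on the unit sphere
    (a stand-in for the paper's RNTK)."""
    U = np.clip(X @ Y.T, -1.0, 1.0)          # inner products = cos(angle)
    theta = np.arccos(U)
    k0 = (np.pi - theta) / np.pi             # arc-cosine kernel of order 0
    k1 = (U * (np.pi - theta) + np.sqrt(1.0 - U**2)) / np.pi  # order 1
    return U * k0 + k1

rng = np.random.default_rng(0)
n, d = 60, 3
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)       # points on S^{d-1}
f_star = lambda Z: Z[:, 0] * Z[:, 1]                # hypothetical target
y = f_star(X) + 0.3 * rng.normal(size=n)            # noisy labels

Xt = rng.normal(size=(200, d))
Xt /= np.linalg.norm(Xt, axis=1, keepdims=True)
K, Kt = ntk_two_layer_relu(X, X), ntk_two_layer_relu(Xt, X)

# Kernel gradient descent on the squared loss: the iterate after t steps
# is the "early stopped" estimator, while t -> infinity interpolates y.
alpha = np.zeros(n)
lr = 1.0 / np.linalg.eigvalsh(K).max()              # step size for stability
errs = []
for t in range(2000):
    alpha -= lr * (K @ alpha - y)
    errs.append(np.mean((Kt @ alpha - f_star(Xt)) ** 2))

print(f"best (early-stopped) test MSE: {min(errs):.3f} "
      f"at step {int(np.argmin(errs))}")
print(f"final (near-interpolation) test MSE: {errs[-1]:.3f}")
```

With noisy labels, the test error typically bottoms out at an intermediate step and then degrades as the iterate approaches the interpolant, mirroring the contrast between corollaries i) and ii).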

research
∙ 02/12/2023

Generalization Ability of Wide Neural Networks on ℝ

We perform a study on the generalization ability of the wide two-layer R...
research
∙ 05/04/2023

Statistical Optimality of Deep Wide Neural Networks

In this paper, we consider the generalization ability of deep wide feedf...
research
∙ 09/21/2020

Kernel-Based Smoothness Analysis of Residual Networks

A major factor in the success of deep neural networks is the use of soph...
research
∙ 02/14/2020

Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? – A Neural Tangent Kernel Perspective

Deep residual networks (ResNets) have demonstrated better generalization...
research
∙ 11/04/2020

Kernel Dependence Network

We propose a greedy strategy to spectrally train a deep network for mult...
research
∙ 09/26/2021

Frequency Disentangled Residual Network

Residual networks (ResNets) have been utilized for various computer visi...
research
∙ 05/23/2023

Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension

The success of over-parameterized neural networks trained to near-zero t...
