A Priori Estimates of the Population Risk for Residual Networks

03/06/2019
by Weinan E, et al.

Optimal a priori estimates are derived for the population risk of a regularized residual network model. The key is the design of a new path norm, called the weighted path norm, which serves as the regularization term in the regularized model. The weighted path norm treats the skip connections and the nonlinearities differently, so that paths passing through more nonlinearities receive larger weights. The error estimates are a priori in the sense that they depend only on the target function and not on the parameters obtained in the training process. The estimates are optimal in the sense that the bound scales as O(1/L) with the network depth L and the estimation error is comparable to the Monte Carlo error rate. In particular, optimal error bounds in terms of the depth of the network model are obtained for the first time. Comparisons are made with existing norm-based generalization error bounds.
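To make the idea of a weighted path norm concrete, the following is a minimal sketch and not the paper's exact definition. It assumes a residual network of the form h_0 = V x, h_l = h_{l-1} + U_l σ(W_l h_{l-1}), f(x) = u · h_L, and it assumes that each pass through a nonlinear branch multiplies the path weight by a factor c > 1 while the skip connection contributes a factor of 1. The layout, the names `V`, `Ws`, `Us`, `u`, and the value of `c` are all illustrative assumptions; the abstract does not specify them.

```python
def abs_matrix(M):
    """Entrywise absolute value of a matrix stored as a list of rows."""
    return [[abs(x) for x in row] for row in M]


def row_times_matrix(r, M):
    """Product of a row vector r (length n) with an n x k matrix M."""
    return [sum(r[i] * M[i][j] for i in range(len(r))) for j in range(len(M[0]))]


def weighted_path_norm(V, Ws, Us, u, c):
    """Sum of the entries of |u|^T (I + c|U_L||W_L|) ... (I + c|U_1||W_1|) |V|.

    Assumed shapes: V is D x d, each W_l is m x D, each U_l is D x m, u has length D.
    The factor c is a hypothetical weight on every pass through a nonlinearity.
    """
    r = [abs(x) for x in u]
    # Walk from the output layer back to the input layer.
    for W, U in zip(reversed(Ws), reversed(Us)):
        through_branch = row_times_matrix(row_times_matrix(r, abs_matrix(U)), abs_matrix(W))
        # The skip path keeps weight 1; the nonlinear branch is scaled by c.
        r = [ri + c * ti for ri, ti in zip(r, through_branch)]
    return sum(row_times_matrix(r, abs_matrix(V)))


# Toy network with D = 2, d = 1, m = 1 and a single residual block.
V = [[1.0], [-2.0]]
Ws = [[[1.0, 1.0]]]
Us = [[[0.5], [0.0]]]
u = [1.0, 1.0]

print(weighted_path_norm(V, Ws, Us, u, c=1.0))  # 4.5, every path weighted equally
print(weighted_path_norm(V, Ws, Us, u, c=3.0))  # 7.5, nonlinear paths penalized more
```

With c = 1 the quantity reduces to an unweighted sum over all paths of the product of absolute weights; taking c > 1 raises the contribution of any path by c for each nonlinearity it crosses, which is the qualitative behavior the abstract describes.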

research
10/15/2018

A Priori Estimates of the Generalization Error for Two-layer Neural Networks

New estimates for the generalization error are established for the two-l...
research
03/30/2021

Nonlinear Weighted Directed Acyclic Graph and A Priori Estimates for Neural Networks

In an attempt to better understand structural benefits and generalizatio...
research
12/15/2019

On the Generalization Properties of Minimum-norm Solutions for Over-parameterized Neural Network Models

We study the generalization properties of minimum-norm solutions for thr...
research
10/05/2020

Smaller generalization error derived for deep compared to shallow residual neural networks

Estimates of the generalization error are proved for a residual neural n...
research
09/19/2018

Capacity Control of ReLU Neural Networks by Basis-path Norm

Recently, path norm was proposed as a new capacity measure for neural ne...
research
02/23/2017

Sobolev Norm Learning Rates for Regularized Least-Squares Algorithm

Learning rates for regularized least-squares algorithms are in most case...
research
01/19/2022

Stability of Deep Neural Networks via discrete rough paths

Using rough path techniques, we provide a priori estimates for the outpu...