A priori generalization error for two-layer ReLU neural network through minimum norm solution

12/06/2019
by Zhi-Qin John Xu, et al.

We focus on estimating the a priori generalization error of two-layer ReLU neural networks (NNs) trained by mean squared error, an estimate that depends only on the initial parameters and the target function, through the following research line. We first estimate the a priori generalization error of a finite-width two-layer ReLU NN under the constraint of the minimum norm solution, which <cit.> proves to be equivalent to the solution of a finite-width two-layer NN linearized with respect to its parameters. As the width goes to infinity, the linearized NN converges to the NN in the Neural Tangent Kernel (NTK) regime <cit.>. Thus, we can derive the a priori generalization error of a two-layer ReLU NN in the NTK regime. The distance between the NN in the NTK regime and a finite-width NN trained by gradient descent is estimated in <cit.>. Based on the results in <cit.>, our work proves an a priori generalization error bound for two-layer ReLU NNs. This estimate uses the intrinsic implicit bias of the minimum norm solution without requiring extra regularization in the loss function. The a priori estimate also implies that the NN does not suffer from the curse of dimensionality, and that a small generalization error can be achieved without an exponentially large number of neurons. In addition, the research line proposed in this paper can be used to study other properties of finite-width networks, such as the a posteriori generalization error.
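To make the research line concrete, the following is a minimal NumPy sketch, not the paper's code, of the minimum norm (ridgeless) interpolant built from the empirical tangent kernel of a two-layer ReLU network at initialization. The width m, the 1/sqrt(m) scaling, the Gaussian initialization, and the toy target y = sin(x_1) are illustrative assumptions rather than the paper's exact setting.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def ntk_features(X, W, a):
    """Parameter gradient of f(x) = (1/sqrt(m)) * sum_k a_k * relu(w_k . x),
    evaluated at the initial parameters (W, a)."""
    m = W.shape[0]
    pre = X @ W.T                                   # (n, m) pre-activations
    gate = (pre > 0).astype(X.dtype)                # ReLU derivative 1{w_k . x > 0}
    grad_a = relu(pre) / np.sqrt(m)                 # d f / d a_k
    grad_W = (gate * a)[:, :, None] * X[:, None, :] / np.sqrt(m)  # d f / d w_k
    return np.concatenate([grad_W.reshape(X.shape[0], -1), grad_a], axis=1)

def ntk(Xa, Xb, W, a):
    """Empirical neural tangent kernel of the linearized network at initialization."""
    return ntk_features(Xa, W, a) @ ntk_features(Xb, W, a).T

# Hypothetical usage: interpolate a toy target with the minimum norm solution.
rng = np.random.default_rng(0)
d, m, n = 2, 512, 40
W = rng.standard_normal((m, d))          # initial inner weights (assumed Gaussian)
a = rng.standard_normal(m)               # initial outer weights (assumed Gaussian)
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0])                      # toy target function (assumption)

K = ntk(X, X, W, a)
alpha = np.linalg.solve(K + 1e-10 * np.eye(n), y)    # tiny jitter for stability
X_test = rng.standard_normal((5, d))
prediction = ntk(X_test, X, W, a) @ alpha             # minimum norm interpolant
```

In this sketch, solving the kernel system gives the interpolant of minimum norm in the tangent feature space, which is the implicit bias the abstract refers to; the paper's bound concerns how this solution generalizes as a function of the initialization and the target function.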


