Quantifying the generalization error in deep learning in terms of data distribution and neural network smoothness

05/27/2019
by Pengzhan Jin et al.

The accuracy of deep learning, i.e., of deep neural networks, can be characterized by decomposing the total error into three main parts: approximation error, optimization error, and generalization error. While the approximation and optimization problems have some satisfactory answers, much less is known about the theory of generalization, and most existing theoretical work fails to explain the performance of neural networks in practice. To derive a meaningful bound, we study the generalization error of neural networks for classification problems in terms of the data distribution and neural network smoothness. We introduce the cover complexity (CC) to measure the difficulty of learning a data set, and the inverse of the modulus of continuity to quantify neural network smoothness. A quantitative bound on the expected accuracy/error is then derived by considering both the CC and the network smoothness. We validate our theoretical results on several image data sets. The numerical results verify that the expected error of trained networks, scaled by the square root of the number of classes, depends linearly on the CC. In addition, we observe a clear consistency between test loss and neural network smoothness during training.
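For context, the smoothness notion referenced above can be made concrete via the textbook definition of the modulus of continuity; the paper's exact normalization may differ, so this is a sketch of the standard quantities rather than the authors' precise definitions. For a function f on a domain \Omega,

    \omega_f(\delta) = \sup_{x, y \in \Omega,\ \|x - y\| \le \delta} |f(x) - f(y)|,
    \qquad
    \omega_f^{-1}(\varepsilon) = \sup\{\delta \ge 0 : \omega_f(\delta) \le \varepsilon\}.

A slowly growing \omega_f, equivalently a large inverse modulus \omega_f^{-1}(\varepsilon) at a fixed \varepsilon, corresponds to a smoother network: intuitively, smoother trained networks should generalize better, which the paper's bound makes quantitative.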
