Benign Overfitting in Two-layer Convolutional Neural Networks

02/14/2022
by   Yuan Cao, et al.

Modern neural networks often have great expressive power and can be trained to overfit the training data while still achieving good test performance. This phenomenon is referred to as "benign overfitting". Recently, a line of work has emerged that studies benign overfitting from a theoretical perspective. However, these works are limited to linear models or kernel/random-feature models, and a theoretical understanding of when and how benign overfitting occurs in neural networks is still lacking. In this paper, we study the benign overfitting phenomenon in training a two-layer convolutional neural network (CNN). We show that when the signal-to-noise ratio satisfies a certain condition, a two-layer CNN trained by gradient descent can achieve arbitrarily small training and test loss. When this condition does not hold, however, overfitting becomes harmful and the obtained CNN can only achieve constant-level test loss. Together, these results demonstrate a sharp phase transition between benign and harmful overfitting, driven by the signal-to-noise ratio. To the best of our knowledge, this is the first work that precisely characterizes the conditions under which benign overfitting can occur in training convolutional neural networks.
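As a rough illustration of the setting the abstract describes, the sketch below trains a small two-layer CNN-style model with a squared-ReLU activation on a synthetic two-patch signal-plus-noise dataset, using plain gradient descent on the logistic loss. All concrete choices (dimensions, activation, signal strength, step size) are illustrative assumptions, not the paper's exact construction; they are picked to land in a high signal-to-noise regime where the network fits the training data while the signal component dominates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-patch data model: each example has a signal patch y * mu
# and a pure-noise patch xi (dimensions and SNR chosen for illustration only).
d, n, m = 50, 40, 10            # patch dimension, sample size, filters per class
mu = np.zeros(d)
mu[0] = 5.0                     # ||mu|| sets the signal strength (high-SNR regime)
y = rng.choice([-1.0, 1.0], size=n)
X = np.stack([np.outer(y, mu), rng.normal(size=(n, d))], axis=1)  # (n, 2, d)

# Two-layer CNN surrogate: shared filters applied to both patches, squared-ReLU
# activation, second-layer weights fixed to +1 (slot 0) and -1 (slot 1).
W = rng.normal(0.0, 0.01, size=(2, m, d))

def forward(W, X):
    pre = np.einsum('smd,npd->nsmp', W, X)           # filter/patch pre-activations
    act = np.maximum(pre, 0.0) ** 2                  # sigma(z) = max(z, 0)^2
    per_class = act.sum(axis=(2, 3))                 # (n, 2)
    return per_class[:, 0] - per_class[:, 1]         # F(x) = F_{+1}(x) - F_{-1}(x)

def logistic_loss(W, X, y):
    return np.logaddexp(0.0, -y * forward(W, X)).mean()

def grad(W, X, y):
    pre = np.einsum('smd,npd->nsmp', W, X)
    dact = 2.0 * np.maximum(pre, 0.0)                # sigma'(pre)
    # dL/dF_i for the logistic loss, written with tanh for numerical stability
    g = -y * 0.5 * (1.0 - np.tanh(y * forward(W, X) / 2.0)) / len(y)
    sign = np.array([1.0, -1.0])                     # fixed second-layer weights
    return np.einsum('n,s,nsmp,npd->smd', g, sign, dact, X)

# Plain gradient descent, the training procedure the abstract refers to.
lr = 0.05
for _ in range(500):
    W -= lr * grad(W, X, y)

train_err = np.mean(np.sign(forward(W, X)) != y)
print(f"training error: {train_err:.2f}, loss: {logistic_loss(W, X, y):.4f}")
```

With the large signal norm above, gradient descent drives both the training error and the training loss down, in the spirit of the benign regime; shrinking `mu` relative to the noise would move the same code toward the harmful-overfitting regime the abstract contrasts it with.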

