
Regularization Matters: A Nonparametric Perspective on Overparametrized Neural Network

by Wenjia Wang, et al.

Overparametrized neural networks trained by gradient descent (GD) can provably overfit any training data. However, the generalization guarantee may not hold for noisy data. From a nonparametric perspective, this paper studies how well overparametrized neural networks can recover the true target function in the presence of random noise. We establish a lower bound on the L_2 estimation error with respect to the GD iteration, which stays bounded away from zero without a delicate choice of early stopping. In turn, through a comprehensive analysis of ℓ_2-regularized GD trajectories, we prove that for an overparametrized one-hidden-layer ReLU neural network with ℓ_2 regularization: (1) the output is close to that of kernel ridge regression with the corresponding neural tangent kernel; (2) the minimax-optimal rate of the L_2 estimation error is achieved. Numerical experiments confirm our theory and further demonstrate that the ℓ_2 regularization approach improves training robustness and works for a wider range of neural networks.
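The setup analyzed in the abstract can be sketched numerically. The following is an illustrative toy example (not code from the paper): it trains a one-hidden-layer ReLU network on noisy 1-D regression data by gradient descent on the mean-squared error plus an ℓ_2 penalty on all weights. The target function, hidden width `m`, step size `lr`, and regularization strength `lam` are arbitrary choices for demonstration, not values prescribed by the paper.

```python
# Toy sketch of l2-regularized GD on a one-hidden-layer ReLU network.
# All hyperparameters below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n, m = 50, 200                                   # sample size, hidden width
x = rng.uniform(-1, 1, size=(n, 1))              # inputs
y = np.sin(np.pi * x) + 0.1 * rng.standard_normal((n, 1))  # noisy targets

W = rng.standard_normal((1, m))                  # input-to-hidden weights
a = rng.standard_normal((m, 1)) / np.sqrt(m)     # hidden-to-output weights

def forward(x, W, a):
    """One-hidden-layer ReLU network output."""
    return np.maximum(x @ W, 0.0) @ a

def objective(x, y, W, a, lam):
    """Mean-squared error plus l2 penalty on both weight layers."""
    resid = forward(x, W, a) - y
    return float((resid ** 2).mean() + lam * ((W ** 2).sum() + (a ** 2).sum()))

lam, lr = 1e-3, 1e-3
loss0 = objective(x, y, W, a, lam)
for _ in range(500):
    h = np.maximum(x @ W, 0.0)                   # hidden activations, (n, m)
    resid = forward(x, W, a) - y                 # residuals, (n, 1)
    # Exact gradients of the regularized objective.
    grad_a = 2 * h.T @ resid / n + 2 * lam * a
    grad_W = 2 * x.T @ ((resid @ a.T) * (h > 0)) / n + 2 * lam * W
    a -= lr * grad_a
    W -= lr * grad_W

print(loss0, objective(x, y, W, a, lam))         # regularized loss before/after
```

Unlike plain GD, whose estimation error the paper lower-bounds away from zero absent careful early stopping, the ℓ_2 penalty here shrinks the weights throughout training, mirroring the ridge term in the kernel ridge regression that the paper shows the regularized trajectory tracks.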


Learning Lipschitz Functions by GD-trained Shallow Overparameterized ReLU Neural Networks

We explore the ability of overparameterized shallow ReLU neural networks...

Understanding Generalization of Deep Neural Networks Trained with Noisy Labels

Over-parameterized deep neural networks trained by simple first-order me...

Regularization, early-stopping and dreaming: a Hopfield-like setup to address generalization and overfitting

In this work we approach attractor neural networks from a machine learni...

Harmless Overparametrization in Two-layer Neural Networks

Overparametrized neural networks, where the number of active parameters ...

Nonparametric Regression with Shallow Overparameterized Neural Networks Trained by GD with Early Stopping

We explore the ability of overparameterized shallow neural networks to l...

Regularizing Deep Neural Networks by Noise: Its Interpretation and Optimization

Overfitting is one of the most critical challenges in deep neural networ...

Generalization Ability of Wide Neural Networks on ℝ

We perform a study on the generalization ability of the wide two-layer R...