Regularization Matters: A Nonparametric Perspective on Overparametrized Neural Network
Overparametrized neural networks trained by gradient descent (GD) can provably overfit any training data. However, the generalization guarantee may not hold for noisy data. From a nonparametric perspective, this paper studies how well overparametrized neural networks recover the true target function in the presence of random noise. We establish a lower bound on the L_2 estimation error with respect to the GD iterations, which is bounded away from zero without a delicate choice of early stopping. In turn, through a comprehensive analysis of ℓ_2-regularized GD trajectories, we prove that for an overparametrized one-hidden-layer ReLU neural network with ℓ_2 regularization: (1) the output is close to that of kernel ridge regression with the corresponding neural tangent kernel; (2) the minimax-optimal rate of the L_2 estimation error is achieved. Numerical experiments confirm our theory and further demonstrate that the ℓ_2 regularization approach improves training robustness and works for a wider range of neural networks.
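The central claim, that ℓ_2-regularized GD on a wide one-hidden-layer ReLU network tracks kernel ridge regression with the corresponding NTK, can be illustrated numerically. The sketch below is illustrative and not the paper's experiments: it assumes unit-norm inputs, trains only the first layer with fixed ±1 output weights (so a closed-form first-layer NTK applies), and penalizes the distance to the random initialization, which may differ from the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy regression with unit-norm inputs (points on the circle).
n, m, d = 50, 2048, 2
theta = rng.uniform(0, 2 * np.pi, n)
X = np.stack([np.cos(theta), np.sin(theta)], axis=1)
y = np.sin(2 * theta) + 0.3 * rng.standard_normal(n)

# One-hidden-layer ReLU net in the NTK parameterization:
# f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r . x), a_r fixed at +-1.
# Symmetric initialization (duplicated weights, opposite signs) gives f = 0 at init.
W_half = rng.standard_normal((m // 2, d))
W = np.vstack([W_half, W_half])
a = np.concatenate([np.ones(m // 2), -np.ones(m // 2)])

def predict(W, X):
    return np.maximum(X @ W.T, 0.0) @ a / np.sqrt(m)

# l2-regularized gradient descent on the first-layer weights,
# minimizing (1/2)*sum_i (f(x_i) - y_i)^2 + (lam/2)*||W - W0||^2.
W0, lam, lr = W.copy(), 0.1, 0.05
mse0 = np.mean((predict(W, X) - y) ** 2)
for _ in range(1000):
    pre = X @ W.T                                    # (n, m) pre-activations
    err = predict(W, X) - y                          # (n,) residuals
    grad = ((pre > 0) * (err[:, None] * a / np.sqrt(m))).T @ X
    W -= lr * (grad + lam * (W - W0))
mse1 = np.mean((predict(W, X) - y) ** 2)

# Kernel ridge regression with the corresponding infinite-width NTK; for
# unit-norm inputs and first-layer-only training it has the closed form below.
U = np.clip(X @ X.T, -1.0, 1.0)                      # u = <x, x'>
K = U * (np.pi - np.arccos(U)) / (2 * np.pi)
krr_pred = K @ np.linalg.solve(K + lam * np.eye(n), y)

print("train MSE: init %.3f -> trained %.3f" % (mse0, mse1))
print("mean squared gap to NTK-KRR predictions: %.4f"
      % np.mean((predict(W, X) - krr_pred) ** 2))
```

At sufficiently large width, the trained network's predictions stay close to the NTK kernel-ridge predictor, mirroring claim (1) of the abstract; shrinking `m` or `lam` lets one watch the gap grow.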