Benign Overfitting and Noisy Features
Modern machine learning often operates in a regime where the number of parameters far exceeds the number of data points, achieving zero training loss and yet good generalization, thereby contradicting the classical bias-variance trade-off. This benign overfitting phenomenon has recently been characterized through so-called double descent curves, in which the risk undergoes a second descent (in addition to the classical U-shaped learning curve observed when the number of parameters is small) as the number of parameters grows beyond a certain threshold. In this paper, we examine the conditions under which benign overfitting occurs in random feature (RF) models, i.e., two-layer neural networks with fixed first-layer weights. We adopt a new view of random features and show that benign overfitting arises from the noise residing in these features (noise that may already be present in the data and propagate to the features, or that may be added by the user to the features directly), which plays an important implicit regularization role in the phenomenon.
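To make the abstract's central claim concrete, here is a minimal numpy sketch, not taken from the paper itself: it fits minimum-l2-norm random feature regression in the overparameterized regime, once on clean features and once on features perturbed by i.i.d. Gaussian noise, and compares both to explicit ridge regression. The teacher setup, the ReLU feature map, and the lambda = n * sigma^2 ridge correspondence (which holds for the expected squared loss under feature noise, not for a single noisy draw) are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n samples, d inputs, noisy linear teacher (illustrative setup)
n, d, n_test = 100, 20, 1000
w_star = rng.standard_normal(d) / np.sqrt(d)
X = rng.standard_normal((n, d))
y = X @ w_star + 0.1 * rng.standard_normal(n)
X_test = rng.standard_normal((n_test, d))
y_test = X_test @ w_star

def random_features(X, W):
    """Fixed first layer: ReLU random features (one illustrative choice)."""
    return np.maximum(X @ W, 0.0)

def fit_min_norm(Phi, y):
    """Minimum-l2-norm least squares for the second layer (interpolates when p > n)."""
    return np.linalg.pinv(Phi) @ y

def test_error(a, W):
    pred = random_features(X_test, W) @ a
    return np.mean((pred - y_test) ** 2)

p = 400  # number of random features, p >> n: overparameterized regime
W = rng.standard_normal((d, p)) / np.sqrt(d)
Phi = random_features(X, W)

# Plain interpolating RF regression on clean features
a_plain = fit_min_norm(Phi, y)

# Same fit with i.i.d. Gaussian noise added to the features; in expectation over
# the noise, the squared loss picks up an l2 penalty of strength n * sigma^2
# (a single noisy draw only approximates this averaged behavior)
sigma = 0.5
Phi_noisy = Phi + sigma * rng.standard_normal(Phi.shape)
a_noisy = fit_min_norm(Phi_noisy, y)

# Explicit ridge solution using that assumed correspondence, for comparison
lam = n * sigma ** 2
a_ridge = np.linalg.solve(Phi.T @ Phi + lam * np.eye(p), Phi.T @ y)

print("test MSE, plain interpolation :", test_error(a_plain, W))
print("test MSE, noisy features      :", test_error(a_noisy, W))
print("test MSE, explicit ridge      :", test_error(a_ridge, W))
```

Under this sketch, the noisy-feature fit typically tracks the ridge fit more closely than the plain interpolator does, which is one way to see feature noise acting as implicit regularization in the overparameterized regime.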