Avoiding The Double Descent Phenomenon of Random Feature Models Using Hybrid Regularization

12/11/2020
by   Kelvin Kan, et al.

We demonstrate the ability of hybrid regularization methods to automatically avoid the double descent phenomenon arising in the training of random feature models (RFM). The hallmark of double descent is a spike in the generalization gap at the interpolation threshold, i.e., when the number of features in the RFM equals the number of training samples. To close this gap, the hybrid method considered in our paper combines the respective strengths of the two most common forms of regularization: early stopping and weight decay. The scheme requires no hyperparameter tuning, as it automatically selects the stopping iteration and the weight decay parameter using generalized cross-validation (GCV). This also obviates the need for a dedicated validation set. While the benefits of hybrid methods have been well documented for ill-posed inverse problems, our work presents their first use in machine learning. To expose the need for regularization and to motivate hybrid methods, we perform detailed numerical experiments inspired by image classification. In those examples, the hybrid scheme successfully avoids the double descent phenomenon and yields RFMs whose generalization is comparable with that of classical regularization approaches whose hyperparameters are tuned optimally using the test data. We provide our MATLAB codes for implementing the numerical experiments in this paper at https://github.com/EmoryMLIP/HybridRFM.
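To make the role of GCV concrete, below is a minimal NumPy sketch of one ingredient of the scheme: fitting the output weights of a random ReLU feature model with weight decay whose parameter is selected by minimizing the GCV function. This is not the authors' MATLAB implementation from the repository above; the toy data, feature map, and grid of candidate lambda values are illustrative assumptions, and the full hybrid method additionally couples this with an iterative solver whose stopping iteration is also chosen by GCV.

    import numpy as np

    # Illustrative sketch (not the authors' MATLAB code): a random feature model
    # y ~ relu(X W + b) a, with output weights a fit by weight-decay (ridge)
    # regression whose parameter lambda is chosen by generalized cross-validation.
    rng = np.random.default_rng(0)

    n, d, m = 200, 10, 500                  # samples, input dim, random features (toy sizes)
    X = rng.standard_normal((n, d))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)   # toy regression targets

    W = rng.standard_normal((d, m)) / np.sqrt(d)          # fixed random weights
    b = rng.uniform(0, 2 * np.pi, m)                      # fixed random biases
    Z = np.maximum(X @ W + b, 0.0)                        # random ReLU features

    # GCV(lambda) = n * ||(I - H)y||^2 / tr(I - H)^2, H the ridge hat matrix of Z.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    Uty = U.T @ y

    def gcv(lam):
        filt = lam / (s**2 + lam)                         # filter factors 1 - s^2/(s^2+lam)
        resid = np.sum((filt * Uty) ** 2) + np.sum(y**2) - np.sum(Uty**2)
        trace = n - np.sum(s**2 / (s**2 + lam))
        return n * resid / trace**2

    lams = np.logspace(-8, 4, 100)                        # illustrative search grid
    lam_star = lams[np.argmin([gcv(l) for l in lams])]

    # Ridge solution with the GCV-selected weight decay parameter
    a = Vt.T @ (s / (s**2 + lam_star) * Uty)
    print(f"GCV-selected lambda: {lam_star:.3e}, train MSE: {np.mean((Z @ a - y)**2):.3e}")

Because GCV is evaluated directly from the training data, no held-out validation set is needed to pick lambda; the paper applies the same criterion to choose the stopping iteration of the iterative component.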


