
On the Optimal Weighted ℓ_2 Regularization in Overparameterized Linear Regression
We consider the linear model y = Xβ_⋆ + ϵ with X ∈ ℝ^{n×p} in the overparameterized regime p > n. We estimate β_⋆ via generalized (weighted) ridge regression: β̂_λ = (X^T X + λΣ_w)^† X^T y, where Σ_w is the weighting matrix. Assuming a random effects model with general data covariance Σ_x and an anisotropic prior on the true coefficients β_⋆, i.e., 𝔼 β_⋆ β_⋆^T = Σ_β, we provide an exact characterization of the prediction risk 𝔼(y − x^T β̂_λ)^2 in the proportional asymptotic limit p/n → γ ∈ (1, ∞). Our general setup leads to a number of interesting findings. We outline precise conditions that decide the sign of the optimal setting λ_opt for the ridge parameter λ and confirm the implicit ℓ_2 regularization effect of overparameterization, which theoretically justifies the surprising empirical observation that λ_opt can be negative in the overparameterized regime. We also characterize the double descent phenomenon for principal component regression (PCR) when X and β_⋆ are non-isotropic. Finally, we determine the optimal Σ_w for both the ridgeless (λ → 0) and optimally regularized (λ = λ_opt) cases, and demonstrate the advantage of the weighted objective over standard ridge regression and PCR.
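To make the estimator concrete, the following is a minimal NumPy sketch of the generalized ridge formula β̂_λ = (X^T X + λΣ_w)^† X^T y stated above. The function name `weighted_ridge`, the synthetic Gaussian data, the noise level, and the choice Σ_w = I are illustrative assumptions rather than anything taken from the paper; the pseudoinverse cutoff `rcond=1e-10` is likewise an arbitrary numerical choice.

```python
import numpy as np


def weighted_ridge(X, y, lam, Sigma_w):
    """Generalized ridge estimate (X^T X + lam * Sigma_w)^+ X^T y.

    Hypothetical helper for illustration. The pseudoinverse is used so that
    the ridgeless case lam = 0 with p > n is still well defined.
    """
    gram = X.T @ X + lam * Sigma_w
    # gram is symmetric, so the Hermitian code path of pinv applies.
    gram_pinv = np.linalg.pinv(gram, rcond=1e-10, hermitian=True)
    return gram_pinv @ (X.T @ y)


# Synthetic overparameterized problem (p > n); all values are assumptions.
rng = np.random.default_rng(0)
n, p = 50, 100
X = rng.standard_normal((n, p))
beta_star = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta_star + 0.1 * rng.standard_normal(n)

# Standard ridge is the special case Sigma_w = I.
beta_ridge = weighted_ridge(X, y, lam=1.0, Sigma_w=np.eye(p))

# Ridgeless case: with lam = 0 the formula reduces to the minimum-norm
# least-squares solution, which interpolates the training data when p > n.
beta_ridgeless = weighted_ridge(X, y, lam=0.0, Sigma_w=np.eye(p))
train_residual = np.linalg.norm(X @ beta_ridgeless - y)
```

With Σ_w = I and λ > 0 this coincides with ordinary ridge regression. Passing a different positive definite `Sigma_w` gives the weighted objective, and a negative `lam` can be passed as well, although the sketch does nothing to guard against an ill-conditioned matrix in that case.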