Regularization in High-Dimensional Regression and Classification via Random Matrix Theory

03/30/2020
by Panagiotis Lolas, et al.

We study general singular value shrinkage estimators in high-dimensional regression and classification, when the number of features and the sample size both grow proportionally to infinity. We allow models with general covariance matrices that include a large class of data-generating distributions. Among the implications of our results, we find exact asymptotic formulas for both the training and test errors in regression models fitted by gradient descent, which provides theoretical insight into early stopping as a regularization method. In addition, we propose a numerical method, based on the empirical spectra of covariance matrices, for the optimal eigenvalue shrinkage classifier in linear discriminant analysis. Finally, we derive optimal estimators for the dense mean vectors of high-dimensional distributions. Throughout our analysis we rely on recent advances in random matrix theory and develop further results of independent mathematical interest.
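The early-stopping phenomenon the abstract refers to can be illustrated with a minimal simulation (this is an illustrative sketch, not the paper's exact model: the dimensions, noise level, and learning rate below are arbitrary choices). Gradient descent on least squares is run past convergence while the test error is tracked; the iterate that minimizes test error typically occurs before the final iterate, which is the sense in which stopping early acts as implicit regularization.

```python
import numpy as np

# Illustrative high-dimensional regression setup (parameters are not from the paper).
rng = np.random.default_rng(0)
n, p = 200, 100                               # sample size and feature dimension
X = rng.standard_normal((n, p)) / np.sqrt(p)  # scaled design matrix
beta_true = rng.standard_normal(p)
y = X @ beta_true + 0.5 * rng.standard_normal(n)

# Independent test set drawn from the same model.
X_test = rng.standard_normal((n, p)) / np.sqrt(p)
y_test = X_test @ beta_true + 0.5 * rng.standard_normal(n)

beta = np.zeros(p)
lr = 0.1
test_errors = []
for t in range(500):
    grad = X.T @ (X @ beta - y) / n           # gradient of the mean squared error
    beta -= lr * grad
    test_errors.append(np.mean((X_test @ beta - y_test) ** 2))

initial_error = np.mean(y_test ** 2)          # test error of the zero initializer
best_t = int(np.argmin(test_errors))          # iteration where test error is smallest
```

Stopping at `best_t` rather than running to convergence plays the role of a ridge-style penalty; the paper's asymptotic formulas make this trade-off exact in the proportional regime.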


research
03/22/2021

Optimal Linear Classification via Eigenvalue Shrinkage: The Case of Additive Noise

In this paper, we consider the general problem of testing the mean of tw...
research
02/25/2022

On singular values of large dimensional lag-tau sample autocorrelation matrices

We study the limiting behavior of singular values of a lag-τ sample auto...
research
07/07/2022

Optimal shrinkage of singular values under high-dimensional noise with separable covariance structure

We consider an optimal shrinkage algorithm that depends on an effective ...
research
11/28/2022

High dimensional discriminant rules with shrinkage estimators of covariance matrix and mean vector

Linear discriminant analysis is a typical method used in the case of lar...
research
11/04/2022

Projection inference for high-dimensional covariance matrices with structured shrinkage targets

Analyzing large samples of high-dimensional data under dependence is a c...
research
11/28/2022

Double Data Piling for Heterogeneous Covariance Models

In this work, we characterize two data piling phenomenon for a high-dime...
research
08/04/2022

Spectral Universality of Regularized Linear Regression with Nearly Deterministic Sensing Matrices

It has been observed that the performances of many high-dimensional esti...
