Optimal Eigenvalue Shrinkage in the Semicircle Limit

10/10/2022
by David L. Donoho, et al.

Recent studies of high-dimensional covariance estimation often assume the proportional growth asymptotic, where the sample size n and dimension p are comparable, with n, p →∞ and γ_n ≡ p/n →γ > 0. Yet, many datasets have very different numbers of rows and columns. Consider instead disproportional growth, where n, p →∞ and γ_n → 0 or γ_n →∞. With far fewer dimensions than observations, the disproportional limit γ_n → 0 may seem similar to classical fixed-p asymptotics. In fact, either disproportional limit induces novel phenomena distinct from the proportional and fixed-p limits. We study the spiked covariance model, finding for each of 15 different loss functions optimal shrinkage and thresholding rules. Readers who initially view the disproportional limit γ_n → 0 as similar to classical fixed-p asymptotics may expect, given the dominance in that setting of the sample covariance estimator, that there is no alternative here. On the contrary, our optimal procedures demand extensive eigenvalue shrinkage and offer substantial performance benefits. The sample covariance is similarly improvable in the disproportional limit γ_n →∞. Practitioners may worry how to choose between proportional and disproportional growth frameworks in practice. Conveniently, under the spiked covariance model there is no conflict between the two and no choice is needed; one unified set of closed forms (used with the aspect ratio γ_n of the practitioner's data) offers full asymptotic optimality in both regimes. At the heart of these phenomena is the spiked Wigner model. Via a connection to the spiked covariance model as γ_n → 0, we derive optimal shrinkers for the Wigner setting.
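
To make the idea of using the data's own aspect ratio γ_n = p/n concrete, here is a minimal sketch of eigenvalue debiasing in a one-spike covariance model. It does not reproduce the closed forms of the paper, which the abstract does not state. It assumes instead the standard proportional-regime relation between a population spike ℓ and its limiting sample eigenvalue, λ = ℓ(1 + γ/(ℓ − 1)) for ℓ > 1 + √γ, and inverts it. The function names, the simulation sizes, and the choice to map eigenvalues at or below the bulk edge to 1 are illustrative assumptions.

```python
import numpy as np

def spike_forward(ell, gamma):
    """Limiting top sample eigenvalue for a population spike ell > 1 + sqrt(gamma).

    Assumed relation (standard proportional-regime theory, unit noise variance).
    """
    return ell * (1.0 + gamma / (ell - 1.0))

def spike_debias(lam, gamma):
    """Invert spike_forward: solve ell^2 - (lam + 1 - gamma) * ell + lam = 0.

    Eigenvalues at or below the bulk edge (1 + sqrt(gamma))^2 are mapped to 1,
    an illustrative thresholding choice meaning "no detectable spike".
    """
    edge = (1.0 + np.sqrt(gamma)) ** 2
    if lam <= edge:
        return 1.0
    b = lam + 1.0 - gamma
    return 0.5 * (b + np.sqrt(b * b - 4.0 * lam))

# Hypothetical simulation: far more observations than dimensions.
rng = np.random.default_rng(0)
n, p, ell = 4000, 40, 2.0
gamma_n = p / n                      # aspect ratio of the data at hand

X = rng.standard_normal((n, p))      # rows are observations with identity covariance
X[:, 0] *= np.sqrt(ell)              # plant one spike of size ell on the first axis

S = X.T @ X / n                      # sample covariance (mean known to be zero)
top = np.linalg.eigvalsh(S)[-1]      # eigvalsh returns eigenvalues in ascending order
ell_hat = spike_debias(top, gamma_n) # debiased estimate of the spike

print(f"gamma_n={gamma_n:.3f}  top sample eigenvalue={top:.4f}  debiased={ell_hat:.4f}")
```

Debiasing is only the simplest shrinkage rule; which nonlinearity is optimal depends on the loss function, and the sketch makes no claim about any of the 15 losses studied in the paper.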

research
03/22/2021

Optimal Linear Classification via Eigenvalue Shrinkage: The Case of Additive Noise

In this paper, we consider the general problem of testing the mean of tw...
research
12/05/2014

Multi-Target Shrinkage

Stein showed that the multivariate sample mean is outperformed by "shrin...
research
09/05/2021

James-Stein estimation of the first principal component

The Stein paradox has played an influential role in the field of high di...
research
10/17/2018

Optimal Covariance Estimation for Condition Number Loss in the Spiked Model

We study estimation of the covariance matrix under relative condition nu...
research
07/30/2020

Covariance estimation with nonnegative partial correlations

We study the problem of high-dimensional covariance estimation under the...
research
11/06/2017

Asymptotics for high-dimensional covariance matrices and quadratic forms with applications to the trace functional and shrinkage

We establish large sample approximations for an arbitrary number of bilin...
