Entrywise Estimation of Singular Vectors of Low-Rank Matrices with Heteroskedasticity and Dependence

by Joshua Agterberg, et al.

We propose an estimator for the singular vectors of high-dimensional low-rank matrices corrupted by additive subgaussian noise, where the noise matrix is allowed to have dependence within rows and heteroskedasticity between them. We prove finite-sample ℓ_{2,∞} bounds and a Berry-Esseen theorem for the individual entries of the estimator, and we apply these results to high-dimensional mixture models. Our Berry-Esseen theorem makes explicit the geometric relationship between the signal matrix, the covariance structure of the noise, and the distribution of the errors in the singular vector estimation task. These results are illustrated in numerical simulations. Previous results of this type rely on assumptions of Gaussianity or independence between the entries of the additive noise; handling dependence between entries instead requires careful leave-one-out analysis and conditioning arguments in the proofs. Our results depend only on the signal-to-noise ratio, the sample size, and the spectral properties of the signal matrix.
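To make the setting concrete, the following is a minimal simulation sketch of the noise model described above: a rank-r signal plus noise whose rows have different scales (heteroskedasticity between rows) and a shared AR(1) covariance (dependence within rows). It then measures the ℓ_{2,∞} error of the leading left singular vectors after orthogonal alignment. The plain truncated SVD used here is only a stand-in baseline, not the estimator proposed in the paper, and every name and parameter (`simulate`, `signal`, `rho`, the AR(1) covariance, the row-scale range) is an assumption chosen for illustration.

```python
import numpy as np

def two_to_infinity_norm(M):
    """l_{2,inf} norm: the largest Euclidean norm among the rows of M."""
    return np.max(np.linalg.norm(M, axis=1))

def procrustes_align(U_hat, U):
    """Orthogonal W minimizing ||U_hat @ W - U||_F (orthogonal Procrustes)."""
    A, _, Bt = np.linalg.svd(U_hat.T @ U)
    return A @ Bt

def simulate(n=200, d=300, r=2, signal=500.0, rho=0.5, seed=0):
    rng = np.random.default_rng(seed)

    # Rank-r signal with orthonormal singular vectors (hypothetical choice).
    U, _ = np.linalg.qr(rng.standard_normal((n, r)))
    V, _ = np.linalg.qr(rng.standard_normal((d, r)))
    sing_vals = signal * np.linspace(2.0, 1.0, r)
    M = (U * sing_vals) @ V.T

    # Dependence within rows: AR(1) covariance Sigma[j, k] = rho ** |j - k|.
    idx = np.arange(d)
    Sigma = rho ** np.abs(idx[:, None] - idx[None, :])
    C = np.linalg.cholesky(Sigma)

    # Heteroskedasticity between rows: each row gets its own scale.
    row_scale = rng.uniform(0.5, 2.0, size=n)
    E = row_scale[:, None] * (rng.standard_normal((n, d)) @ C.T)

    # Baseline estimate: top-r left singular vectors of the noisy matrix.
    U_hat = np.linalg.svd(M + E, full_matrices=False)[0][:, :r]

    # Singular vectors are identified only up to an orthogonal transformation,
    # so align before computing the row-wise error.
    W = procrustes_align(U_hat, U)
    err = two_to_infinity_norm(U_hat @ W - U)
    return U, U_hat, err

if __name__ == "__main__":
    U, U_hat, err = simulate()
    print("l_{2,inf} error of baseline SVD:", err)
    print("l_{2,inf} norm of true U:      ", two_to_infinity_norm(U))
```

Comparing the printed error with the ℓ_{2,∞} norm of the true singular vectors gives a rough sense of the row-wise relative accuracy; varying `signal`, `rho`, or the row-scale range shows how the error responds to the signal-to-noise ratio and the noise covariance.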



Optimal shrinkage of singular values under high-dimensional noise with separable covariance structure

We consider an optimal shrinkage algorithm that depends on an effective ...

Long Random Matrices and Tensor Unfolding

In this paper, we consider the singular values and singular vectors of l...

On the detection of low rank matrices in the high-dimensional regime

We address the detection of a low rank n × n deterministic matrix X_0 from...

Global testing under the sparse alternatives for single index models

For the single index model y=f(β^τx,ϵ) with Gaussian design, and β is a...

A random algorithm for low-rank decomposition of large-scale matrices with missing entries

A Random SubMatrix method (RSM) is proposed to calculate the low-rank de...

CDPA: Common and Distinctive Pattern Analysis between High-dimensional Datasets

A representative model in integrative analysis of two high-dimensional d...

Sample Efficient Toeplitz Covariance Estimation

We study the query complexity of estimating the covariance matrix T of a...