Spectral properties of sample covariance matrices arising from random matrices with independent non identically distributed columns

09/06/2021
by   Cosme Louart, et al.

Given a random matrix $X = (x_1, \ldots, x_n) \in \mathcal{M}_{p,n}$ with independent columns and satisfying concentration of measure hypotheses and a parameter $z$ whose distance to the spectrum of $\frac{1}{n} XX^T$ should not depend on $p,n$, it was previously shown that the functionals $\operatorname{tr}(AR(z))$, for $R(z) = (\frac{1}{n} XX^T - zI_p)^{-1}$ and $A \in \mathcal{M}_p$ deterministic, have a standard deviation of order $O(\|A\|_* / \sqrt{n})$. Here, we show that $\|\mathbb{E}[R(z)] - \tilde{R}(z)\|_F \leq O(1/\sqrt{n})$, where $\tilde{R}(z)$ is a deterministic matrix depending only on $z$ and on the means and covariances of the column vectors $x_1, \ldots, x_n$ (that do not have to be identically distributed). This estimation is key to providing accurate fluctuation rates of functionals of $X$ of interest (mostly related to its spectral properties) and is proved thanks to the introduction of a semi-metric $d_s$ defined on the set $\mathcal{D}_n(\mathbb{H})$ of diagonal matrices with complex entries and positive imaginary part and satisfying, for all $D, D' \in \mathcal{D}_n(\mathbb{H})$: $d_s(D, D') = \max_{i \in [n]} |D_i - D_i'| / (\Im(D_i)\, \Im(D_i'))^{1/2}$. Possibly most importantly, the underlying concentration of measure assumption on the columns of $X$ finds an extremely natural ground for application in modern statistical machine learning algorithms where non-linear Lipschitz mappings and high number of classes form the base ingredients.
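To make the definition of the semi-metric concrete, here is a minimal Python sketch of d_s, reading the denominator as the square root of the product of the imaginary parts of D_i and D_i'. The function name `d_s` and the choice to represent a diagonal matrix by the list of its diagonal entries are assumptions made for illustration only; they do not come from the paper.

```python
import math


def d_s(D, D_prime):
    """Semi-metric between two diagonal matrices, each given as a
    sequence of its diagonal entries (an illustrative representation).

    Computes max_i |D_i - D'_i| / sqrt(Im(D_i) * Im(D'_i)).
    """
    if len(D) != len(D_prime):
        raise ValueError("both diagonals must have the same length")
    D = [complex(w) for w in D]
    D_prime = [complex(w) for w in D_prime]
    # The definition only makes sense for entries with positive imaginary part.
    for w in D + D_prime:
        if w.imag <= 0:
            raise ValueError("every entry needs a positive imaginary part")
    return max(
        abs(a - b) / math.sqrt(a.imag * b.imag)
        for a, b in zip(D, D_prime)
    )


if __name__ == "__main__":
    print(d_s([1j], [4j]))    # 3 / sqrt(1 * 4)
    print(d_s([4j], [16j]))   # 12 / sqrt(4 * 16)
    print(d_s([1j], [16j]))   # 15 / sqrt(1 * 16)
```

With these one-entry inputs, d_s(i, 4i) = d_s(4i, 16i) = 1.5 while d_s(i, 16i) = 3.75, so the triangle inequality fails, which is consistent with d_s being described as a semi-metric rather than a metric.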
