PCA in Data-Dependent Noise (Correlated-PCA): Nearly Optimal Finite Sample Guarantees
We study Principal Component Analysis (PCA) in a setting where part of the corrupting noise is data-dependent and, as a result, the noise and the true data are correlated. Under a boundedness assumption on the true data and the noise, and a simple assumption on the data-noise correlation, we obtain a nearly optimal sample complexity bound for the most commonly used PCA solution, the singular value decomposition (SVD). This bound significantly improves on the one obtained by Vaswani and Guo in recent work (NIPS 2016), where this "correlated-PCA" problem was first studied, and it holds under a significantly weaker data-noise correlation assumption than the one used for that earlier result.
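To make the setting concrete, the following is a minimal simulation sketch, not the paper's experiment. It assumes an illustrative model that the abstract does not spell out: bounded true data l_t = P a_t lying in an r-dimensional subspace, and noise w_t = M_t l_t that is a linear function of the data, with the matrices M_t drawn independently per sample and scaled to spectral norm q. The function names, the parameter q, and the error metric are hypothetical choices made for illustration only.

```python
import numpy as np


def simulate_correlated_data(n=20, r=2, m=1000, q=0.1, seed=0):
    """Generate y_t = l_t + w_t with data-dependent noise (assumed model).

    l_t = P a_t with bounded coefficients a_t, and w_t = M_t l_t, where each
    M_t is a random matrix rescaled to spectral norm q. This is an
    illustrative assumption, not the paper's exact model.
    """
    rng = np.random.default_rng(seed)
    # Orthonormal basis of the true r-dimensional subspace.
    P, _ = np.linalg.qr(rng.standard_normal((n, r)))
    # Bounded coefficients, matching the boundedness assumption.
    A = rng.uniform(-1.0, 1.0, size=(r, m))
    L = P @ A  # true data, one column per sample
    # One noise-generating matrix per sample, each with spectral norm q.
    M = rng.standard_normal((m, n, n))
    M /= np.linalg.norm(M, ord=2, axis=(1, 2))[:, None, None]
    M *= q
    W = np.einsum("tij,jt->it", M, L)  # w_t = M_t l_t
    return P, L + W


def top_r_subspace(Y, r):
    """SVD-based PCA: the top-r left singular vectors of the data matrix."""
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    return U[:, :r]


def subspace_error(P_hat, P):
    """Spectral norm of (I - P_hat P_hat^T) P, a common subspace distance."""
    residual = P - P_hat @ (P_hat.T @ P)
    return np.linalg.norm(residual, 2)


if __name__ == "__main__":
    # Compare the recovery error at a few sample sizes.
    for m in (50, 500, 5000):
        P, Y = simulate_correlated_data(m=m)
        print(m, subspace_error(top_r_subspace(Y, 2), P))
```

Running the script prints the subspace recovery error at several sample sizes, which gives a rough empirical feel for the kind of sample complexity question the abstract addresses. Because the noise model here is an assumption, the numbers should not be read as reproducing the paper's guarantees.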