Robust and Sample Optimal Algorithms for PSD Low-Rank Approximation
Recently, Musco and Woodruff (FOCS, 2017) showed that given an n × n positive semidefinite (PSD) matrix A, it is possible to compute a relative-error (1+ϵ)-approximate low-rank approximation to A by querying O(nk/ϵ^2.5) entries of A in time O(nk/ϵ^2.5 + nk^(ω-1)/ϵ^(2(ω-1))). They also showed that any relative-error low-rank approximation algorithm must query Ω(nk/ϵ) entries of A, and closing this gap is an important open question. Our main result resolves this question: we give an algorithm that queries an optimal O(nk/ϵ) entries of A and outputs a relative-error low-rank approximation in O(n·(k/ϵ)^(ω-1)) time. Note that our running time improves on that of Musco and Woodruff, and matches the information-theoretic lower bound if the matrix-multiplication exponent ω is 2. Next, we introduce a new robust low-rank approximation model which captures PSD matrices that have been corrupted with noise. We assume that the Frobenius norm of the corruption is bounded. Here, we relax the notion of approximation to additive error, since it is information-theoretically impossible to obtain a relative-error approximation in this setting. While a sample complexity lower bound precludes sublinear algorithms for arbitrary PSD matrices, we provide the first sublinear time and query algorithms when the corruption on the diagonal entries is bounded. As a special case, we show sample-optimal sublinear time algorithms for low-rank approximation of correlation matrices corrupted by noise.
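To illustrate the sublinear-query regime the abstract describes, here is a minimal Nyström-style sketch: for a PSD matrix, sample a small set of columns with probabilities proportional to the diagonal entries and reconstruct from the sampled block. This is a simplified stand-in, not the paper's algorithm (which uses ridge leverage score sampling to reach the optimal O(nk/ϵ) query bound); the function name and sample size are illustrative assumptions.

```python
import numpy as np

def nystrom_psd_approx(A, s, seed=None):
    """Low-rank approximation of a PSD matrix A from ~n*s sampled entries.

    Samples s column indices with probability proportional to diag(A)
    (a crude proxy for the ridge leverage scores used in the paper),
    then forms the Nystrom approximation C W^+ C^T.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    d = np.clip(np.diag(A), 0, None)
    p = d / d.sum()
    idx = rng.choice(n, size=s, replace=False, p=p)
    C = A[:, idx]             # n x s block: only n*s entries of A are read
    W = A[np.ix_(idx, idx)]   # s x s intersection block
    return C @ np.linalg.pinv(W) @ C.T

# Example: a random rank-k PSD matrix is recovered (near-)exactly
# once the sampled columns span its column space.
rng = np.random.default_rng(0)
n, k = 200, 5
B = rng.standard_normal((n, k))
A = B @ B.T
A_hat = nystrom_psd_approx(A, s=3 * k, seed=1)
err = np.linalg.norm(A - A_hat, "fro") / np.linalg.norm(A, "fro")
```

For an exactly rank-k PSD matrix, any 3k generically sampled columns span the column space, so the relative Frobenius error is numerically zero; for noisy or full-rank inputs, diagonal sampling alone gives weaker guarantees than leverage-score sampling, which is the gap the paper's analysis closes.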