Sparse spectral estimation with missing and corrupted measurements
Supervised learning with missing data has been studied extensively, not least because of techniques related to low-rank matrix completion. Unsupervised learning also often relies on imputation methods. Missing values, however, induce a bias in various estimators, such as the sample covariance matrix. In the present paper, a convex method for sparse subspace estimation is extended to the case of missing and corrupted measurements. This is done by correcting the bias rather than imputing the missing values. The resulting estimator then serves as the initial value for a nonconvex procedure that improves the overall statistical performance. The methodological and theoretical frameworks are applied to a wide range of statistical problems, including sparse Principal Component Analysis with different types of randomly missing data and the estimation of eigenvectors of low-rank matrices with missing values. Finally, the statistical performance is demonstrated on synthetic data.
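To illustrate what correcting the bias can look like for the sample covariance matrix, here is a minimal sketch under assumptions that are not stated in the abstract: the data are zero-mean, each entry is observed independently with a known probability p, and missing entries are stored as zeros. In that setting the off-diagonal entries of the zero-filled sample covariance are shrunk by a factor p^2 in expectation and the diagonal entries by a factor p, so rescaling each part removes the bias. The function name `debiased_covariance` is hypothetical, and the estimator used in the paper may differ.

```python
import numpy as np


def debiased_covariance(Y, p):
    """Bias-corrected covariance from zero-filled data (illustrative sketch).

    Assumptions (not taken from the paper): rows of Y are zero-mean samples,
    each entry is observed independently with known probability p, and
    unobserved entries have been replaced by 0.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    n = Y.shape[0]
    # Sample covariance of the zero-filled data (biased when p < 1).
    S = Y.T @ Y / n
    # Separate the diagonal, which is shrunk by p, from the
    # off-diagonal part, which is shrunk by p**2.
    D = np.diag(np.diag(S))
    return (S - D) / p**2 + D / p
```

With p = 1 the function returns the ordinary sample covariance. For p < 1 the corrected matrix is unbiased under the stated assumptions but may fail to be positive semidefinite, which is one reason a downstream convex or projection step can be useful.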