Dynamic Principal Subspaces with Sparsity in High Dimensions
Principal component analysis (PCA) is a versatile tool for dimensionality reduction with wide applications in the statistics and machine learning communities. It is particularly useful for modeling data in high-dimensional scenarios where the number of variables p is comparable to, or much larger than, the sample size n. Despite the extensive literature on this topic, research has focused on modeling static principal eigenvectors or subspaces, which is unsuitable for stochastic processes that are dynamic in nature. To characterize change over the whole course of high-dimensional data collection, we propose a unified framework to estimate dynamic principal subspaces spanned by leading eigenvectors of covariance matrices. In the proposed framework, we formulate an optimization problem that combines kernel smoothing and a regularization penalty with an orthogonality constraint, which can be solved effectively by a proximal gradient method for manifold optimization. We show that our method is suitable for high-dimensional data observed under both common and irregular designs. In addition, theoretical properties of the estimators are investigated under ℓ_q (0 ≤ q ≤ 1) sparsity. Extensive experiments on both simulated and real data demonstrate the effectiveness of the proposed method.
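To make the ingredients concrete, below is a minimal Python sketch of the two pieces the abstract names: a kernel-smoothed covariance estimate at a target time, and a sparse principal subspace computed by alternating a gradient step, a soft-thresholding (ℓ_1 proximal) step, and a retraction onto the Stiefel manifold. This is a simplified illustration under assumed choices (Gaussian kernel, ℓ_1 penalty, polar retraction, fixed step size), not the paper's exact algorithm; all function names and parameters are hypothetical.

```python
import numpy as np

def smoothed_cov(X, times, t0, h):
    """Kernel-smoothed covariance at target time t0 (assumed Gaussian kernel).
    X: (n, p) observations, times: (n,) observation times, h: bandwidth."""
    w = np.exp(-0.5 * ((times - t0) / h) ** 2)  # kernel weights
    w /= w.sum()
    Xc = X - w @ X                              # center at the weighted mean
    return (Xc * w[:, None]).T @ Xc             # weighted sample covariance

def sparse_subspace(S, r, lam=0.1, step=0.01, iters=500):
    """Leading r-dimensional sparse principal subspace of S via a simplified
    proximal-gradient scheme with a polar retraction onto the Stiefel
    manifold (an illustrative variant, not the paper's method)."""
    p = S.shape[0]
    U = np.linalg.qr(np.random.default_rng(0).normal(size=(p, r)))[0]
    for _ in range(iters):
        G = -2.0 * S @ U                        # gradient of -tr(U^T S U)
        V = U - step * G                        # gradient descent step
        V = np.sign(V) * np.maximum(np.abs(V) - step * lam, 0.0)  # l1 prox
        # polar retraction: nearest matrix with orthonormal columns
        Uq, _, Vt = np.linalg.svd(V, full_matrices=False)
        U = Uq @ Vt
    return U
```

Sweeping t0 over a grid of target times and applying sparse_subspace to each smoothed covariance would trace out a dynamic subspace estimate, in the spirit of the framework described above.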