Efficient L1-Norm Principal-Component Analysis via Bit Flipping

10/06/2016
by   Panos P. Markopoulos, et al.

It was shown recently that the K L1-norm principal components (L1-PCs) of a real-valued data matrix X ∈ R^{D×N} (N data samples of D dimensions) can be exactly calculated with cost O(2^{NK}) or, when advantageous, O(N^{dK−K+1}), where d = rank(X) and K < d [1],[2]. In applications where X is large (e.g., "big" data of large N and/or "heavy" data of large d), these costs are prohibitive. In this work, we present a novel suboptimal algorithm for the calculation of the K < d L1-PCs of X with cost O(ND min{N, D} + N^2(K^4 + dK^2) + dNK^3), which is comparable to that of standard (L2-norm) PC analysis. Our theoretical and experimental studies show that the proposed algorithm calculates the exact optimal L1-PCs with high frequency and achieves a higher value of the L1-PC optimization metric than any known alternative algorithm of comparable computational cost. The superiority of the calculated L1-PCs over standard L2-PCs (singular vectors) in characterizing potentially faulty data/measurements is demonstrated with experiments on data dimensionality reduction and disease diagnosis from genomic data.
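To make the bit-flipping idea concrete, the following is a minimal sketch for the single-component case (K = 1) only. It is not the authors' algorithm, whose details are in the full text. It relies on the known equivalence that maximizing ||X^T q||_1 over unit-norm q amounts to maximizing ||X b||_2 over sign vectors b ∈ {±1}^N, with q = X b / ||X b||_2. The function names, the all-ones initialization, and the greedy "flip the single best bit" rule are assumptions made for illustration. An exhaustive O(2^N) search is included only as a reference for tiny N.

```python
from itertools import product
from math import sqrt


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def times_signs(cols, b):
    """Return X b, where cols holds the N columns (data samples) of X."""
    D = len(cols[0])
    return [sum(col[d] * sign for col, sign in zip(cols, b)) for d in range(D)]


def l1pc_bit_flipping(X, max_flips=10_000):
    """Greedy single-bit flipping for K = 1 (illustrative sketch).

    X is a list of D rows with N entries each and is assumed nonzero.
    Returns (q, b, metric) with metric = ||X^T q||_1.
    """
    cols = [list(c) for c in zip(*X)]
    N = len(cols)
    b = [1] * N                 # assumed initialization, chosen for simplicity
    s = times_signs(cols, b)    # s = X b, updated incrementally below
    for _ in range(max_flips):
        # Flipping bit n changes ||X b||^2 by 4 * (||x_n||^2 - b_n * x_n^T s).
        gains = [4.0 * (dot(x, x) - b[n] * dot(x, s)) for n, x in enumerate(cols)]
        n = max(range(N), key=gains.__getitem__)
        if gains[n] <= 1e-12:   # no single flip improves the objective
            break
        s = [s_d - 2.0 * b[n] * x_d for s_d, x_d in zip(s, cols[n])]
        b[n] = -b[n]
    norm = sqrt(dot(s, s))
    q = [v / norm for v in s]
    metric = sum(abs(dot(x, q)) for x in cols)
    return q, b, metric


def l1pc_exhaustive_value(X):
    """Optimal K = 1 metric by trying all 2^N sign vectors (tiny N only)."""
    cols = [list(c) for c in zip(*X)]
    candidates = (times_signs(cols, b) for b in product((-1, 1), repeat=len(cols)))
    return max(sqrt(dot(s, s)) for s in candidates)
```

Calling `l1pc_bit_flipping(X)` on a small matrix and comparing its metric with `l1pc_exhaustive_value(X)` shows how close the greedy result comes to the optimum. The greedy search stops at a point where no single flip helps, so it can fall short of the exhaustive value.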
