Feature Grouping and Sparse Principal Component Analysis

by Haiyan Jiang, et al.

Sparse Principal Component Analysis (SPCA) is widely used in data processing and dimension reduction; it uses the lasso to produce modified principal components with sparse loadings for better interpretability. However, sparse PCA does not consider an additional grouping structure in which the loadings share similar coefficients (i.e., feature grouping); it handles only the special group whose coefficients are all zero (i.e., feature selection). In this paper, we propose a novel method called Feature Grouping and Sparse Principal Component Analysis (FGSPCA), which allows the loadings to belong to disjoint homogeneous groups, with sparsity as a special case. The proposed FGSPCA is a subspace learning method designed to perform grouping pursuit and feature selection simultaneously by imposing a non-convex regularization with naturally adjustable sparsity and grouping effect. To solve the resulting non-convex optimization problem, we propose an alternating algorithm that incorporates difference-of-convex programming, the augmented Lagrangian method, and coordinate descent. Experimental results on real data sets show that the proposed FGSPCA benefits from the grouping effect compared with methods that lack it.
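
The abstract does not state the exact form of the non-convex regularizer, so the following is only a minimal sketch under an assumption: a truncated (capped) L1 penalty min(|x|, tau), a common choice in grouping pursuit. It shows how such a penalty can be written as a difference of two convex functions, which is what makes difference-of-convex programming applicable. The names `capped_l1`, `dc_parts`, `grouping_sparsity_penalty`, `tau`, `lam_sparse`, and `lam_group` are hypothetical and are not taken from the paper.

```python
import numpy as np
from itertools import combinations


def capped_l1(x, tau):
    """Assumed non-convex penalty: min(|x|, tau), applied elementwise."""
    return np.minimum(np.abs(x), tau)


def dc_parts(x, tau):
    """Split min(|x|, tau) into g(x) - h(x), with both g and h convex.

    g(x) = |x| and h(x) = max(|x| - tau, 0).
    """
    g = np.abs(x)
    h = np.maximum(np.abs(x) - tau, 0.0)
    return g, h


def grouping_sparsity_penalty(beta, tau, lam_sparse, lam_group):
    """Hypothetical combined penalty on one loading vector.

    The sparsity term penalizes each loading; the grouping term penalizes
    every pairwise difference, so loadings are encouraged to be either
    zero or equal to one another.
    """
    beta = np.asarray(beta, dtype=float)
    sparse_term = capped_l1(beta, tau).sum()
    diffs = np.array(
        [beta[i] - beta[j] for i, j in combinations(range(len(beta)), 2)]
    )
    group_term = capped_l1(diffs, tau).sum()
    return lam_sparse * sparse_term + lam_group * group_term
```

In a difference-of-convex scheme, the concave part -h is typically replaced by a linear approximation at the current iterate, leaving a convex subproblem that methods such as the augmented Lagrangian and coordinate descent can handle. How FGSPCA arranges these steps is not described in the abstract.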




Robust Matrix Factorization with Grouping Effect

Although many techniques have been applied to matrix factorization (MF),...

An Iteratively Re-weighted Method for Problems with Sparsity-Inducing Norms

This work aims at solving the problems with intractable sparsity-inducin...

Supervised Homogeneity Fusion: a Combinatorial Approach

Fusing regression coefficients into homogenous groups can unveil those c...

Principal component-guided sparse regression

We propose a new method for supervised learning, especially suited to wi...

Functional Principal Component Analysis and Randomized Sparse Clustering Algorithm for Medical Image Analysis

Due to advances in sensors, growing large and complex medical image data...

Scalable and Flexible Multiview MAX-VAR Canonical Correlation Analysis

Generalized canonical correlation analysis (GCCA) aims at finding latent...

Sufficient principal component regression for pattern discovery in transcriptomic data

Methods for global measurement of transcript abundance such as microarra...