Clustering with feature selection using alternating minimization, Application to computational biology

11/08/2017
by   Cyprien Gilet, et al.

This paper deals with unsupervised clustering with feature selection. The problem is to estimate both the labels and a sparse projection matrix of weights. To address this combinatorial, non-convex problem while maintaining strict control on the sparsity of the weight matrix, we propose an alternating minimization of the Frobenius norm criterion. We provide a new efficient algorithm, named K-sparse, which alternates k-means with projection-gradient minimization. The projection-gradient step is a splitting-type method with exact projection onto the ℓ^1 ball to promote sparsity. The convergence of the projection-gradient step is addressed, and a preliminary analysis of the alternating minimization is made. The Frobenius norm criterion converges as the number of iterates in Algorithm K-sparse goes to infinity. Experiments on single-cell RNA sequencing datasets show that our method significantly improves on the results of PCA k-means, spectral clustering, SIMLR, and Sparcl, and achieves a relevant selection of genes. The complexity of K-sparse is linear in the number of samples (cells), so the method scales up to large datasets.
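
To make the "exact projection on the ℓ^1 ball" step concrete, here is a minimal Python sketch of the standard sort-based Euclidean projection onto an ℓ^1 ball. The abstract does not say which projection algorithm K-sparse uses, so this is an illustration under assumptions, not the authors' implementation: the function name `project_l1_ball`, the `radius` parameter, and the choice to measure the ℓ^1 norm of a matrix as the sum of the absolute values of all its entries are all hypothetical.

```python
import numpy as np


def project_l1_ball(v, radius=1.0):
    """Euclidean projection of v onto {x : sum(|x|) <= radius}.

    Illustrative sort-based method; the name and signature are assumptions,
    not taken from the paper. A matrix is treated entrywise (flattened).
    """
    v = np.asarray(v, dtype=float)
    if radius <= 0:
        raise ValueError("radius must be positive")

    flat = v.ravel()
    abs_v = np.abs(flat)

    # Already inside the ball: the projection is the point itself.
    if abs_v.sum() <= radius:
        return v.copy()

    # Sort magnitudes in decreasing order and find the soft threshold theta.
    u = np.sort(abs_v)[::-1]
    cumsum = np.cumsum(u)
    ks = np.arange(1, u.size + 1)
    # Last (0-based) index k where u[k] > (cumsum[k] - radius) / (k + 1).
    rho = np.nonzero(u * ks > cumsum - radius)[0][-1]
    theta = (cumsum[rho] - radius) / (rho + 1)

    # Soft-thresholding shrinks every entry and sets the small ones to zero.
    projected = np.sign(flat) * np.maximum(abs_v - theta, 0.0)
    return projected.reshape(v.shape)


# Hypothetical 2x2 weight matrix projected onto the ball of radius 2.
w = np.array([[3.0, -1.0], [0.5, 0.0]])
p = project_l1_ball(w, radius=2.0)
```

The soft-thresholding at the end is what produces sparsity: entries whose magnitude falls below the threshold become exactly zero, so in a feature-selection setting a smaller radius would keep fewer features.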


research
07/19/2023

Near-Linear Time Projection onto the ℓ_1,∞ Ball; Application to Sparse Autoencoders

Looking for sparsity is nowadays crucial to speed up the training of lar...
research
12/29/2020

Sparse PCA via l_2,p-Norm Regularization for Unsupervised Feature Selection

In the field of data mining, how to deal with high-dimensional data is a...
research
06/06/2015

Classification and regression using an outer approximation projection-gradient method

This paper deals with sparse feature selection and grouping for classifi...
research
01/10/2020

Probabilistic K-means Clustering via Nonlinear Programming

K-means is a classical clustering algorithm with wide applications. Howe...
research
04/10/2019

New Computational and Statistical Aspects of Regularized Regression with Application to Rare Feature Selection and Aggregation

Prior knowledge on properties of a target model often come as discrete o...
research
05/07/2015

Fast Spectral Unmixing based on Dykstra's Alternating Projection

This paper presents a fast spectral unmixing algorithm based on Dykstra'...
research
07/01/2021

Semi-Sparsity for Smoothing Filters

In this paper, we propose an interesting semi-sparsity smoothing algorit...
