A Strongly Consistent Sparse k-means Clustering with Direct l_1 Penalization on Variable Weights

03/24/2019
by Saptarshi Chakraborty, et al.

We propose the Lasso Weighted k-means (LW-k-means) algorithm as a simple yet efficient sparse clustering procedure for high-dimensional data, where the number of features (p) can be much larger than the number of observations (n). In LW-k-means, we place a lasso-based penalty term directly on the feature weights, which incorporates feature selection into the sparse clustering framework. LW-k-means makes no distributional assumptions about the data and thus yields a non-parametric method for feature selection. We also analytically investigate the convergence of the underlying optimization procedure and establish the strong consistency of our algorithm. We test LW-k-means on several real-life and synthetic datasets. Detailed experimental analysis shows that the method is highly competitive with state-of-the-art procedures for clustering and feature selection, in terms of both clustering accuracy and computational time.
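To make the idea of an l_1 penalty on feature weights concrete, the following is a minimal sketch and not the LW-k-means update rules, which the abstract does not state. It assumes an illustrative objective chosen only for this example: with cluster labels fixed, the weights maximize sum_l w_l B_l - lam * sum_l |w_l| - (1/2) * sum_l w_l^2 over w >= 0, where B_l is the per-feature between-cluster dispersion. This gives the soft-thresholding update w_l = max(B_l - lam, 0). The function names, the farthest-first initialization, and the parameter `lam` are all hypothetical.

```python
import numpy as np

def between_cluster_dispersion(X, labels):
    """Per-feature between-cluster sum of squares, divided by n."""
    total = ((X - X.mean(axis=0)) ** 2).sum(axis=0)
    within = np.zeros(X.shape[1])
    for j in np.unique(labels):
        Xj = X[labels == j]
        within += ((Xj - Xj.mean(axis=0)) ** 2).sum(axis=0)
    return (total - within) / X.shape[0]

def farthest_first_centers(X, k):
    """Deterministic initialization: start at X[0], then repeatedly add the
    point farthest from the centers chosen so far (an assumption made here
    for reproducibility, not taken from the paper)."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    return np.array(centers, dtype=float)

def l1_weighted_kmeans(X, k, lam, n_iter=20):
    """Alternate between weighted k-means steps and a soft-thresholding
    weight update under the illustrative objective described above."""
    X = np.asarray(X, dtype=float)
    centers = farthest_first_centers(X, k)
    w = np.ones(X.shape[1])
    for _ in range(n_iter):
        # Assign each point to the nearest center under weighted squared distance.
        dist = (((X[:, None, :] - centers[None, :, :]) ** 2) * w).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Recompute centers; an empty cluster keeps its previous center.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
        # Soft-threshold the weights; features with B_l <= lam get weight 0.
        w_new = np.maximum(between_cluster_dispersion(X, labels) - lam, 0.0)
        if w_new.any():  # if lam zeroes every feature, keep the old weights
            w = w_new
    return labels, centers, w
```

On data where only a few features separate the clusters, a call such as `l1_weighted_kmeans(X, k=2, lam=1.0)` would be expected to return zero weights for the uninformative features, which is the sense in which an l_1 penalty on weights performs feature selection. The value of `lam` controls how aggressively features are dropped and would need tuning in practice.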


research
03/31/2014

Sparse K-Means with ℓ_∞/ℓ_0 Penalty for High-Dimensional Data Clustering

Sparse clustering, which aims to find a proper partition of an extremely...
research
02/07/2023

Sparse GEMINI for Joint Discriminative Clustering and Feature Selection

Feature selection in clustering is a hard task which involves simultaneo...
research
09/26/2019

CS Sparse K-means: An Algorithm for Cluster-Specific Feature Selection in High-Dimensional Clustering

Feature selection is an important and challenging task in high dimension...
research
10/29/2020

Post-selection inference with HSIC-Lasso

Detecting influential features in complex (non-linear and/or high-dimens...
research
02/23/2016

A Simple Approach to Sparse Clustering

Consider the problem of sparse clustering, where it is assumed that only...
research
02/20/2020

A Scalable Framework for Sparse Clustering Without Shrinkage

Clustering, a fundamental activity in unsupervised learning, is notoriou...
research
08/04/2020

Biconvex Clustering

Convex clustering has recently garnered increasing interest due to its a...
