A Scalable Framework for Sparse Clustering Without Shrinkage

02/20/2020
by Zhiyue Zhang, et al.

Clustering, a fundamental task in unsupervised learning, is notoriously difficult when the feature space is high-dimensional. Fortunately, in many realistic scenarios, only a handful of features are relevant in distinguishing clusters. This has motivated the development of sparse clustering techniques, which typically run k-means inside outer algorithms of high computational complexity. Current techniques also require careful tuning of shrinkage parameters, further limiting their scalability. In this paper, we propose a novel framework for sparse k-means clustering that is intuitive, simple to implement, and competitive with state-of-the-art algorithms. We show that our algorithm enjoys consistency and convergence guarantees. Our core method readily generalizes to several task-specific algorithms, such as clustering on subsets of attributes and clustering partially observed data. We showcase these contributions via simulated experiments and benchmark datasets, as well as a case study on mouse protein expression.

research
03/24/2019

A Strongly Consistent Sparse k-means Clustering with Direct l_1 Penalization on Variable Weights

We propose the Lasso Weighted k-means (LW-k-means) algorithm as a simple...
research
02/07/2023

Sparse GEMINI for Joint Discriminative Clustering and Feature Selection

Feature selection in clustering is a hard task which involves simultaneo...
research
02/23/2016

A Simple Approach to Sparse Clustering

Consider the problem of sparse clustering, where it is assumed that only...
research
03/31/2014

Sparse K-Means with ℓ_∞/ℓ_0 Penalty for High-Dimensional Data Clustering

Sparse clustering, which aims to find a proper partition of an extremely...
research
10/20/2019

Differentiable Deep Clustering with Cluster Size Constraints

Clustering is a fundamental unsupervised learning approach. Many cluster...
research
12/10/2021

Interpretable Clustering via Multi-Polytope Machines

Clustering is a popular unsupervised learning tool often used to discove...
research
08/04/2020

Biconvex Clustering

Convex clustering has recently garnered increasing interest due to its a...
