Randomly Projected Convex Clustering Model: Motivation, Realization, and Cluster Recovery Guarantees

03/29/2023
by   Ziwen Wang, et al.

In this paper, we propose a randomly projected convex clustering model for clustering a collection of n high-dimensional data points in ℝ^d with K hidden clusters. We compare it with the convex clustering model applied to the original data of dimension d. We prove that, under some mild conditions, whenever the convex clustering model perfectly recovers the cluster membership assignments, the randomly projected convex clustering model preserves this perfect recovery with embedding dimension m = O(ϵ^-2 log(n)), where 0 < ϵ < 1 is a given parameter. We further prove that the embedding dimension can be improved to O(ϵ^-2 log(K)), which is independent of the number of data points. Extensive numerical experiments demonstrate the robustness and superior performance of the randomly projected convex clustering model. The numerical results also show that the randomly projected convex clustering model can outperform the randomly projected K-means model in practice.
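As a rough illustration of the embedding dimension m = O(ϵ^-2 log(n)) mentioned above, the following sketch projects n points from ℝ^d to ℝ^m with a Gaussian random matrix and measures how much pairwise squared distances change. It is only a sketch under stated assumptions: the abstract does not specify the projection matrix or the hidden constant, so the Gaussian construction, the constant `c = 8.0`, and all function names here are hypothetical choices, not the authors' realization.

```python
import math
import numpy as np


def embedding_dimension(n_points, eps, c=8.0):
    """Return m = ceil(c * eps^-2 * log(n)). The constant c is an assumption."""
    return math.ceil(c * math.log(n_points) / eps**2)


def gaussian_random_projection(X, m, seed=0):
    """Project rows of X (n x d) to m dimensions with an i.i.d. Gaussian matrix.

    Entries are N(0, 1/m), so squared norms are preserved in expectation.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    R = rng.standard_normal((d, m)) / math.sqrt(m)
    return X @ R


def pairwise_sq_dists(X):
    """Matrix of squared Euclidean distances between the rows of X."""
    sq = np.sum(X**2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.maximum(D, 0.0)  # clip tiny negatives caused by rounding


def max_distortion(X, Y):
    """Largest relative change in squared distance over all distinct pairs."""
    iu = np.triu_indices(X.shape[0], k=1)
    ratio = pairwise_sq_dists(Y)[iu] / pairwise_sq_dists(X)[iu]
    return float(np.max(np.abs(ratio - 1.0)))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((50, 2000))      # n = 50 points in d = 2000
    m = embedding_dimension(X.shape[0], eps=0.5)
    Y = gaussian_random_projection(X, m)
    print("embedding dimension m:", m)
    print("max relative distortion:", max_distortion(X, Y))
```

The projected points `Y` would then be passed to a convex clustering solver in place of `X`; that solver is not shown here. Note that m depends on n and ϵ but not on the original dimension d, which is why the projection is attractive when d is large.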


