Subspace clustering without knowing the number of clusters: A parameter free approach

09/10/2019
by   Vishnu Menon, et al.
0

Subspace clustering, the task of clustering high dimensional data when the data points come from a union of subspaces is one of the fundamental tasks in unsupervised machine learning. Most of the existing algorithms for this task involves supplying prior information in form of a parameter, like the number of clusters, to the algorithm. In this work, a parameter free method for subspace clustering is proposed, where the data points are clustered on the basis of the difference in statistical distribution of the angles made by the data points within a subspace and those by points belonging to different subspaces. Given an initial coarse clustering, the proposed algorithm merges the clusters until a true clustering is obtained. This, unlike many existing methods, does not involve the use of an unknown parameter or tuning for one through cross validation. Also, a parameter free method for producing a coarse initial clustering is discussed, which makes the whole process of subspace clustering parameter free. The comparison of algorithm performance with the existing state of the art in synthetic and real data sets, shows the significance of the proposed method.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/19/2021

Beyond Linear Subspace Clustering: A Comparative Study of Nonlinear Manifold Clustering Algorithms

Subspace clustering is an important unsupervised clustering approach. It...
research
09/26/2013

Determinantal Clustering Processes - A Nonparametric Bayesian Approach to Kernel Based Semi-Supervised Clustering

Semi-supervised clustering is the task of clustering data points into cl...
research
12/30/2017

Particle Clustering Machine: A Dynamical System Based Approach

Identification of the clusters from an unlabeled data set is one of the ...
research
09/14/2017

Subspace Clustering using Ensembles of K-Subspaces

We present a novel approach to the subspace clustering problem that leve...
research
09/14/2019

A highly likely clusterable data model with no clusters

We propose a model for a dataset in R^D that does not contain any clust...
research
04/08/2018

Dimensionality's Blessing: Clustering Images by Underlying Distribution

Many high dimensional vector distances tend to a constant. This is typic...

Please sign up or login with your details

Forgot password? Click here to reset