Data ultrametricity and clusterability

08/28/2019
by Dan Simovici, et al.
The increasing need to cluster massive datasets and the high cost of running clustering algorithms pose difficult problems for users. In this context it is important to determine whether a dataset is clusterable, that is, whether it can be partitioned efficiently into well-differentiated groups containing similar objects. We approach data clusterability from an ultrametric-based perspective. We propose a novel approach to determining the ultrametricity of a dataset via a special type of matrix product, which allows us to evaluate the clusterability of the dataset. Furthermore, we show that applying our technique to a dissimilarity space generates the subdominant ultrametric of the dissimilarity.
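
The abstract does not spell out the matrix product, so the following is only a minimal sketch under an assumption: that the product is the min-max product, (A ⊗ B)[i, j] = min over k of max(A[i, k], B[k, j]). Under that assumption, repeatedly multiplying a symmetric dissimilarity matrix with zero diagonal by itself stabilizes at the subdominant ultrametric, which is the standard minimax-path characterization. The function names `minmax_product`, `subdominant_ultrametric`, and `is_ultrametric` are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import numpy as np


def minmax_product(a, b):
    """Min-max product (assumed form): out[i, j] = min_k max(a[i, k], b[k, j])."""
    # Broadcast to shape (n, k, m), take the elementwise max, then min over k.
    return np.maximum(a[:, :, None], b[None, :, :]).min(axis=1)


def subdominant_ultrametric(d):
    """Iterate min-max powers of d until they stop changing.

    Assumes d is a symmetric dissimilarity matrix with a zero diagonal.
    """
    d = np.asarray(d, dtype=float)
    current = d
    while True:
        nxt = minmax_product(current, d)
        # min and max only select existing entries, so exact comparison is safe.
        if np.array_equal(nxt, current):
            return current
        current = nxt


def is_ultrametric(u):
    """Check u[i, j] <= max(u[i, k], u[k, j]) for all i, j, k."""
    return bool(np.all(u <= minmax_product(u, u)))


d = np.array([[0.0, 1.0, 4.0],
              [1.0, 0.0, 2.0],
              [4.0, 2.0, 0.0]])
u = subdominant_ultrametric(d)
print(u)
print(is_ultrametric(d), is_ultrametric(u))
```

In this toy example the entry between objects 0 and 2 drops from 4 to 2, because the path through object 1 has a largest step of only 2. The broadcasted product uses O(n³) memory, so this form is suitable for small illustrations only, not for the massive datasets the abstract has in mind.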
