Clustering Based on Graph of Density Topology

09/24/2020
by Zhangyang Gao, et al.

Clustering unevenly distributed data under high levels of noise is challenging. HDBSCAN is currently regarded as the state-of-the-art (SOTA) algorithm for this problem. In this paper, we propose a novel clustering algorithm based on what we call the graph of density topology (GDT). GDT jointly considers the local and global structure of the data samples. It first forms local clusters through a density growing process, with a strategy for proper noise handling and cluster boundary detection. It then estimates a GDT from the relationships between local clusters in terms of a connectivity measure, giving a global topological graph. The connectivity, which measures the similarity between neighboring local clusters, is computed on local clusters rather than on individual points, which makes it robust even to very large noise. Evaluation results on both toy and real-world datasets show that GDT achieves SOTA performance by far on almost all the popular datasets, and has a low time complexity of O(n log n). The code is available at https://github.com/gaozhangyang/DGC.git.
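The two-stage idea in the abstract (local clusters first, then a graph over local clusters) can be illustrated with a minimal sketch. This is not the authors' implementation: the kNN-based density estimate, the "point to the densest neighbor" growing rule, the boundary-density connectivity score, and the names `local_clusters`, `merge_by_connectivity`, `k`, and `threshold` are all assumptions chosen for illustration. It also uses a brute-force O(n^2) neighbor search, so it does not reproduce the O(n log n) complexity reported above.

```python
import numpy as np

def knn(X, k):
    # Brute-force pairwise distances; O(n^2), acceptable for a toy example.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, 1:k + 1]      # column 0 is the point itself
    dist = np.take_along_axis(d, idx, axis=1)
    return idx, dist

def local_clusters(X, k=8):
    """Stage 1 (assumed rule): each point links to its densest kNN neighbor
    if that neighbor is denser; points with no denser neighbor are peaks."""
    idx, dist = knn(X, k)
    density = 1.0 / (dist.mean(axis=1) + 1e-12)  # simple kNN density proxy
    n = len(X)
    parent = np.arange(n)
    for i in range(n):
        j = idx[i][np.argmax(density[idx[i]])]
        if density[j] > density[i]:              # strictly uphill, so no cycles
            parent[i] = j
    root = parent.copy()
    while True:                                  # pointer jumping to the peak
        nxt = root[root]
        if np.array_equal(nxt, root):
            break
        root = nxt
    return root, density, idx

def merge_by_connectivity(root, density, idx, threshold=0.5):
    """Stage 2 (assumed measure): connectivity of two local clusters is the
    highest boundary density relative to the lower of the two peak densities."""
    rep = {int(p): int(p) for p in np.unique(root)}

    def find(p):
        while rep[p] != p:
            rep[p] = rep[rep[p]]
            p = rep[p]
        return p

    conn = {}
    for i in range(len(root)):
        for j in idx[i]:
            a, b = int(root[i]), int(root[j])
            if a != b:                           # kNN edge crossing a boundary
                key = (min(a, b), max(a, b))
                score = min(density[i], density[j])
                conn[key] = max(conn.get(key, 0.0), score)
    for (a, b), score in conn.items():
        if score / min(density[a], density[b]) >= threshold:
            rep[find(a)] = find(b)               # merge strongly connected pair
    return np.array([find(int(r)) for r in root])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(100, 1, (50, 2))])
    root, density, idx = local_clusters(X, k=8)
    labels = merge_by_connectivity(root, density, idx, threshold=0.5)
    print("local clusters:", len(np.unique(root)),
          "final clusters:", len(np.unique(labels)))
```

Because the connectivity score is aggregated over whole local clusters rather than taken from a single pair of points, one noisy point lying between two groups has limited influence on whether they merge, which is the intuition the abstract describes. The `threshold` value here is arbitrary and would need tuning on real data.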


