Git: Clustering Based on Graph of Intensity Topology

10/04/2021
by Zhangyang Gao, et al.

Accuracy, Robustness to noise and scale, Interpretability, Speed, and Ease of use (ARISE) are crucial requirements of a good clustering algorithm. However, achieving all of these goals simultaneously is challenging, and most advanced approaches address only some of them. To consider these aspects together, we propose a novel clustering algorithm, GIT (Clustering Based on Graph of Intensity Topology). GIT considers both local and global data structures: it first forms local clusters based on the intensity peaks of samples, and then estimates the global topological graph (topo-graph) between these local clusters. We use the Wasserstein distance between the predicted and prior class proportions to automatically cut noisy edges in the topo-graph, and merge the connected local clusters into final clusters. We compare GIT with seven competing algorithms on five synthetic datasets and nine real-world datasets. With fast local cluster detection, robust topo-graph construction, and accurate edge-cutting, GIT shows attractive ARISE performance and significantly exceeds other non-convex clustering methods. For example, GIT outperforms its counterparts by about 10% (F1-score) on MNIST and FashionMNIST. Code is available at https://github.com/gaozhangyang/GIT.
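The edge-cutting step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation (see the linked repository for that), and it rests on several assumptions that the abstract does not confirm: local clusters are already found and have known sizes, each topo-graph edge carries a single similarity weight, edges are cut with one global threshold, and the Wasserstein distance is taken between the descending-sorted proportion vectors. All function names here (`merge_clusters`, `sorted_proportions`, `wasserstein_1d`, `cut_edges`) are hypothetical.

```python
from itertools import accumulate


def merge_clusters(n_local, edges, threshold):
    """Union-find merge of local clusters joined by edges with weight >= threshold."""
    parent = list(range(n_local))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b, weight in edges:
        if weight >= threshold:
            parent[find(a)] = find(b)
    return [find(i) for i in range(n_local)]


def sorted_proportions(labels, sizes):
    """Share of samples in each merged cluster, sorted in descending order."""
    totals = {}
    for label, size in zip(labels, sizes):
        totals[label] = totals.get(label, 0) + size
    n_samples = sum(sizes)
    return sorted((t / n_samples for t in totals.values()), reverse=True)


def wasserstein_1d(p, q):
    """1-D Wasserstein distance between histograms on positions 0, 1, 2, ...

    Shorter inputs are zero-padded; the distance is the summed absolute
    difference of the two cumulative distributions.
    """
    k = max(len(p), len(q))
    p = list(p) + [0.0] * (k - len(p))
    q = list(q) + [0.0] * (k - len(q))
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def cut_edges(n_local, sizes, edges, prior):
    """Try every candidate threshold (assumed strategy) and keep the one whose
    merged-cluster proportions are closest to the prior class proportions."""
    prior = sorted(prior, reverse=True)
    candidates = sorted({w for _, _, w in edges}) + [float("inf")]
    best = None
    for threshold in candidates:
        labels = merge_clusters(n_local, edges, threshold)
        dist = wasserstein_1d(sorted_proportions(labels, sizes), prior)
        if best is None or dist < best[0]:
            best = (dist, threshold, labels)
    return best


# Toy topo-graph: four local clusters, two strong edges and one weak (noisy) edge.
sizes = [40, 10, 30, 20]
edges = [(0, 1, 0.9), (2, 3, 0.8), (1, 2, 0.1)]
dist, threshold, labels = cut_edges(4, sizes, edges, prior=[0.5, 0.5])
print(dist, threshold, labels)
```

In this toy case the weak edge between local clusters 1 and 2 is the one that gets cut, because keeping it would merge everything into a single cluster and move the predicted proportions away from the assumed 50/50 prior.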
