Clustering with Simultaneous Local and Global View of Data: A message passing based approach

03/12/2018
by   Rayyan Ahmad Khan, et al.

A good clustering algorithm should not only discover clusters of arbitrary shape (the global view) but also provide additional information that yields more meaningful insights into the internal structure of those clusters (the local view). In this work we use the mathematical framework of factor graphs and message passing algorithms to optimize a pairwise-similarity-based cost function, in the same spirit as Affinity Propagation. Using this framework we develop two variants of a new clustering algorithm, EAP and SHAPE. Both can discover clusters of arbitrary shape and also provide a rich local view in the form of meaningful local representatives (exemplars) and connections between these local exemplars. We discuss how this local information can be used to gain insights about the clusters, including varying relative cluster densities and an indication of local strength in different regions of a cluster. We also discuss how it can help an analyst discover and resolve potential inconsistencies in the results. The efficacy of EAP/SHAPE is demonstrated on various synthetic and real-world benchmark datasets.
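
For readers unfamiliar with the message passing scheme that the abstract builds on, the following is a minimal sketch of standard Affinity Propagation (the responsibility/availability updates), not of EAP or SHAPE, whose update rules are not given on this page. The function name `affinity_propagation`, the parameters `damping` and `n_iter`, the toy data, and the choice of the median similarity as the preference are all assumptions made for illustration.

```python
import numpy as np

def affinity_propagation(S, damping=0.7, n_iter=200):
    """Standard Affinity Propagation on a similarity matrix S (n x n).

    The diagonal S[k, k] holds the 'preference' of point k to be an exemplar.
    Returns (exemplar indices, label per point as an exemplar index).
    """
    n = S.shape[0]
    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)
    rows = np.arange(n)
    for _ in range(n_iter):
        # r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        idx = np.argmax(AS, axis=1)
        first = AS[rows, idx]
        AS[rows, idx] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, idx] = S[rows, idx] - second
        R = damping * R + (1 - damping) * R_new

        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) = sum_{i' != k} max(0, r(i',k))
        Rp = np.maximum(R, 0)
        Rp[rows, rows] = R[rows, rows]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[rows, rows].copy()
        A_new = np.minimum(A_new, 0)
        A_new[rows, rows] = diag
        A = damping * A + (1 - damping) * A_new

    # A point is an exemplar when its self-evidence a(k,k) + r(k,k) is positive.
    exemplars = np.flatnonzero(np.diag(A + R) > 0)
    labels = exemplars[np.argmax(S[:, exemplars], axis=1)]
    labels[exemplars] = exemplars  # each exemplar represents itself
    return exemplars, labels

# Toy data (assumed): two well-separated groups on a line.
x = np.array([0.0, 0.1, 0.2, 10.0, 10.1, 10.2])
S = -(x[:, None] - x[None, :]) ** 2           # negative squared distance
off_diag = S[~np.eye(len(x), dtype=bool)]
np.fill_diagonal(S, np.median(off_diag))      # preference = median similarity
exemplars, labels = affinity_propagation(S)
print(exemplars, labels)
```

Each point ends up assigned to a single exemplar, which is why plain Affinity Propagation tends to favor compact clusters; the abstract's contribution is to keep such local exemplars while also connecting them so that clusters of arbitrary shape can be recovered.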


research
09/17/2019

Global Optimal Path-Based Clustering Algorithm

Combinatorial optimization problems for clustering are known to be NP-ha...
research
12/27/2019

Evolutionary Clustering via Message Passing

We are often interested in clustering objects that evolve over time and ...
research
06/09/2015

Clustering by transitive propagation

We present a global optimization algorithm for clustering data given the...
research
11/14/2018

Communication-Optimal Distributed Dynamic Graph Clustering

We consider the problem of clustering graph nodes over large-scale dynam...
research
03/26/2021

Geometric Affinity Propagation for Clustering with Network Knowledge

Clustering data into meaningful subsets is a major task in scientific da...
research
10/04/2021

Git: Clustering Based on Graph of Intensity Topology

Accuracy, Robustness to noises and scales, Interpretability, Speed, and ...
research
01/23/2020

Towards Automatic Clustering Analysis using Traces of Information Gain: The InfoGuide Method

Clustering analysis has become a ubiquitous information retrieval tool i...
