Robust Unsupervised Mining of Dense Sub-Graphs at Multiple Resolutions
In traditional partitional clustering, every data point belongs to a cluster. In several applications, however, only some of the points form relatively homogeneous or “dense” groups, and points that do not appear to belong to any cluster need to be ignored. Moreover, different clusters may emerge at different scales or density levels, which makes it difficult to identify them with a single density threshold, especially if the non-clustering data must also be ignored. If the data is represented in a metric space, recent extensions of a classical approach called Hierarchical Mode Analysis (HMA) can identify clusters at multiple resolutions while ignoring “non-dense” areas. However, this approach does not apply when the relations between pairs of data points can only be represented as a (sparse) similarity or affinity graph. Motivated by two complex, real-life applications that require identifying dense subgraphs at multiple resolutions while ignoring nodes that are not well connected in the similarity graph, we introduce a novel algorithm called HIMAG (Hierarchical Incremental Mode Analysis for Graphs), which provides capabilities analogous to HMA-based methods but is applicable to graphs. We also provide a multi-resolution visualization tool customized for the new algorithm. We present results on the two motivating real-world applications and on two standard benchmark social graph datasets, and compare our approach with several standard graph partitioning algorithms that were retrofitted to produce dense clusters by pruning non-dense data in a non-trivial manner. We are also releasing the new dense graph datasets and tools to the community as open source.
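To make the multi-resolution idea concrete, the following is a minimal sketch and not the HIMAG algorithm itself, whose details the abstract does not give. It assumes a node's weighted degree can stand in for its density, and the names `dense_components` and `multi_resolution` are hypothetical. At each density level it drops nodes below the level and reports the connected components among the remaining nodes.

```python
from collections import defaultdict


def dense_components(edges, min_density):
    """Connected components among nodes whose weighted degree >= min_density.

    Weighted degree is used here as a crude density estimate; this is an
    illustrative assumption, not the density notion used by HIMAG.
    """
    density = defaultdict(float)
    for u, v, w in edges:
        density[u] += w
        density[v] += w
    dense = {node for node, d in density.items() if d >= min_density}

    # Adjacency restricted to edges whose endpoints are both dense.
    adj = defaultdict(set)
    for u, v, _ in edges:
        if u in dense and v in dense:
            adj[u].add(v)
            adj[v].add(u)

    # Depth-first search to collect components; non-dense nodes are ignored.
    seen, components = set(), []
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            comp.add(node)
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        components.append(comp)
    return components


def multi_resolution(edges, levels):
    """Map each density level to the dense components found at that level."""
    return {level: dense_components(edges, level) for level in levels}


# Toy similarity graph (made-up weights): a tight triangle, a looser
# triangle, a weak bridge between them, and a poorly connected node "g".
edges = [
    ("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
    ("d", "e", 0.5), ("e", "f", 0.5), ("d", "f", 0.5),
    ("c", "d", 0.1),
    ("f", "g", 0.1),
]
hierarchy = multi_resolution(edges, levels=[0.5, 1.5])
for level, comps in hierarchy.items():
    print(level, [sorted(c) for c in comps])
```

On this toy graph the low level merges both triangles into one group and leaves out only `g`, while the high level keeps just the tight triangle. No single threshold isolates both triangles as separate clusters, which is the difficulty the abstract describes.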