Enhanced Ensemble Clustering via Fast Propagation of Cluster-wise Similarities

10/30/2018
by Dong Huang, et al.

Ensemble clustering has been a popular research topic in data mining and machine learning. Despite significant progress in recent years, two challenging issues remain in current ensemble clustering research. First, most existing algorithms investigate the ensemble information at the object level, and often lack the ability to explore the rich information at higher levels of granularity. Second, they mostly focus on the direct connections (e.g., direct intersection or pair-wise co-occurrence) in the multiple base clusterings, and generally neglect the multi-scale indirect relationships hidden in them. To address these two issues, this paper presents a novel ensemble clustering approach based on fast propagation of cluster-wise similarities via random walks. We first construct a cluster similarity graph in which the base clusters are treated as graph nodes and the cluster-wise Jaccard coefficient is used to compute the initial edge weights. On the constructed graph, a transition probability matrix is defined, and a random walk process based on this matrix is conducted to propagate the graph structural information. Specifically, by investigating the propagating trajectories starting from different nodes, a new cluster-wise similarity matrix is derived from the relationship between trajectories. The newly obtained cluster-wise similarity matrix is then mapped from the cluster level to the object level to obtain an enhanced co-association (ECA) matrix, which simultaneously captures the object-wise co-occurrence relationship and the multi-scale cluster-wise relationship in ensembles. Finally, two novel consensus functions are proposed to obtain the consensus clustering result. Extensive experiments on a variety of real-world datasets demonstrate the effectiveness and efficiency of our approach.
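
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the trajectory length, how trajectories are compared, or the exact cluster-to-object mapping. The sketch assumes a fixed number of random walk steps (`steps`), cosine similarity between concatenated step-wise distributions, and a mapping in which two objects in the same base cluster score 1 and otherwise score the propagated similarity of their clusters. All function and parameter names are hypothetical.

```python
import numpy as np

def base_clusters(labelings):
    """Stack every cluster of every base clustering as a 0/1 membership row."""
    rows, owner = [], []
    for m, labels in enumerate(labelings):
        labels = np.asarray(labels)
        for c in np.unique(labels):
            rows.append(labels == c)
            owner.append(m)  # index of the base clustering this cluster came from
    return np.array(rows, dtype=float), np.array(owner)

def jaccard_matrix(B):
    """Cluster-wise Jaccard coefficients used as initial edge weights."""
    inter = B @ B.T
    sizes = B.sum(axis=1)
    union = sizes[:, None] + sizes[None, :] - inter
    Z = inter / union
    np.fill_diagonal(Z, 0.0)  # assumption: no self-loops in the graph
    return Z

def propagate(Z, steps=3):
    """Random walk on the cluster graph; compare nodes by their trajectories."""
    row_sums = Z.sum(axis=1, keepdims=True)
    # Row-normalize the edge weights into a transition probability matrix.
    P = np.divide(Z, row_sums, out=np.zeros_like(Z), where=row_sums > 0)
    traj, Pt = [], np.eye(len(Z))
    for _ in range(steps):
        Pt = Pt @ P          # distribution after one more step from each node
        traj.append(Pt)
    T = np.hstack(traj)      # row i = concatenated trajectory of node i
    norms = np.linalg.norm(T, axis=1, keepdims=True)
    T = np.divide(T, norms, out=np.zeros_like(T), where=norms > 0)
    return T @ T.T           # assumption: cosine similarity of trajectories

def enhanced_co_association(labelings, steps=3):
    """Map the propagated cluster-level similarities to the object level."""
    B, owner = base_clusters(labelings)
    S = propagate(jaccard_matrix(B), steps)
    np.fill_diagonal(S, 1.0)  # objects sharing a base cluster score 1
    n = B.shape[1]
    members = np.unique(owner)
    eca = np.zeros((n, n))
    for m in members:
        idx = np.where(owner == m)[0]
        Bm = B[idx]                       # clusters of base clustering m
        eca += Bm.T @ S[np.ix_(idx, idx)] @ Bm
    return eca / len(members)

# Three toy base clusterings of four objects.
labelings = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 1, 1]]
eca = enhanced_co_association(labelings)
print(np.round(eca, 3))
```

The resulting matrix can be passed to any similarity-based clustering routine to produce a consensus partition; the two consensus functions proposed in the paper are not reproduced here.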

research
06/03/2016

Robust Ensemble Clustering Using Probability Trajectories

Although many successful ensemble clustering approaches have been develo...
research
05/06/2014

Combining Multiple Clusterings via Crowd Agreement Estimation and Multi-Granularity Link Analysis

The clustering ensemble technique aims to combine multiple clusterings i...
research
08/05/2014

Determining the Number of Clusters via Iterative Consensus Clustering

We use a cluster ensemble to determine the number of clusters, k, in a g...
research
10/22/2020

Cluster-and-Conquer: When Randomness Meets Graph Locality

K-Nearest-Neighbors (KNN) graphs are central to many emblematic data min...
research
08/05/2014

A Flexible Iterative Framework for Consensus Clustering

A novel framework for consensus clustering is presented which has the ab...
research
05/12/2022

Ensemble Clustering via Co-association Matrix Self-enhancement

Ensemble clustering integrates a set of base clustering results to gener...
research
05/28/2023

Overlapping and Robust Edge-Colored Clustering in Hypergraphs

A recent trend in data mining has explored (hyper)graph clustering algor...
