Ultra-Scalable Spectral Clustering and Ensemble Clustering

03/04/2019
by Dong Huang, et al.

This paper focuses on the scalability and robustness of spectral clustering for extremely large-scale datasets with limited computing resources. Two novel algorithms are proposed, namely, ultra-scalable spectral clustering (U-SPEC) and ultra-scalable ensemble clustering (U-SENC). In U-SPEC, a hybrid representative selection strategy and a fast approximation method for K-nearest representatives are proposed for the construction of a sparse affinity sub-matrix. By interpreting the sparse sub-matrix as a bipartite graph, the transfer cut is then utilized to efficiently partition the graph and obtain the clustering result. In U-SENC, multiple U-SPEC clusterers are further integrated into an ensemble clustering framework to enhance the robustness of U-SPEC while maintaining high efficiency. Based on the ensemble generated by multiple U-SPEC clusterers, a new bipartite graph is constructed between objects and base clusters and then efficiently partitioned to achieve the consensus clustering result. It is noteworthy that both U-SPEC and U-SENC have nearly linear time and space complexity, and are capable of robustly and efficiently partitioning ten-million-level nonlinearly-separable datasets on a PC with 64GB memory. Experiments on various large-scale datasets have demonstrated the scalability and robustness of our algorithms. The MATLAB code and experimental data are available at https://www.researchgate.net/publication/330760669.
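
As a rough illustration of the first U-SPEC stage described in the abstract, the following Python sketch builds a sparse object-to-representative affinity sub-matrix. It is not the authors' MATLAB implementation: the function names, the parameters, the exact brute-force nearest-representative search (used here in place of the paper's fast approximation), and the Gaussian kernel bandwidth are all assumptions made for illustration. The transfer cut step on the resulting bipartite graph is not shown.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_representatives(points, p, n_candidates, n_iter=10, seed=0):
    """Hybrid selection sketch: draw random candidates, then run a few
    k-means iterations on the candidates only and return the p centers."""
    rng = random.Random(seed)
    candidates = rng.sample(points, min(n_candidates, len(points)))
    centers = rng.sample(candidates, min(p, len(candidates)))
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for c in candidates:
            j = min(range(len(centers)), key=lambda i: sq_dist(c, centers[i]))
            groups[j].append(c)
        # An empty group keeps its previous center.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def sparse_affinity(points, reps, k):
    """Return one dict per object mapping representative index -> affinity,
    keeping only the k nearest representatives (exact search, an assumption)."""
    k = min(k, len(reps))
    nearest = [
        sorted((sq_dist(x, r), j) for j, r in enumerate(reps))[:k]
        for x in points
    ]
    # Bandwidth: mean distance from objects to their kept representatives
    # (an assumed choice for this sketch); fall back to 1.0 if it is zero.
    total = sum(math.sqrt(d) for row in nearest for d, _ in row)
    sigma = total / (len(points) * k) or 1.0
    return [
        {j: math.exp(-d / (2 * sigma ** 2)) for d, j in row}
        for row in nearest
    ]


# Usage: two small synthetic groups of 2-D points (hypothetical data).
points = [(i * 0.1, 0.0) for i in range(10)] + [(5 + i * 0.1, 5.0) for i in range(10)]
reps = select_representatives(points, p=4, n_candidates=10)
B = sparse_affinity(points, reps, k=2)
```

Each row of `B` holds at most `k` nonzero entries, so storage grows with the number of objects times `k` rather than with the square of the number of objects, which is the property the abstract relies on for its near-linear complexity claim.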

research
06/18/2021

LSEC: Large-scale spectral ensemble clustering

Ensemble clustering is a fundamental problem in the machine learning fie...
research
04/30/2021

Divide-and-conquer based Large-Scale Spectral Clustering

Spectral clustering is one of the most popular clustering methods. Howev...
research
05/12/2023

One-step Bipartite Graph Cut: A Normalized Formulation and Its Application to Scalable Subspace Clustering

The bipartite graph structure has shown its promising ability in facilit...
research
11/22/2022

Scalable and Effective Conductance-based Graph Clustering

Conductance-based graph clustering has been recognized as a fundamental ...
research
08/28/2019

Data ultrametricity and clusterability

The increasing needs of clustering massive datasets and the high cost of...
research
07/17/2023

Snapshot Spectral Clustering – a costless approach to deep clustering ensembles generation

Despite tremendous advancements in Artificial Intelligence, learning fro...
research
02/04/2019

Self-Tuning Spectral Clustering for Adaptive Tracking Areas Design in 5G Ultra-Dense Networks

In this paper, we address the issue of automatic tracking areas (TAs) pl...
