Co-clustering for directed graphs: the Stochastic co-Blockmodel and spectral algorithm Di-Sim

04/10/2012
by Karl Rohe, et al.

Directed graphs have asymmetric connections, yet current graph clustering methodologies cannot identify the potentially global structure of these asymmetries. We give a spectral algorithm called di-sim that builds on a dual measure of similarity that corresponds to how a node (i) sends and (ii) receives edges. Using di-sim, we analyze the global asymmetries in the networks of Enron emails, political blogs, and the C. elegans neural connectome. In each example, a small subset of nodes has persistent asymmetries; these nodes send edges with one cluster, but receive edges with another cluster. Previous approaches would have assigned these asymmetric nodes to only one cluster, failing to identify their sending/receiving asymmetries. Regularization and "projection" are two steps of di-sim that are essential for spectral clustering algorithms to work in practice. The theoretical results show that these steps make the algorithm weakly consistent under the degree-corrected Stochastic co-Blockmodel, a model that generalizes the Stochastic Blockmodel to allow for both (i) degree heterogeneity and (ii) the global asymmetries that we intend to detect. The theoretical results make no assumptions on the smallest-degree nodes. Instead, the theorem requires that the average degree grow sufficiently fast, and the weak consistency guarantee applies only to the subset of nodes with sufficiently large leverage scores. The results also apply to bipartite graphs.
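The abstract names the ingredients of di-sim (a sending and a receiving similarity, regularization, and "projection") but not the exact steps. The following is a minimal sketch of one way such a spectral co-clustering could be put together, not the authors' reference implementation. Several details are assumptions made for illustration: the degree normalization with a constant `tau` added to the in- and out-degrees, the default `tau` equal to the average degree, the reading of "projection" as rescaling each row of the singular-vector matrices to unit length, and the small k-means routine. The names `di_sim_sketch`, `_project_rows`, and `_kmeans` are hypothetical.

```python
import numpy as np


def _project_rows(X):
    # "Projection" step (assumed form): rescale each row to unit length.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.where(norms > 0, norms, 1.0)


def _kmeans(X, k, n_iter, seed):
    # Plain Lloyd iterations with a farthest-point initialization.
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def di_sim_sketch(A, k, tau=None, n_iter=50, seed=0):
    """Return (sending_labels, receiving_labels) for adjacency matrix A.

    A[i, j] = 1 means node i sends an edge to node j.
    Assumes the graph has at least one edge.
    """
    A = np.asarray(A, dtype=float)
    out_deg = A.sum(axis=1)  # how much each node sends
    in_deg = A.sum(axis=0)   # how much each node receives
    if tau is None:
        tau = A.sum() / A.shape[0]  # assumed default: average degree
    # Regularized, degree-normalized adjacency matrix (assumed form).
    L = A / np.sqrt(out_deg + tau)[:, None] / np.sqrt(in_deg + tau)[None, :]
    # Left singular vectors describe sending, right ones receiving.
    U, _, Vt = np.linalg.svd(L, full_matrices=False)
    X_send = _project_rows(U[:, :k])
    X_recv = _project_rows(Vt[:k, :].T)
    return _kmeans(X_send, k, n_iter, seed), _kmeans(X_recv, k, n_iter, seed)


# Toy graph: sending and receiving memberships differ for some nodes.
send_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
recv_true = np.array([0, 0, 1, 1, 0, 0, 1, 1])
A_toy = (send_true[:, None] == recv_true[None, :]).astype(float)
send_labels, recv_labels = di_sim_sketch(A_toy, k=2)
print(send_labels, recv_labels)
```

Clustering the left and right singular vectors separately is what lets a node sit in one sending cluster and a different receiving cluster, which a single symmetric clustering of the same graph cannot express.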

research
04/25/2020

Randomized spectral co-clustering for large-scale directed networks

Directed networks are generally used to represent asymmetric relationshi...
research
09/21/2021

Consistency of spectral clustering for directed network community detection

Directed networks appear in various areas, such as biology, sociology, p...
research
10/03/2021

Fast algorithm to identify cluster synchrony through fibration symmetries in large information-processing networks

Recent studies revealed an important interplay between the detailed stru...
research
08/06/2019

Hermitian matrices for clustering directed graphs: insights and applications

Graph clustering is a basic technique in machine learning, and has wides...
research
10/04/2019

Targeted sampling from massive Blockmodel graphs with personalized PageRank

This paper provides statistical theory and intuition for Personalized Pa...
research
04/14/2023

Strong Consistency Guarantees for Clustering High-Dimensional Bipartite Graphs with the Spectral Method

In this work, we focus on the Bipartite Stochastic Block Model (BiSBM), ...
research
02/18/2020

Latent Poisson models for networks with heterogeneous density

Empirical networks are often globally sparse, with a small average numbe...
