Well-Connected Communities in Real-World and Synthetic Networks
Integral to the problem of detecting communities through graph clustering is the expectation that communities are "well connected". With this expectation in mind, we examine five community detection approaches that optimize different criteria: the Leiden algorithm optimizing the Constant Potts Model, the Leiden algorithm optimizing modularity, Iterative K-Core Clustering (IKC), Infomap, and Markov Clustering (MCL). Surprisingly, all of these methods produce, to varying extents, communities that fail even a mild requirement for well-connectedness. To remediate clusters that are not well connected, we have developed the "Connectivity Modifier" (CM), which, at the cost of coverage, iteratively removes small edge cuts and re-clusters until all communities produced are well connected. Results from real-world and synthetic networks illustrate the tradeoff users make between well-connected clusters and coverage, and raise questions about the "clusterability" of networks and about models of community structure.
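The loop described above (find a small edge cut, remove it, re-cluster, repeat) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' CM implementation: the threshold (a cluster of n nodes counts as well connected when its minimum edge cut exceeds log10(n)), the `min_size` filter, the use of connected components as the re-clustering step, and all function names are hypothetical choices made for this example. The graph is assumed to be simple (no self-loops).

```python
import math
from itertools import combinations


def min_edge_cut(nodes, edges):
    """Stoer-Wagner global minimum cut of an unweighted simple graph.

    Returns (cut_size, nodes_on_one_side). A single node has no cut,
    so its cut size is reported as infinity.
    """
    w = {u: {} for u in nodes}                 # w[u][v] = number of u-v edges
    for u, v in edges:
        w[u][v] = w[u].get(v, 0) + 1
        w[v][u] = w[v].get(u, 0) + 1
    merged = {u: {u} for u in nodes}           # supernode -> original nodes
    best_size, best_side = math.inf, set()
    while len(w) > 1:
        # One minimum-cut phase: grow a set from an arbitrary start node,
        # always adding the node most tightly attached to the set so far.
        start = next(iter(w))
        attach = {v: w[start].get(v, 0) for v in w if v != start}
        order = [start]
        while attach:
            nxt = max(attach, key=attach.get)
            phase_cut = attach.pop(nxt)
            order.append(nxt)
            for v in attach:
                attach[v] += w[nxt].get(v, 0)
        s, t = order[-2], order[-1]
        if phase_cut < best_size:
            best_size, best_side = phase_cut, set(merged[t])
        # Contract t into s before the next phase.
        merged[s] |= merged.pop(t)
        for v, weight in w.pop(t).items():
            w[v].pop(t, None)
            if v != s:
                w[s][v] = w[s].get(v, 0) + weight
                w[v][s] = w[v].get(s, 0) + weight
    return best_size, best_side


def connected_components(nodes, edges):
    """Hypothetical stand-in for re-running the original clustering method."""
    adj = {u: set() for u in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, parts = set(), []
    for root in nodes:
        if root in seen:
            continue
        comp, frontier = {root}, [root]
        while frontier:
            for nb in adj[frontier.pop()]:
                if nb not in comp:
                    comp.add(nb)
                    frontier.append(nb)
        seen |= comp
        parts.append(comp)
    return parts


def connectivity_modifier(edges, clusters, threshold=math.log10,
                          recluster=connected_components, min_size=2):
    """Split clusters along small cuts until every kept cluster passes.

    threshold, recluster and min_size are assumptions of this sketch.
    """
    accepted, pending = [], [set(c) for c in clusters]
    while pending:
        cluster = pending.pop()
        if len(cluster) < min_size:
            continue                           # dropped: the coverage cost
        inside = [(u, v) for u, v in edges if u in cluster and v in cluster]
        cut_size, side = min_edge_cut(cluster, inside)
        if cut_size > threshold(len(cluster)):
            accepted.append(cluster)           # well connected, keep as is
            continue
        # Removing the cut edges splits the cluster; re-cluster each side.
        for part in (side, cluster - side):
            part_edges = [(u, v) for u, v in inside if u in part and v in part]
            pending.extend(recluster(part, part_edges))
    return accepted


# Two 6-cliques joined by one bridge edge, initially reported as one cluster.
left, right = range(0, 6), range(6, 12)
edges = list(combinations(left, 2)) + list(combinations(right, 2)) + [(5, 6)]
result = connectivity_modifier(edges, [range(12)])
print(sorted(sorted(c) for c in result))
```

In this toy input the single 12-node cluster has a minimum cut of one edge, which does not exceed log10(12), so the sketch removes the bridge and returns the two cliques as separate clusters. Nodes that end up in pieces smaller than `min_size` are discarded, which is where the loss of coverage mentioned above would show up.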