Community detection in edge-labeled graphs

by Iiro Kumpulainen, et al.

Finding dense communities in networks is a widely used tool for analysis in graph mining. A popular choice for finding such communities is to find subgraphs with a high average degree. While useful, such subgraphs may be difficult to interpret. On the other hand, many real-world networks carry additional information, and we are specifically interested in networks that have labels on edges. In this paper, we study finding dense subgraphs that can be explained with the labels on edges. More specifically, we look for a set of labels such that the induced subgraph has a high average degree. There are many ways to induce a subgraph from a set of labels, and we study two cases. First, we study conjunctive-induced dense subgraphs, where each subgraph edge must have all of the selected labels. Second, we study disjunctive-induced dense subgraphs, where each subgraph edge must have at least one of the selected labels. We show that both problems are NP-hard. Because of this hardness, we resort to greedy heuristics. We show that the greedy search can be implemented efficiently: the running times for finding conjunctive-induced and disjunctive-induced dense subgraphs are in 𝒪(p log k) and 𝒪(p log^2 k), respectively, where p is the number of edge-label pairs and k is the number of labels. Our experimental evaluation demonstrates that we can find the ground truth in synthetic graphs and that we can find interpretable subgraphs in real-world networks.
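To make the two induction rules concrete, the following minimal Python sketch selects edges by a label set and computes the average degree of the result. It only illustrates the definitions and is not the greedy search from the paper. The names `induced_edges`, `average_degree`, and `EDGE_LABELS` are hypothetical, and so are two modeling choices: the induced subgraph is taken to contain exactly the endpoints of the selected edges, and average degree is computed as 2|E|/|V|.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Edge = Tuple[str, str]

# Hypothetical toy graph: each undirected edge maps to its set of labels.
EDGE_LABELS: Dict[Edge, FrozenSet[str]] = {
    ("a", "b"): frozenset({"x", "y"}),
    ("b", "c"): frozenset({"x", "y"}),
    ("a", "c"): frozenset({"x"}),
    ("c", "d"): frozenset({"y"}),
    ("d", "e"): frozenset({"z"}),
}


def induced_edges(
    edge_labels: Dict[Edge, FrozenSet[str]],
    labels: Iterable[str],
    mode: str,
) -> Set[Edge]:
    """Return the edges selected by a label set.

    mode="conjunctive": keep edges that have ALL of the given labels.
    mode="disjunctive": keep edges that have AT LEAST ONE of the given labels.
    """
    chosen = set(labels)
    if mode == "conjunctive":
        return {e for e, ls in edge_labels.items() if chosen <= ls}
    if mode == "disjunctive":
        return {e for e, ls in edge_labels.items() if chosen & ls}
    raise ValueError(f"unknown mode: {mode!r}")


def average_degree(edges: Set[Edge]) -> float:
    """Average degree 2|E|/|V|, with V assumed to be the endpoints of the edges."""
    nodes = {v for e in edges for v in e}
    return 2 * len(edges) / len(nodes) if nodes else 0.0


if __name__ == "__main__":
    for mode in ("conjunctive", "disjunctive"):
        edges = induced_edges(EDGE_LABELS, {"x", "y"}, mode)
        print(mode, sorted(edges), average_degree(edges))
```

On this toy graph, the label set {x, y} selects two edges under the conjunctive rule and four under the disjunctive rule, so the same labels can yield subgraphs of different density depending on the rule.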


