Mining Large Quasi-cliques with Quality Guarantees from Vertex Neighborhoods

08/18/2020
by Aritra Konar, et al.

Mining dense subgraphs is an important primitive across a spectrum of graph-mining tasks. In this work, we formally establish that two recurring characteristics of real-world graphs, namely heavy-tailed degree distributions and large clustering coefficients, imply the existence of substantially large vertex neighborhoods with high edge-density. This observation suggests a very simple approach for extracting large quasi-cliques: scan the vertex neighborhoods, compute the clustering coefficient of each vertex, and output the best such subgraph. Implementing this method requires counting the triangles in a graph, which is a well-studied problem in graph mining. When empirically tested across a number of real-world graphs, this approach reveals a surprise: vertex neighborhoods include maximal cliques of non-trivial sizes, and the density of the best neighborhood often compares favorably to that of subgraphs produced by dedicated algorithms for maximizing subgraph density. For graphs with small clustering coefficients, we demonstrate that small vertex neighborhoods can be refined using a local-search method to “grow” larger cliques and near-cliques. Our results indicate that, contrary to worst-case theoretical results, mining cliques and quasi-cliques of non-trivial sizes from real-world graphs is often not a difficult problem, and they provide motivation for further work geared towards a better explanation of these empirical successes.
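The neighborhood scan described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes an undirected simple graph given as an edge list, and it reads "best" as the neighborhood with the highest local clustering coefficient among vertices of degree at least `min_size`, with ties going to the larger neighborhood. The function names and the `min_size` parameter are hypothetical, and triangles are counted by brute force over neighbor pairs rather than with a dedicated triangle-counting algorithm.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, v):
    """Edge density of the subgraph induced by the neighbors of v."""
    nbrs = adj[v]
    d = len(nbrs)
    if d < 2:
        return 0.0
    # Each edge between two neighbors of v closes one triangle through v.
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return links / (d * (d - 1) / 2)


def best_neighborhood(adj, min_size=3):
    """Scan all vertices and return (vertex, neighborhood, density) for the
    densest neighborhood with at least min_size vertices, or None.

    Assumption: 'best' means highest density, ties broken by larger size.
    """
    best = None
    for v in adj:
        d = len(adj[v])
        if d < min_size:
            continue
        key = (local_clustering(adj, v), d)
        if best is None or key > best[0]:
            best = (key, v)
    if best is None:
        return None
    (density, _), v = best
    return v, set(adj[v]), density


if __name__ == "__main__":
    # Toy graph: a 5-clique on vertices 0..4 plus a pendant vertex 5 attached to 0.
    edges = list(combinations(range(5), 2)) + [(0, 5)]
    print(best_neighborhood(build_adjacency(edges)))
```

The brute-force pair check costs time quadratic in the degree of each vertex, so on large graphs with heavy-tailed degree distributions one would swap in a proper triangle-counting routine, as the abstract suggests.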


