An Efficient System for Subgraph Discovery

07/24/2018
by Aparna Joshi, et al.

Subgraph discovery in a single data graph (finding subsets of vertices and edges that satisfy user-specified criteria) is an essential and general graph analytics operation with a wide spectrum of applications. Depending on the criteria, subgraphs of interest may correspond to cliques of friends in social networks, interconnected entities in RDF data, or frequent patterns in protein interaction networks, to name a few. Existing systems usually examine a large number of subgraphs, employ many computers, and often produce an enormous result set of subgraphs. How can we enable fast discovery of only the most relevant subgraphs while minimizing the computational requirements? We present Nuri, a general subgraph discovery system that allows users to succinctly specify subgraphs of interest and criteria for ranking them. Given such specifications, Nuri efficiently finds the k most relevant subgraphs using only a single computer. It prioritizes (i.e., expands earlier than others) subgraphs that are more likely to expand into the desired subgraphs (prioritized subgraph expansion) and proactively discards irrelevant subgraphs from which the desired subgraphs cannot be constructed (pruning). Nuri can also efficiently store and retrieve a large number of subgraphs on disk without being limited by the size of main memory. We demonstrate, using both real and synthetic datasets, that Nuri on a single core outperforms the closest alternative, a distributed system consuming 40 times more computational resources, by more than 2 orders of magnitude for clique discovery and by 1 order of magnitude for subgraph isomorphism and pattern mining.
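To make the two ideas in the abstract concrete, the following is a minimal sketch of prioritized subgraph expansion with pruning, applied to finding the k largest maximal cliques. It is not Nuri's API or implementation: the function name `top_k_cliques`, the adjacency-dictionary input, and the upper bound used as the priority (current clique size plus the number of remaining candidate vertices) are all assumptions chosen for illustration.

```python
import heapq


def top_k_cliques(adj, k):
    """Return up to k largest maximal cliques, largest first (illustrative sketch).

    adj maps each vertex to the set of its neighbours; vertices must be orderable.
    """
    results = []   # min-heap of (size, clique): the best k found so far
    frontier = []  # max-heap on the upper bound, stored with negated keys

    # Seed with single-vertex subgraphs. Candidates are restricted to larger
    # vertices so that each clique is generated exactly once.
    for v in sorted(adj):
        cand = frozenset(u for u in adj[v] if u > v)
        heapq.heappush(frontier, (-(1 + len(cand)), (v,), cand))

    while frontier:
        neg_bound, clique, cand = heapq.heappop(frontier)
        kth_best = results[0][0] if len(results) == k else 0
        # Prioritized expansion: the frontier is ordered by bound, so once the
        # best remaining bound cannot beat the k-th result, the search stops.
        if len(results) == k and -neg_bound <= kth_best:
            break

        if not cand:
            # No larger vertex can extend the clique; it is maximal only if
            # no vertex at all is adjacent to every member.
            common = set.intersection(*(set(adj[v]) for v in clique))
            if not common:
                item = (len(clique), clique)
                if len(results) < k:
                    heapq.heappush(results, item)
                elif item[0] > results[0][0]:
                    heapq.heapreplace(results, item)
            continue

        for u in cand:
            new_cand = frozenset(w for w in cand if w > u and w in adj[u])
            new_bound = len(clique) + 1 + len(new_cand)
            # Pruning: discard subgraphs that cannot grow into a better result.
            if len(results) == k and new_bound <= results[0][0]:
                continue
            heapq.heappush(frontier, (-new_bound, clique + (u,), new_cand))

    return [c for _, c in sorted(results, key=lambda t: (-t[0], t[1]))]


# Hypothetical example graph: a 4-clique {1,2,3,4}, a triangle {5,6,7},
# and a bridging edge 4-5.
graph = {
    1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {1, 2, 3, 5},
    5: {4, 6, 7}, 6: {5, 7}, 7: {5, 6},
}
top2 = top_k_cliques(graph, 2)
```

In this sketch the bound never increases as a clique grows, which is what makes it safe to stop as soon as the best remaining bound no longer exceeds the k-th result. A real system would use criteria-specific bounds and, as the abstract notes, would spill the frontier to disk rather than hold it in memory.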


research
06/02/2019

Efficient Algorithms for Densest Subgraph Discovery

Densest subgraph discovery (DSD) is a fundamental problem in graph minin...
research
06/11/2023

GuP: Fast Subgraph Matching by Guard-based Pruning

Subgraph matching, which finds subgraphs isomorphic to a query, is the k...
research
12/17/2022

Most Probable Densest Subgraphs

Computing the densest subgraph is a primitive graph operation with criti...
research
05/23/2019

Kaleido: An Efficient Out-of-core Graph Mining System on A Single Machine

Graph mining is one of the most important categories of graph algorithms...
research
05/18/2017

Taming Near Repeat Calculation for Crime Analysis via Cohesive Subgraph Computing

Near repeat (NR) is a well known phenomenon in crime analysis assuming t...
research
09/12/2020

Discovering Interesting Subgraphs in Social Media Networks

Social media data are often modeled as heterogeneous graphs with multipl...
research
04/25/2018

Symmetric Bilinear Regression for Signal Subgraph Estimation

There is increasing interest in learning a set of small outcome-relevant...
