Peregrine: A Pattern-Aware Graph Mining System

by Kasra Jamshidi, et al.

Graph mining workloads aim to extract structural properties of a graph by exploring its subgraph structures. General-purpose graph mining systems provide a generic runtime to explore subgraph structures of interest with the help of user-defined functions that guide the overall exploration process. However, state-of-the-art graph mining systems remain largely oblivious to the shape (or pattern) of the subgraphs that they mine. This causes them to: (a) explore unnecessary subgraphs; (b) perform expensive computations on the explored subgraphs; and (c) hold intermediate partial subgraphs in memory, all of which affect their overall performance. Furthermore, their programming models are often tied to their underlying exploration strategies, which makes it difficult for domain users to express complex mining tasks. In this paper, we develop Peregrine, a pattern-aware graph mining system that directly explores the subgraphs of interest while avoiding exploration of unnecessary subgraphs and bypassing expensive computations throughout the mining process. We design a pattern-based programming model that treats "graph patterns" as first-class constructs and enables Peregrine to extract the semantics of patterns, which it uses to guide its exploration. Our evaluation shows that Peregrine outperforms state-of-the-art distributed and single-machine graph mining systems, and scales to complex mining tasks on larger graphs, while retaining simplicity and expressivity with its "pattern-first" programming approach.
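To make the idea of pattern-guided exploration concrete, the following is a minimal Python sketch, not Peregrine's actual API or algorithm. It assumes the data graph is a dictionary mapping each vertex to a set of neighbors and the pattern is a list of edges over vertices numbered `0..k-1`, with each pattern vertex after the first adjacent to at least one lower-numbered one. The function name `find_matches` and both representations are hypothetical, chosen only for illustration. The sketch extends a partial match one pattern vertex at a time and considers only data vertices adjacent to every already-matched pattern neighbor, so partial subgraphs that cannot fit the pattern are never generated.

```python
def find_matches(graph, pattern_edges, num_pattern_vertices):
    """Yield tuples m where m[i] is the data vertex matched to pattern vertex i.

    Hypothetical illustration: `graph` maps each data vertex to a set of
    neighbors; `pattern_edges` lists undirected edges between pattern
    vertices 0..num_pattern_vertices-1. Matches are edge-induced, so extra
    edges among the matched data vertices are allowed.
    """
    # Build the pattern's adjacency so exploration can be guided by its shape.
    p_adj = {u: set() for u in range(num_pattern_vertices)}
    for u, v in pattern_edges:
        p_adj[u].add(v)
        p_adj[v].add(u)

    def extend(mapping):
        k = len(mapping)  # index of the next pattern vertex to match
        if k == num_pattern_vertices:
            yield tuple(mapping)
            return
        # Data vertices already matched to pattern neighbors of vertex k.
        anchors = [mapping[j] for j in p_adj[k] if j < k]
        if anchors:
            # Only vertices adjacent to all anchors can complete the pattern.
            candidates = set.intersection(*(graph[a] for a in anchors))
        else:
            candidates = set(graph)
        for c in candidates:
            if c not in mapping:  # keep the match injective
                yield from extend(mapping + [c])

    yield from extend([])


# A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
triangle = [(0, 1), (1, 2), (0, 2)]
matches = list(find_matches(graph, triangle, 3))
unique_triangles = {frozenset(m) for m in matches}
```

This sketch reports every automorphic copy of a match (a triangle appears once per ordering of its vertices), which is why the usage above deduplicates by vertex set. That deduplication is only valid for patterns such as the triangle, where a vertex set determines the subgraph. It is the kind of redundant work the abstract describes a pattern-aware system as avoiding.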




Pattern Morphing for Efficient Graph Mining

Graph mining applications analyze the structural properties of large gra...

Kaleido: An Efficient Out-of-core Graph Mining System on A Single Machine

Graph mining is one of the most important categories of graph algorithms...

Efficient Strategies for Graph Pattern Mining Algorithms on GPUs

Graph Pattern Mining (GPM) is an important, rapidly evolving, and comput...

Kudu: An Efficient and Scalable Distributed Graph Pattern Mining Engine

This paper proposes Kudu, a general distributed execution engine with a ...

cgSpan: Closed Graph-Based Substructure Pattern Mining

gSpan is a popular algorithm for mining frequent subgraphs. cgSpan (clos...

Visual Graph Mining

In this study, we formulate the concept of "mining maximal-size frequent...

DwarvesGraph: A High-Performance Graph Mining System with Pattern Decomposition

Graph mining tasks, which focus on extracting structural information fro...
