Scalable Pattern Matching in Metadata Graphs via Constraint Checking

12/18/2019
by Tahsin Reza, et al.

Pattern matching is a fundamental tool for answering complex graph queries. Unfortunately, existing solutions have limited capabilities: they do not scale to large graphs, or they support only a restricted set of search templates or usage scenarios. We present an algorithmic pipeline that bases pattern matching on constraint checking. The key intuition is that each vertex or edge participating in a match has to meet a set of constraints implicitly specified by the search template. The proposed pipeline iterates over these constraints to eliminate all vertices and edges that do not participate in any match, reducing the background graph to a subgraph that is the union of all matches. Additional analysis, such as full match enumeration, can then be performed on this annotated, reduced graph. Furthermore, constraint checking algorithms admit a vertex-centric formulation, which makes it possible to harness existing high-performance, vertex-centric graph processing frameworks. The key contribution of this work is a design that follows the constraint checking approach for exact matching, together with its experimental evaluation. We show that the proposed technique: (i) enables highly scalable pattern matching in labeled graphs, (ii) supports arbitrary patterns with 100% precision, (iii) always selects all vertices and edges that participate in matches, thus offering 100% recall, and (iv) supports a set of popular data analytics scenarios. We implement our approach on top of HavoqGT, an open-source asynchronous graph processing framework, and demonstrate its advantages through strong and weak scaling experiments on massive-scale real-world (up to 257 billion edges) and synthetic (up to 4.4 trillion edges) labeled graphs, respectively, at scales (1,024 nodes / 36,864 cores) orders of magnitude larger than those used in the past for similar problems.
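The pruning idea described above can be illustrated with a small, single-machine sketch. It is not the authors' pipeline or HavoqGT code: it shows only a local constraint check (label agreement plus the presence of the required neighbors), repeated until nothing more can be eliminated. The function name `prune_by_local_constraints`, the dictionary-based graph representation, and the toy graph are all assumptions made for illustration. Local checks alone can leave false positives for templates with cycles or repeated labels, so this sketch does not by itself deliver the 100% precision stated in the abstract.

```python
def prune_by_local_constraints(graph, labels, template, template_labels):
    """Iteratively drop background vertices that violate local constraints.

    graph / template: dict mapping a vertex to the set of its neighbors
    (undirected). labels / template_labels: dict mapping a vertex to a label.
    Returns a dict of surviving background vertices, each mapped to the set
    of template vertices it may still match. Hypothetical helper.
    """
    # Start from label agreement: v may match q only if the labels are equal.
    candidates = {
        v: {q for q in template if template_labels[q] == labels[v]}
        for v in graph
    }
    changed = True
    while changed:  # repeat until a fixed point is reached
        changed = False
        for v in graph:
            for q in list(candidates[v]):
                # q stays a candidate for v only if every template neighbor
                # of q can still be matched by some neighbor of v.
                ok = all(
                    any(q2 in candidates[u] for u in graph[v])
                    for q2 in template[q]
                )
                if not ok:
                    candidates[v].discard(q)
                    changed = True
    # Vertices with no remaining candidates are eliminated.
    return {v: c for v, c in candidates.items() if c}


# Toy template: a path A - B - C.
template = {0: {1}, 1: {0, 2}, 2: {1}}
template_labels = {0: "A", 1: "B", 2: "C"}

# Toy background graph: a-b-c matches the path; e-d lacks a C-labeled vertex.
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}, "d": {"e"}, "e": {"d"}}
labels = {"a": "A", "b": "B", "c": "C", "d": "B", "e": "A"}

survivors = prune_by_local_constraints(graph, labels, template, template_labels)
print(sorted(survivors))
```

In the toy graph, vertex `d` is removed because it has no neighbor labeled C, and `e` is removed in turn because its only B-labeled neighbor has been eliminated. This cascade is why the check has to be iterated rather than applied once.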


