Sandslash: A Two-Level Framework for Efficient Graph Pattern Mining

11/05/2020
by Xuhao Chen, et al.

Graph pattern mining (GPM) is used in diverse application areas including social network analysis, bioinformatics, and chemical engineering. Existing GPM frameworks either provide high-level interfaces for productivity at the cost of expressiveness, or provide low-level interfaces that can express a wide variety of GPM algorithms at the cost of increased programming complexity. Moreover, existing systems lack the flexibility to explore combinations of optimizations to achieve performance competitive with hand-optimized applications. We present Sandslash, an in-memory GPM framework that uses a novel programming interface to support productive, expressive, and efficient GPM on large graphs. Sandslash provides a high-level API that needs only a specification of the GPM problem; from that specification it implements fast subgraph enumeration, provides efficient data structures, and applies high-level optimizations automatically. To achieve performance competitive with expert-optimized implementations, Sandslash also provides a low-level API that allows users to express algorithm-specific optimizations. This enables Sandslash to support both high productivity and high efficiency without losing expressiveness. We evaluate Sandslash on shared-memory machines using five GPM applications and a wide range of large real-world graphs. Experimental results demonstrate that applications written using the Sandslash high-level or low-level API outperform the state-of-the-art GPM systems AutoMine, Pangolin, and Peregrine on average by 13.8x, 7.9x, and 5.4x, respectively. We also show that these Sandslash applications outperform expert-optimized GPM implementations by 2.3x on average with less programming effort.
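
To make the idea of a GPM problem and an algorithm-specific optimization concrete, here is a minimal Python sketch of triangle counting, a common GPM task. It is not Sandslash's API, which the abstract does not show. The function name `count_triangles` and the degree-based edge orientation are illustrative assumptions, chosen as an example of the kind of hand-applied optimization a low-level interface might let a user express.

```python
from collections import defaultdict


def count_triangles(edges):
    """Count triangles in an undirected simple graph given as (u, v) pairs.

    Hypothetical helper for illustration only; not part of Sandslash.
    """
    # Build a symmetric adjacency map, ignoring self-loops and duplicate edges.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    # Rank vertices by (degree, id) and keep only edges that point from the
    # lower-ranked endpoint to the higher-ranked one. Every triangle then
    # has exactly one "lowest" vertex, so it is enumerated exactly once.
    rank = {v: (len(adj[v]), v) for v in adj}
    out = {v: {w for w in adj[v] if rank[w] > rank[v]} for v in adj}

    total = 0
    for u in out:
        for v in out[u]:
            # Common out-neighbors of u and v close a triangle u-v-w.
            total += len(out[u] & out[v])
    return total


if __name__ == "__main__":
    # Complete graph on four vertices.
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(count_triangles(k4))
```

Orienting the edges avoids counting each triangle several times and then dividing, and it shrinks the neighbor sets that are intersected. The problem statement ("count triangles") stays the same while the enumeration strategy changes, which is the separation between specification and optimization that the abstract describes.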
