ProbGraph: High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations

08/24/2022
by Maciej Besta, et al.

Important graph mining problems such as Clustering are computationally demanding. To significantly accelerate these problems, we propose ProbGraph: a graph representation that enables simple and fast approximate parallel graph mining with strong theoretical guarantees on work, depth, and result accuracy. The key idea is to represent sets of vertices using probabilistic set representations such as Bloom filters. These representations are much faster to process than the original vertex sets thanks to vectorizability and small size. We use these representations as building blocks in important parallel graph mining algorithms such as Clique Counting or Clustering. When enhanced with ProbGraph, these algorithms significantly outperform tuned parallel exact baselines (up to nearly 50x on 32 cores) while ensuring accuracy of more than 90% for many input graphs. Our novel bounds and algorithms based on probabilistic set representations with desirable statistical properties are of separate interest for the data analytics community.
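To make the key idea concrete, the following is a minimal sketch of how a Bloom filter can stand in for a vertex neighborhood, so that the number of common neighbors (the core operation in clique counting and many clustering measures) is estimated from a single bitwise AND. It is an illustration under stated assumptions, not the authors' implementation: the filter size `M_BITS`, the hash count `K_HASHES`, the BLAKE2-based hashing, and the function names are all hypothetical choices, and the size estimator is the standard Bloom filter cardinality formula, which may differ from the estimators analyzed in the paper.

```python
import hashlib
import math

M_BITS = 1024   # filter size in bits (assumed for illustration)
K_HASHES = 2    # number of hash functions (assumed for illustration)


def bloom(vertices, m=M_BITS, k=K_HASHES):
    """Build a Bloom filter for a vertex set, stored as an int bitmask."""
    bits = 0
    for v in vertices:
        for seed in range(k):
            # Derive k hash values per vertex by salting with the seed.
            digest = hashlib.blake2b(f"{seed}:{v}".encode(), digest_size=8).digest()
            bits |= 1 << (int.from_bytes(digest, "big") % m)
    return bits


def estimate_size(bits, m=M_BITS, k=K_HASHES):
    """Estimate how many elements produced a filter with this many set bits."""
    ones = bin(bits).count("1")
    if ones >= m:
        return float("inf")  # a saturated filter carries no size information
    return -(m / k) * math.log(1 - ones / m)


def estimate_common_neighbors(bf_u, bf_v, m=M_BITS, k=K_HASHES):
    """Approximate |N(u) & N(v)| from the bitwise AND of two filters.

    The AND also keeps bits that collide by chance, so this tends to
    overestimate slightly; larger filters reduce the effect.
    """
    return estimate_size(bf_u & bf_v, m, k)


if __name__ == "__main__":
    neighbors_u = set(range(0, 60))     # hypothetical neighborhood of u
    neighbors_v = set(range(40, 100))   # hypothetical neighborhood of v
    exact = len(neighbors_u & neighbors_v)
    approx = estimate_common_neighbors(bloom(neighbors_u), bloom(neighbors_v))
    print(f"exact = {exact}, estimated = {approx:.1f}")
```

The AND over fixed-size bitmasks is what makes this representation amenable to vectorization: its cost depends on the filter size rather than on the vertex degrees, at the price of an approximate result.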


