REPT: A Streaming Algorithm of Approximating Global and Local Triangle Counts in Parallel

11/22/2018
by   Pinghui Wang, et al.

Recently, considerable effort has been devoted to approximately computing the global and local (i.e., incident to each node) triangle counts of a large graph stream represented as a sequence of edges. Existing approximate triangle counting algorithms rely on sampling techniques to reduce the computational cost. However, their estimation errors are largely determined by the covariance between sampled triangles. Moreover, little attention has been paid to developing parallel one-pass streaming algorithms that can quickly approximate triangle counts on a multi-core machine or a cluster of machines. To solve these problems, we develop a novel parallel method, REPT, that significantly reduces the covariance between sampled triangles (and in some cases eliminates it completely). We theoretically prove that REPT is more accurate than directly parallelizing existing triangle count estimation algorithms. In addition, we conduct extensive experiments on a variety of real-world graphs, and the results demonstrate that REPT is several times more accurate than state-of-the-art methods.
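To make the sampling idea concrete, here is a minimal sketch of a generic one-pass edge-sampling estimator for global and local triangle counts. It is not REPT itself, whose details are in the full text. It assumes a stream of undirected edges with no duplicates, and the function name `estimate_triangles`, the sampling probability `p`, and the `seed` parameter are hypothetical names chosen for illustration. Each arriving edge closes triangles with pairs of edges already held in the sample; because both earlier edges were kept with probability `p` each, every observed triangle is weighted by `1 / p**2` to keep the estimate unbiased.

```python
import random
from collections import defaultdict


def estimate_triangles(edge_stream, p, seed=None):
    """One-pass edge-sampling estimate of global and local triangle counts.

    Illustrative sketch only (not REPT). Assumes undirected edges,
    no duplicate edges in the stream, and 0 < p <= 1.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")

    rng = random.Random(seed)
    adj = defaultdict(set)            # adjacency lists of the sampled edges
    weight = 1.0 / (p * p)            # inverse probability that both earlier edges were kept
    global_count = 0.0
    local_counts = defaultdict(float)

    for u, v in edge_stream:
        if u == v:
            continue                  # self-loops cannot form triangles

        # Triangles closed by (u, v) using two previously sampled edges.
        common = adj.get(u, set()) & adj.get(v, set())
        for w in common:
            global_count += weight
            local_counts[u] += weight
            local_counts[v] += weight
            local_counts[w] += weight

        # Keep the new edge in the sample with probability p.
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)

    return global_count, dict(local_counts)


if __name__ == "__main__":
    # Complete graph on 4 nodes, streamed edge by edge.
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(estimate_triangles(k4, p=1.0))
    print(estimate_triangles(k4, p=0.5, seed=1))
```

With `p = 1.0` every edge is kept and the estimate is the exact count. With smaller `p`, two triangles that share a sampled edge are observed or missed together, which is the source of the covariance between sampled triangles that the abstract identifies as the main driver of estimation error.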


research
02/12/2018

DiSLR: Distributed Sampling with Limited Redundancy For Triangle Counting in Graph Streams

Given a web-scale graph that grows over time, how should its edges be st...
research
10/07/2018

Graphlet Count Estimation via Convolutional Neural Networks

Graphlets are defined as k-node connected induced subgraph patterns. For...
research
09/10/2017

WRS: Waiting Room Sampling for Accurate Triangle Counting in Real Graph Streams

If we cannot store all edges in a graph stream, which edges should we st...
research
10/01/2022

A Novel Parallel Triangle Counting Algorithm with Reduced Communication

Counting and finding triangles in graphs is often used in real-world ana...
research
03/29/2020

How the Degeneracy Helps for Triangle Counting in Graph Streams

We revisit the well-studied problem of triangle count estimation in grap...
research
04/08/2020

DegreeSketch: Distributed Cardinality Sketches on Massive Graphs with Applications

We present DegreeSketch, a semi-streaming distributed sketch data struct...
research
01/06/2017

Estimation of Graphlet Statistics

Graphlets are induced subgraphs of a large network and are important for...
