WRS: Waiting Room Sampling for Accurate Triangle Counting in Real Graph Streams

09/10/2017
by Kijung Shin, et al.

If we cannot store all edges in a graph stream, which edges should we store to estimate the triangle count accurately? Counting triangles (i.e., cycles of length three) is a fundamental graph problem with many applications in social network analysis, web mining, and anomaly detection. Recently, much effort has gone into accurately estimating global and local triangle counts in streaming settings with limited space. Existing methods sample edges without considering temporal dependencies among them, yet we observe temporal locality in real dynamic graphs: future edges are more likely to form triangles with recent edges than with older edges. In this work, we propose a single-pass streaming algorithm called Waiting-Room Sampling (WRS) for global and local triangle counting. WRS exploits temporal locality by always storing the most recent edges, with which future edges are more likely to form triangles, in a waiting room, while it uses reservoir sampling for the remaining edges. Our theoretical and empirical analyses show that WRS is: (a) Fast and 'any time': runs in linear time, always maintaining and updating estimates while new edges arrive, (b) Effective: yields up to 47% smaller estimation error than its best competitors, and (c) Theoretically sound: gives unbiased estimates with small variances under the temporal locality.
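
The idea of a waiting room combined with a reservoir can be illustrated with a minimal Python sketch for the global count only. This is not the authors' implementation: the class name `WaitingRoomSampler` and the parameters `budget` and `alpha` are hypothetical, the stream is assumed to be insert-only with no duplicate edges, and the inclusion probabilities used to reweight each discovered triangle are the standard ones for reservoir sampling, which the abstract itself does not spell out.

```python
import random
from collections import defaultdict, deque


class WaitingRoomSampler:
    """Illustrative sketch: FIFO waiting room plus reservoir for older edges."""

    def __init__(self, budget, alpha=0.1, seed=None):
        if budget < 2 or not 0 < alpha < 1:
            raise ValueError("need budget >= 2 and 0 < alpha < 1")
        self.wr_cap = max(1, int(budget * alpha))  # waiting-room slots
        self.res_cap = budget - self.wr_cap        # reservoir slots
        self.waiting = deque()                     # most recent edges, FIFO
        self.reservoir = []                        # uniform sample of older edges
        self.origin = {}                           # edge -> "W" or "R"
        self.adj = defaultdict(set)                # adjacency of stored edges
        self.evicted = 0                           # edges that left the waiting room
        self.estimate = 0.0                        # running global triangle estimate
        self.rng = random.Random(seed)

    @staticmethod
    def _key(u, v):
        return (u, v) if u <= v else (v, u)

    def _pair_prob(self, e1, e2):
        # Probability that both stored edges are currently in memory (assumed
        # standard reservoir-sampling inclusion probabilities).
        in_res = (self.origin[e1] == "R") + (self.origin[e2] == "R")
        n, k = self.evicted, self.res_cap
        if in_res == 0 or n <= k:
            return 1.0
        p = k / n
        if in_res == 2:
            p *= (k - 1) / (n - 1)
        return p

    def _drop(self, edge):
        u, v = edge
        del self.origin[edge]
        self.adj[u].discard(v)
        self.adj[v].discard(u)

    def process(self, u, v):
        if u == v:
            return
        # 1) Count triangles closed by the new edge, reweighted by 1/p.
        for w in self.adj[u] & self.adj[v]:
            p = self._pair_prob(self._key(u, w), self._key(v, w))
            self.estimate += 1.0 / p
        # 2) The new edge always enters the waiting room.
        edge = self._key(u, v)
        self.waiting.append(edge)
        self.origin[edge] = "W"
        self.adj[u].add(v)
        self.adj[v].add(u)
        # 3) The oldest waiting edge, if pushed out, goes through reservoir sampling.
        if len(self.waiting) > self.wr_cap:
            old = self.waiting.popleft()
            self.evicted += 1
            if len(self.reservoir) < self.res_cap:
                self.reservoir.append(old)
                self.origin[old] = "R"
            else:
                j = self.rng.randrange(self.evicted)
                if j < self.res_cap:
                    self._drop(self.reservoir[j])
                    self.reservoir[j] = old
                    self.origin[old] = "R"
                else:
                    self._drop(old)


sampler = WaitingRoomSampler(budget=100, alpha=0.1, seed=0)
for a, b in [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]:
    sampler.process(a, b)
print(sampler.estimate)  # exact here, since every edge fits in memory
```

In this sketch `alpha` sets the share of the memory budget reserved for the waiting room. When the stream is shorter than the budget, every edge is retained and the estimate equals the true count; the estimate only becomes random once the reservoir starts discarding edges.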

research
02/12/2018

DiSLR: Distributed Sampling with Limited Redundancy For Triangle Counting in Graph Streams

Given a web-scale graph that grows over time, how should its edges be st...
research
06/05/2021

Faster and Generalized Temporal Triangle Counting, via Degeneracy Ordering

Triangle counting is a fundamental technique in network analysis, that h...
research
11/22/2018

REPT: A Streaming Algorithm of Approximating Global and Local Triangle Counts in Parallel

Recently, considerable efforts have been devoted to approximately comput...
research
06/22/2020

How to Count Triangles, without Seeing the Whole Graph

Triangle counting is a fundamental problem in the analysis of large grap...
research
11/13/2022

Reinforcement Learning Enhanced Weighted Sampling for Accurate Subgraph Counting on Fully Dynamic Graph Streams

As the popularity of graph data increases, there is a growing need to co...
research
08/19/2021

odeN: Simultaneous Approximation of Multiple Motif Counts in Large Temporal Networks

Counting the number of occurrences of small connected subgraphs, called ...
research
07/26/2021

TriPoll: Computing Surveys of Triangles in Massive-Scale Temporal Graphs with Metadata

Understanding the higher-order interactions within network data is a key...
