Triangle and Four Cycle Counting with Predictions in Graph Streams

03/17/2022
by Justin Y. Chen, et al.

We propose data-driven one-pass streaming algorithms for estimating the number of triangles and four cycles, two fundamental problems in graph analytics that are widely studied in the graph data stream literature. Recently, Hsu (2018) and Jiang (2020) applied machine learning techniques to other data stream problems, using a trained oracle that predicts certain properties of the stream elements to improve on prior "classical" algorithms that did not use oracles. In this paper, we explore the power of a "heavy edge" oracle in multiple graph edge streaming models. In the adjacency list model, we present a one-pass triangle counting algorithm that improves on the previous space upper bounds obtained without such an oracle. In the arbitrary order model, we present algorithms for both triangle and four cycle estimation that use fewer passes and the same space as previous algorithms, and we show that several of these bounds are optimal. We analyze our algorithms under several noise models, showing that they perform well even when the oracle errs. Our methodology expands upon prior work on "classical" streaming algorithms: previous multi-pass and random order streaming algorithms can be seen as special cases of our algorithms in which the first pass or the random order was used to implement the heavy edge oracle. Lastly, our experiments demonstrate advantages of the proposed method over state-of-the-art streaming algorithms.
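To make the idea of a "heavy edge" oracle concrete, the following is a minimal sketch of a one-pass sampling estimator for triangles. It is not the algorithm from the paper. It only illustrates one generic way such an oracle could be used: edges the oracle flags as heavy are always stored, the remaining edges are stored with probability p, and each triangle found in the stored subgraph is reweighted by the inverse of its survival probability. The function name `estimate_triangles`, the predicate `is_heavy`, and the parameter `p` are hypothetical, and the sketch assumes a simple graph with integer vertex labels and no repeated edges.

```python
import random


def estimate_triangles(edge_stream, is_heavy, p, seed=None):
    """Illustrative one-pass triangle estimator (hypothetical, not the paper's method).

    edge_stream: iterable of (u, v) pairs with comparable vertex labels, no repeats.
    is_heavy:    assumed oracle; returns True for edges predicted to be "heavy".
    p:           sampling probability for edges the oracle does not flag (0 < p <= 1).
    """
    rng = random.Random(seed)
    kept = {}  # stored edge (u, v) with u < v  ->  True if flagged heavy
    adj = {}   # adjacency sets of the stored subgraph

    # Single pass: always keep heavy edges, subsample the rest.
    for u, v in edge_stream:
        e = (min(u, v), max(u, v))
        heavy = is_heavy(e)
        if heavy or rng.random() < p:
            kept[e] = heavy
            adj.setdefault(e[0], set()).add(e[1])
            adj.setdefault(e[1], set()).add(e[0])

    # Post-processing: a triangle with k light edges survives with
    # probability p**k, so it contributes p**(-k) to the estimate.
    estimate = 0.0
    for (u, v) in kept:
        for w in adj[u] & adj[v]:
            if w > v:  # enforce u < v < w so each triangle is counted once
                edges = [(u, v), (u, w), (v, w)]
                light = sum(1 for e in edges if not kept[e])
                estimate += p ** (-light)
    return estimate


# Usage: the complete graph on 4 vertices has exactly 4 triangles.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
exact = estimate_triangles(k4, is_heavy=lambda e: False, p=1.0)
```

With p = 1.0 every edge is stored and the estimate equals the exact count. With smaller p the estimate is unbiased under the stated assumptions, and always storing the edges flagged as heavy removes their contribution to the variance, which is the intuition the sketch is meant to convey.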
