Reinforcement Learning Enhanced Weighted Sampling for Accurate Subgraph Counting on Fully Dynamic Graph Streams

11/13/2022
by Kaixin Wang, et al.

As graph data grows in popularity, a variety of applications need to count the occurrences of subgraph patterns of interest. Many graphs are massive in scale and also fully dynamic (with both insertions and deletions of edges), which makes exact computation of these counts infeasible. The common practice is instead to estimate the counts from a small sample of edges. Existing sampling algorithms for fully dynamic graphs sample edges with uniform probability. In this paper, we show that we can do much better by sampling edges based on their individual properties. Specifically, we propose WSD, a weighted sampling algorithm for estimating the subgraph count in a fully dynamic graph stream. WSD samples edges according to weights that indicate their importance and reflect their properties. We determine the edge weights in a data-driven fashion, using a novel method based on reinforcement learning. Extensive experiments verify that our technique produces estimates with smaller errors than existing algorithms, while often running faster.
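To make the uniform-sampling baseline mentioned in the abstract concrete, here is a minimal sketch. It is not WSD and not taken from the paper: it assumes a static, insert-only edge list, keeps each edge independently with the same probability `p`, counts triangles in the sample, and rescales by `1 / p**3` because a triangle survives only if all three of its edges are kept. The function names `count_triangles` and `estimate_triangles` are hypothetical.

```python
import random
from itertools import combinations


def count_triangles(edges):
    """Exact triangle count of an undirected simple graph given as a list of edges."""
    # Normalise edges so (u, v) and (v, u) are the same edge; drop self-loops.
    unique = {(min(u, v), max(u, v)) for u, v in edges if u != v}
    adj = {}
    for u, v in unique:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Every triangle is found once per edge, i.e. three times in total.
    return sum(len(adj[u] & adj[v]) for u, v in unique) // 3


def estimate_triangles(edges, p, rng):
    """Uniform-sampling estimate (assumption: static graph, no deletions)."""
    sample = [e for e in edges if rng.random() < p]
    # A triangle is fully present in the sample with probability p**3.
    return count_triangles(sample) / p**3


if __name__ == "__main__":
    k6 = list(combinations(range(6), 2))  # complete graph on 6 nodes
    rng = random.Random(0)
    print("exact:", count_triangles(k6))
    print("estimate:", estimate_triangles(k6, 0.5, rng))
```

A single estimate from a small graph can be far from the true count; the estimator is only correct on average. The abstract's argument is that replacing this one shared probability with per-edge weights can reduce that error.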

research
02/23/2018

Estimating Graphlet Statistics via Lifting

Exploratory analysis over network data is often limited by our ability t...
research
10/01/2018

A sampling framework for counting temporal motifs

Pattern counting in graphs is fundamental to network science tasks, and ...
research
02/12/2018

DiSLR: Distributed Sampling with Limited Redundancy For Triangle Counting in Graph Streams

Given a web-scale graph that grows over time, how should its edges be st...
research
09/10/2017

WRS: Waiting Room Sampling for Accurate Triangle Counting in Real Graph Streams

If we cannot store all edges in a graph stream, which edges should we st...
research
06/04/2019

Motivo: fast motif counting via succinct color coding and adaptive sampling

The randomized technique of color coding is behind state-of-the-art algo...
research
12/22/2017

Estimating Node Similarity by Sampling Streaming Bipartite Graphs

Bipartite graph data increasingly occurs as a stream of edges that repre...
research
09/04/2020

Faster motif counting via succinct color coding and adaptive sampling

We address the problem of computing the distribution of induced connecte...
