Estimating Graphlet Statistics via Lifting

02/23/2018
by Kirill Paramonov, et al.

Exploratory analysis of network data is often limited by our ability to efficiently calculate graph statistics, which can provide a model-free understanding of the macroscopic properties of a network. This work introduces a framework for estimating the graphlet count, that is, the number of occurrences of a small subgraph motif (e.g. a wedge or a triangle) in the network. For massive graphs, where accessing the whole graph is not possible, the only viable algorithms are those that act locally by making a limited number of vertex neighborhood queries. We introduce a Monte Carlo sampling technique for graphlet counts, called lifting, which can simultaneously sample all graphlets of size up to k vertices. We outline three variants of lifted graphlet counts: the ordered, unordered, and shotgun estimators. We prove that our graphlet count updates are unbiased for the true graphlet count, have low correlation between samples, and have a controlled variance. We compare the experimental performance of lifted graphlet counts to the state-of-the-art graphlet sampling procedures: Waddling and the pairwise subgraph random walk.
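To make the idea of locally sampled, unbiased graphlet counts concrete, the following is a minimal Python sketch for graphlets on 3 vertices (wedges and triangles). It is an illustration under stated assumptions, not the authors' implementation: the function names (`lift_sample`, `sampling_probability`, `estimate_counts`) are hypothetical, the graph is assumed to be a small in-memory adjacency dictionary with at least one edge, the starting vertex is drawn exactly in proportion to its degree (a massive-graph setting would have to approximate this, for example with a random walk), and the samples are reweighted with a standard inverse-probability (Horvitz-Thompson style) correction summed over all vertex orderings.

```python
import random
from collections import Counter
from itertools import combinations, permutations


def lift_sample(adj, rng):
    """Grow one connected 3-vertex subgraph; return its vertices or None."""
    vertices = list(adj)
    # Start vertex drawn with probability proportional to its degree
    # (assumes the graph has at least one edge).
    v0 = rng.choices(vertices, weights=[len(adj[v]) for v in vertices])[0]
    # First lift: a uniformly random neighbor of v0.
    v1 = rng.choice(list(adj[v0]))
    # Second lift: a uniformly random edge leaving {v0, v1}.
    boundary = [w for u in (v0, v1) for w in adj[u] if w not in (v0, v1)]
    if not boundary:
        return None  # isolated edge: nothing to lift to
    return (v0, v1, rng.choice(boundary))


def sampling_probability(adj, nodes, m):
    """Probability that lift_sample returns this vertex set, in any order."""
    prob = 0.0
    for u, v, w in permutations(nodes):
        if v not in adj[u]:
            continue  # the first two vertices must share an edge
        links = (w in adj[u]) + (w in adj[v])  # edges from {u, v} to w
        prob += links / (2 * m * (len(adj[u]) + len(adj[v]) - 2))
    return prob


def estimate_counts(adj, n_samples, seed=0):
    """Average inverse-probability weights to estimate wedge/triangle counts."""
    rng = random.Random(seed)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # number of edges
    totals = Counter()
    for _ in range(n_samples):
        nodes = lift_sample(adj, rng)
        if nodes is None:
            continue  # failed lifts contribute zero
        n_edges = sum(1 for a, b in combinations(nodes, 2) if b in adj[a])
        kind = "triangle" if n_edges == 3 else "wedge"
        totals[kind] += 1.0 / sampling_probability(adj, nodes, m)
    return {kind: totals[kind] / n_samples for kind in ("wedge", "triangle")}


if __name__ == "__main__":
    # Complete graph on 4 vertices: 4 triangles and no induced wedges.
    k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
    print(estimate_counts(k4, n_samples=100))
```

Each sample touches only the neighborhoods of the vertices it visits, which is the locality property the abstract emphasizes. Here "wedge" means an induced path on three vertices, so a triangle is not also counted as three wedges. On highly symmetric graphs such as the complete graph above, every sample has the same sampling probability and the estimate is exact; on general graphs the estimate is unbiased but varies from run to run.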

research
06/28/2020

Central limit theorems for local network statistics

Subgraph counts - in particular the number of occurrences of small shape...
research
11/13/2022

Reinforcement Learning Enhanced Weighted Sampling for Accurate Subgraph Counting on Fully Dynamic Graph Streams

As the popularity of graph data increases, there is a growing need to co...
research
06/22/2020

How to Count Triangles, without Seeing the Whole Graph

Triangle counting is a fundamental problem in the analysis of large grap...
research
02/21/2018

Counting Motifs with Graph Sampling

Applied researchers often construct a network from a random sample of no...
research
04/08/2020

DegreeSketch: Distributed Cardinality Sketches on Massive Graphs with Applications

We present DegreeSketch, a semi-streaming distributed sketch data struct...
research
06/29/2020

Higher-order fluctuations in dense random graph models

Our main results are quantitative bounds in the multivariate normal appr...
research
01/06/2017

Estimation of Graphlet Statistics

Graphlets are induced subgraphs of a large network and are important for...
