A Simple Sublinear-Time Algorithm for Counting Arbitrary Subgraphs via Edge Sampling

11/19/2018
by Sepehr Assadi, et al.

In the subgraph counting problem, we are given an input graph G(V, E) and a target graph H; the goal is to estimate the number of occurrences of H in G. Our focus here is on designing sublinear-time algorithms for approximately counting occurrences of H in G in the setting where the algorithm is given query access to G. This problem has been studied in several recent papers, which primarily focused on specific families of graphs H such as triangles, cliques, and stars. However, not much is known about approximate counting of arbitrary graphs H. This is in sharp contrast to the closely related subgraph enumeration problem, which has received significant attention in the database community as the database join problem. The AGM bound shows that the maximum number of occurrences of any subgraph H in a graph G with m edges is O(m^ρ(H)), where ρ(H) is the fractional edge-cover number of H, and enumeration algorithms with matching runtime are known for any H. We bridge this gap between subgraph counting and subgraph enumeration by designing a sublinear-time algorithm that can estimate the number of occurrences of any subgraph H in G, denoted by #H, to within a (1±ϵ)-approximation w.h.p. in O(m^ρ(H)/#H) · poly(log n, 1/ϵ) time. Our algorithm is allowed the standard set of queries for general graphs, namely degree queries, pair queries, and neighbor queries, plus an additional edge-sample query that returns an edge chosen uniformly at random. The performance of our algorithm matches that of Eden et al. [FOCS 2015, STOC 2018] for counting triangles and cliques, and extends these results to all choices of subgraph H under the additional assumption of edge-sample queries. We further show that our algorithm works for the more general database join size estimation problem and prove a matching lower bound for this problem.
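To make the query model concrete, the following is a minimal sketch of an edge-sampling estimator for the special case where H is a triangle. It is not the algorithm from the paper: it is a basic unbiased estimator with no variance control or stopping rule, and the names `QueryGraph`, `estimate_triangles`, and their parameters are hypothetical. It assumes a simple undirected graph with comparable vertex labels, and it only touches the graph through degree, neighbor, pair, and edge-sample queries.

```python
import random


class QueryGraph:
    """Hypothetical in-memory graph exposing only the four query types."""

    def __init__(self, edges):
        self.edges = [tuple(e) for e in edges]
        self.adj = {}
        for u, v in self.edges:
            self.adj.setdefault(u, []).append(v)
            self.adj.setdefault(v, []).append(u)
        self.adj_sets = {u: set(ns) for u, ns in self.adj.items()}

    def degree(self, u):
        # Degree query.
        return len(self.adj[u])

    def neighbor(self, u, i):
        # Neighbor query: the i-th neighbor of u (0-indexed).
        return self.adj[u][i]

    def pair(self, u, v):
        # Pair query: is {u, v} an edge?
        return v in self.adj_sets[u]

    def sample_edge(self, rng):
        # Edge-sample query: an edge chosen uniformly at random.
        return rng.choice(self.edges)


def estimate_triangles(g, m, samples, rng):
    """Average an unbiased per-sample estimate of the triangle count.

    For a sampled edge (u, v), drawing a uniform neighbor w of u and
    scoring deg(u) when {v, w} is an edge has expectation equal to the
    number of triangles containing (u, v). Each triangle has 3 edges,
    so the mean score times m / 3 estimates the triangle count.
    """
    total = 0
    for _ in range(samples):
        u, v = g.sample_edge(rng)
        # Work from the lower-degree endpoint (ties broken by label);
        # this does not affect unbiasedness, only the variance.
        if (g.degree(v), v) < (g.degree(u), u):
            u, v = v, u
        w = g.neighbor(u, rng.randrange(g.degree(u)))
        if w != v and g.pair(v, w):
            total += g.degree(u)
    return m * total / (3 * samples)
```

As a usage example, on the complete graph with 6 vertices (15 edges, 20 triangles), calling `estimate_triangles(g, m=15, samples=20000, rng=random.Random(0))` should return a value concentrated near 20, while on a triangle-free graph such as a 4-cycle every sample scores zero. The number of samples is fixed by the caller here; choosing it adaptively is where the real difficulty of the problem lies.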


research
05/04/2020

Sampling Arbitrary Subgraphs Exactly Uniformly in Sublinear Time

We present a simple sublinear-time algorithm for sampling an arbitrary s...
research
09/16/2022

Asymptotically Optimal Bounds for Estimating H-Index in Sublinear Time with Applications to Subgraph Counting

The h-index is a metric used to measure the impact of a user in a public...
research
08/15/2018

The Sketching Complexity of Graph and Hypergraph Counting

Subgraph counting is a fundamental primitive in graph processing, with a...
research
08/26/2023

Spanning Adjacency Oracles in Sublinear Time

Suppose we are given an n-node, m-edge input graph G, and the goal is to...
research
02/13/2019

Counting Answers to Existential Questions

Conjunctive queries select and are expected to return certain tuples fro...
research
12/15/2022

TED: Towards Discovering Top-k Edge-Diversified Patterns in a Graph Database

With an exponentially growing number of graphs from disparate repositori...
research
07/14/2021

Towards a Decomposition-Optimal Algorithm for Counting and Sampling Arbitrary Motifs in Sublinear Time

We consider the problem of sampling and approximately counting an arbitr...
