Streaming Frequent Items with Timestamps and Detecting Large Neighborhoods in Graph Streams

by Christian Konrad, et al.

Detecting frequent items is a fundamental problem in data streaming research. In many applications, however, metadata must be reported alongside the frequent items themselves, such as the timestamps of when the frequent items appeared or other application-specific data that "arrives" with them. To this end, we introduce the Neighborhood Detection problem in graph streams, which both accurately models such situations and addresses the fundamental problem of detecting large neighborhoods, or stars, in graph streams. In Neighborhood Detection, an algorithm receives the edges of a bipartite graph G = (A, B, E) with |A| = n and |B| = poly(n) in arbitrary order and is given a threshold parameter d. Provided that there is at least one A-node of degree at least d, the objective is to output a node a ∈ A together with at least d/c of its neighbors, where c is the approximation factor. We show that in insertion-only streams, there is a one-pass c-approximation streaming algorithm with space Õ(n + n^{1/c} d), for integral values of c > 2. We complement this result with a lower bound, showing that computing a (c/1.01)-approximation requires space Ω(n/c^2 + n^{1/(c-1)} d/c^2), for any integral c > 2, which renders our algorithm optimal for a large range of settings (up to logarithmic factors). In insertion-deletion (turnstile) streams, we give a one-pass c-approximation algorithm with space Õ(dn/c^2) (if c < √n). We also prove that this is best possible up to logarithmic factors. Our lower bounds are obtained by defining new multi-party and two-party communication problems, respectively, and proving lower bounds on their communication complexities using information-theoretic arguments.
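To make the problem statement concrete, here is a minimal Python sketch of a naive baseline for the insertion-only setting. It is not the algorithm from the paper: it simply keeps up to ⌈d/c⌉ neighbors for every A-node, which can cost on the order of n·d/c stored neighbors in the worst case rather than the Õ(n + n^{1/c} d) bound stated above. The function name `naive_neighborhood_detection` and the representation of the stream as (a, b) pairs are assumptions made for illustration.

```python
import math
from collections import defaultdict


def naive_neighborhood_detection(edges, d, c):
    """Naive baseline (NOT the paper's algorithm), insertion-only streams.

    edges: iterable of (a, b) pairs, a in A and b in B (assumed encoding).
    d:     degree threshold, assumed >= 1.
    c:     approximation factor, assumed >= 1.

    Returns (a, neighbors) where neighbors holds at least d/c distinct
    neighbors of a, or None if no A-node collects that many.
    """
    target = math.ceil(d / c)      # number of neighbors we must report
    kept = defaultdict(set)        # A-node -> neighbors stored so far

    for a, b in edges:
        if len(kept[a]) < target:
            kept[a].add(b)         # a set ignores repeated edges
            if len(kept[a]) >= target:
                return a, sorted(kept[a])
    return None


# Hypothetical stream: a1 has degree 4, so with d = 4 and c = 2
# any 2 of its neighbors form a valid output.
stream = [("a1", "b1"), ("a2", "b1"), ("a1", "b2"), ("a1", "b3"), ("a1", "b4")]
result = naive_neighborhood_detection(stream, d=4, c=2)
```

If some A-node has degree at least d, it has at least ⌈d/c⌉ distinct neighbors, so the baseline always returns a valid answer in that case. The point of the results above is that the same guarantee can be achieved with far less memory.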




Optimal Lower Bounds for Matching and Vertex Cover in Dynamic Graph Streams

In this paper, we give simple optimal lower bounds on the one-way two-pa...

Two Player Hidden Pointer Chasing and Multi-Pass Lower Bounds in Turnstile Streams

The authors have withdrawn this paper due to an error in the proof of Le...

Data Streams with Bounded Deletions

Two prevalent models in the data stream literature are the insertion-onl...

SpaceSaving^±: An Optimal Algorithm for Frequency Estimation and Frequent items in the Bounded Deletion Model

In this paper, we propose the first deterministic algorithms to solve th...

Deterministic Graph Coloring in the Streaming Model

Recent breakthroughs in graph streaming have led to the design of single...

An Asymptotically Optimal Algorithm for Maximum Matching in Dynamic Streams

We present an algorithm for the maximum matching problem in dynamic (ins...

Mining frequent items in unstructured P2P networks

Large scale decentralized systems, such as P2P, sensor or IoT device net...