Optimal streaming and tracking distinct elements with high probability

04/05/2018
by Jarosław Błasiok, et al.

The distinct elements problem is one of the fundamental problems in streaming algorithms: given a stream of integers in the range {1,...,n}, we wish to provide a (1+ε)-approximation to the number of distinct elements in the input. After a long line of research, an optimal solution for this problem with constant probability of success, using O(1/ε^2 + log n) bits of space, was given by Kane, Nelson and Woodruff in 2010. The standard approach to achieving a low failure probability δ is to run log δ^-1 parallel repetitions of the original algorithm and report the median of the computed answers. We show that such a multiplicative space blow-up is unnecessary: we provide an optimal algorithm using O(log(δ^-1)/ε^2 + log n) bits of space, matching known lower bounds for this problem. That is, the log δ^-1 factor does not multiply the log n term. This completely settles the space complexity of the distinct elements problem with respect to all standard parameters. We also consider the strong tracking (or continuous monitoring) variant of the distinct elements problem, where we want an algorithm that provides an approximation of the number of distinct elements seen so far, at all times of the stream. We show that this variant can be solved using O((log log n + log δ^-1)/ε^2 + log n) bits of space, which we show to be optimal.
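The baseline that the abstract improves on, running independent copies of a constant-probability estimator and reporting the median, can be sketched as follows. This is only an illustration of the median trick, not the algorithm of the paper and not the Kane, Nelson and Woodruff algorithm: a simple k-minimum-values (KMV) estimator is swapped in as the base sketch, and the names `KMVSketch`, `median_estimate` and the constant `c` are hypothetical choices made for this example.

```python
import hashlib
import heapq
import math
import statistics


class KMVSketch:
    """k-minimum-values estimator: remembers the k smallest hash values seen."""

    def __init__(self, k, seed):
        self.k = k
        self.salt = seed.to_bytes(8, "little")  # independent hash per repetition
        self.heap = []        # negated hashes, so heap[0] is the largest kept hash
        self.members = set()  # hash values currently stored in the heap

    def _hash(self, x):
        digest = hashlib.blake2b(
            str(x).encode(), digest_size=8, salt=self.salt
        ).digest()
        return int.from_bytes(digest, "little") / 2**64  # value in [0, 1)

    def add(self, x):
        h = self._hash(x)
        if h in self.members:
            return  # repeated element (or an already stored hash value)
        if len(self.heap) < self.k:
            heapq.heappush(self.heap, -h)
            self.members.add(h)
        elif h < -self.heap[0]:
            evicted = -heapq.heappushpop(self.heap, -h)
            self.members.discard(evicted)
            self.members.add(h)

    def estimate(self):
        if len(self.heap) < self.k:
            return float(len(self.heap))  # fewer than k distinct hashes: exact count
        return (self.k - 1) / (-self.heap[0])


def median_estimate(stream, eps, delta, c=4):
    """Median of about log(1/delta) independent sketches (c is an assumed constant)."""
    k = math.ceil(c / eps**2)
    reps = 2 * math.ceil(math.log(1 / delta)) + 1  # odd number of repetitions
    sketches = [KMVSketch(k, seed) for seed in range(reps)]
    for x in stream:
        for sketch in sketches:
            sketch.add(x)
    return statistics.median([sketch.estimate() for sketch in sketches])
```

Each of the `reps` copies stores its own full sketch, so the total space is the single-copy cost multiplied by roughly log δ^-1. That multiplicative blow-up is what the abstract states can be avoided for the log n term.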


