Tight Bounds for Adversarially Robust Streams and Sliding Windows via Difference Estimators

11/15/2020
by   David P. Woodruff, et al.

We introduce difference estimators for data stream computation, which provide approximations to F(v) - F(u) for frequency vectors v ≽ u and a given function F. We show how to use such estimators to carefully trade error for memory in an iterative manner. The function F is generally non-linear, and we give the first difference estimators for the frequency moments F_p for p ∈ [0,2], as well as for integers p > 2. Using these, we resolve a number of central open questions in the adversarially robust streaming and sliding window models. For adversarially robust streams, we obtain a (1+ϵ)-approximation to F_p using 𝒪̃(log n/ϵ^2) bits of space for p ∈ [0,2] and using 𝒪̃(n^{1-2/p}/ϵ^2) bits of space for integers p > 2. We also obtain an adversarially robust algorithm for the L_2-heavy hitters problem using 𝒪̃(log n/ϵ^2) bits of space. Our bounds are optimal up to poly(log log n + log(1/ϵ)) factors, and improve the 1/ϵ^3 dependence of Ben-Eliezer et al. (PODS 2020, best paper award) and the 1/ϵ^{2.5} dependence of Hassidim et al. (NeurIPS 2020, oral presentation). For sliding windows, we obtain a (1+ϵ)-approximation to F_p for p ∈ (0,2], resolving a longstanding question of Braverman and Ostrovsky (FOCS 2007). For example, for p = 2 we improve the dependence on ϵ from 1/ϵ^4 to an optimal 1/ϵ^2. For both models, our dependence on ϵ shows, up to log(1/ϵ) factors, that there is no overhead over the standard insertion-only data stream model for any of these problems.
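To make the target quantity concrete, the following is a minimal sketch that computes F_p(v) - F_p(u) exactly, assuming u is the frequency vector of a stream prefix and v is the frequency vector of the full insertion-only stream (so v ≽ u holds coordinate-wise). It is not the paper's estimator: it stores every frequency, whereas a difference estimator approximates the same difference in small space. The function names `frequency_vector`, `f_p`, and `exact_difference` are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def frequency_vector(stream):
    """Map each item to its number of occurrences in the stream."""
    return Counter(stream)


def f_p(freq, p):
    """Exact p-th frequency moment: sum of |f_i|^p (distinct count for p = 0)."""
    if p == 0:
        return sum(1 for c in freq.values() if c != 0)
    return sum(abs(c) ** p for c in freq.values())


def exact_difference(prefix, full, p):
    """Exact F_p(v) - F_p(u), where u comes from a prefix and v from the full stream."""
    u = frequency_vector(prefix)
    v = frequency_vector(full)
    # In an insertion-only stream, a prefix is always dominated by the full stream.
    assert all(v[i] >= u[i] for i in u), "expected v to dominate u coordinate-wise"
    return f_p(v, p) - f_p(u, p)


# Illustrative data (an assumption, not taken from the paper).
stream = [1, 2, 1, 3, 1, 2]
prefix = stream[:3]

for p in (0, 1, 2):
    print(p, exact_difference(prefix, stream, p))
```

For p = 1 the difference is simply the number of updates after the prefix, but for other p it depends on how the new updates interact with the existing frequencies, which is why F being non-linear matters.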

research
05/01/2018

Nearly Optimal Distinct Elements and Heavy Hitters on Sliding Windows

We study the distinct elements and ℓ_p-heavy hitters problems in the sli...
research
09/20/2023

Testing frequency distributions in a stream

We study how to verify specific frequency distributions when we observe ...
research
08/26/2021

Truly Perfect Samplers for Data Streams and Sliding Windows

In the G-sampling problem, the goal is to output an index i of a vector ...
research
10/07/2020

New Verification Schemes for Frequency-Based Functions on Data Streams

We study the general problem of computing frequency-based functions, i.e...
research
05/14/2018

Copulas for Streaming Data

Empirical copula functions can be used to model the dependence structure...
research
04/13/2023

Pseudorandom Hashing for Space-bounded Computation with Applications in Streaming

We revisit Nisan's classical pseudorandom generator (PRG) for space-boun...
research
09/08/2021

Adversarially Robust Streaming via Dense–Sparse Trade-offs

A streaming algorithm is adversarially robust if it is guaranteed to per...
