Revisiting Frequency Moment Estimation in Random Order Streams

03/06/2018
by Vladimir Braverman, et al.

We revisit one of the classic problems in the data stream literature, namely, that of estimating the frequency moments F_p for 0 < p < 2 of an underlying n-dimensional vector presented as a sequence of additive updates in a stream. It is well known that using p-stable distributions one can approximate any of these moments up to a multiplicative (1+ϵ)-factor using O(ϵ^-2 log n) bits of space, and this space bound is optimal up to a constant factor in the turnstile streaming model. Surprisingly, we show that if one instead considers the popular random-order model of insertion-only streams, in which the updates to the underlying vector arrive in a random order, then one can beat this space bound and achieve Õ(ϵ^-2 + log n) bits of space, where the Õ hides poly(log(1/ϵ) + log log n) factors. If ϵ^-2 ≈ log n, this represents a roughly quadratic improvement over the space achievable in turnstile streams. Our algorithm is in fact deterministic, and we show that our space bound is optimal up to poly(log(1/ϵ) + log log n) factors for deterministic algorithms in the random-order model. We also obtain a similar improvement in space for p = 2 whenever F_2 ≳ log n · F_1.
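
To make the quantity being estimated concrete, here is a minimal Python sketch that computes F_p exactly for a small insertion-only stream, taking F_p to be the sum over coordinates i of |x_i|^p, where x_i counts how often item i appears. It is not the algorithm from the paper: it stores one counter per distinct item, so its space grows linearly with the number of distinct items, which is exactly the cost that small-space estimators avoid. The function name `frequency_moment` and the sample stream are assumptions chosen for illustration.

```python
from collections import Counter
from typing import Iterable


def frequency_moment(stream: Iterable[int], p: float) -> float:
    """Exact F_p = sum_i |x_i|^p for an insertion-only stream of item ids.

    Illustrative baseline only (hypothetical helper, not from the paper):
    it keeps a counter per distinct item instead of a small sketch.
    """
    counts = Counter(stream)  # x_i = number of occurrences of item i
    return sum(abs(c) ** p for c in counts.values())


# Sample stream (assumed for illustration): x_1 = 2, x_2 = 1, x_3 = 3.
stream = [3, 1, 3, 2, 3, 1]
f1 = frequency_moment(stream, 1.0)  # F_1 is the stream length
f2 = frequency_moment(stream, 2.0)  # F_2 is the sum of squared counts
f_half = frequency_moment(stream, 0.5)  # a moment in the range 0 < p < 2
```

The exact value does not depend on arrival order, so the random-order assumption matters only for how little space an estimator needs, not for the definition of F_p.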

research
05/08/2021

Separations for Estimating Large Frequency Moments on Data Streams

We study the classical problem of moment estimation of an underlying vec...
research
07/12/2019

Towards Optimal Moment Estimation in Streaming and Distributed Models

One of the oldest problems in the data stream model is to approximate th...
research
03/23/2018

Data Streams with Bounded Deletions

Two prevalent models in the data stream literature are the insertion-onl...
research
08/31/2017

Sketching the order of events

We introduce features for massive data streams. These stream features ca...
research
08/26/2021

Truly Perfect Samplers for Data Streams and Sliding Windows

In the G-sampling problem, the goal is to output an index i of a vector ...
research
11/13/2017

Estimating Graph Parameters from Random Order Streams

We develop a new algorithmic technique that allows to transfer some cons...
research
04/13/2023

Pseudorandom Hashing for Space-bounded Computation with Applications in Streaming

We revisit Nisan's classical pseudorandom generator (PRG) for space-boun...
