Nearly Optimal Distinct Elements and Heavy Hitters on Sliding Windows

05/01/2018
by Vladimir Braverman, et al.

We study the distinct elements and $\ell_p$-heavy hitters problems in the sliding window model, where only the most recent $n$ elements in the data stream form the underlying set. We first introduce the composable histogram, a simple twist on the exponential histogram (Datar et al., SODA 2002) and the smooth histogram (Braverman and Ostrovsky, FOCS 2007) that may be of independent interest. We then show that the composable histogram, along with a careful combination of existing techniques to track either the identity or frequency of a few specific items, suffices to obtain algorithms for both distinct elements and $\ell_p$-heavy hitters that are nearly optimal in both $n$ and $\epsilon$. Applying our new composable histogram framework, we provide an algorithm that outputs a $(1+\epsilon)$-approximation to the number of distinct elements in the sliding window model and uses $O\left(\frac{1}{\epsilon}\log^2 n \log\frac{1}{\epsilon}\log\log n + \frac{1}{\epsilon^2}\log n\right)$ bits of space. For $\ell_p$-heavy hitters, we provide an algorithm using space $O\left(\frac{1}{\epsilon^p}\log^2 n\left(\log\log n + \log\frac{1}{\epsilon}\right)\right)$ for $0 < p \le 2$, improving upon the best-known algorithm for $\ell_2$-heavy hitters (Braverman et al., COCOON 2014), which has space complexity $O\left(\frac{1}{\epsilon^4}\log^3 n\right)$. We also show complementing nearly optimal lower bounds of $\Omega\left(\frac{1}{\epsilon}\log^2 n + \frac{1}{\epsilon^2}\log n\right)$ for distinct elements and $\Omega\left(\frac{1}{\epsilon^p}\log^2 n\right)$ for $\ell_p$-heavy hitters, both tight up to $O(\log\log n)$ and $O\left(\log\frac{1}{\epsilon}\right)$ factors.
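To make the two problems concrete, here is a minimal Python sketch of an exact baseline in the sliding window model. It is not the composable histogram or any algorithm from the paper: it stores the whole window and therefore uses space linear in n, which is exactly the cost the sublinear-space algorithms above avoid. The class name `ExactSlidingWindow` and its methods are hypothetical, and the heavy-hitter threshold assumes the standard definition, under which an item is an ℓ_p-heavy hitter when its frequency in the window is at least ϵ times the ℓ_p norm of the window's frequency vector.

```python
from collections import Counter, deque


class ExactSlidingWindow:
    """Exact, linear-space baseline over the most recent n stream elements.

    Illustrative only: the names here are hypothetical and this is not the
    paper's method, which uses far less space and returns approximations.
    """

    def __init__(self, n):
        self.n = n              # window size
        self.window = deque()   # the most recent (at most n) elements, in order
        self.freq = Counter()   # frequency of each item currently in the window

    def update(self, item):
        """Append a new stream element and expire the oldest one if needed."""
        self.window.append(item)
        self.freq[item] += 1
        if len(self.window) > self.n:
            old = self.window.popleft()
            self.freq[old] -= 1
            if self.freq[old] == 0:
                del self.freq[old]  # keep only items actually in the window

    def distinct(self):
        """Exact number of distinct elements in the current window."""
        return len(self.freq)

    def heavy_hitters(self, eps, p):
        """Items whose frequency is at least eps * ||f||_p (assumed definition)."""
        norm = sum(c ** p for c in self.freq.values()) ** (1.0 / p)
        return {x for x, c in self.freq.items() if c >= eps * norm}


if __name__ == "__main__":
    w = ExactSlidingWindow(n=4)
    for x in [1, 2, 2, 3, 3, 3]:
        w.update(x)
    # The window now holds the last four elements: 2, 3, 3, 3.
    print(w.distinct())
    print(w.heavy_hitters(eps=0.5, p=2))
```

In this example the element 1 and the first 2 have expired, so only the items 2 and 3 count toward the answer. A (1+ϵ)-approximation algorithm would be allowed to return any value within a (1+ϵ) factor of the exact distinct count computed here.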

research
11/15/2020

Tight Bounds for Adversarially Robust Streams and Sliding Windows via Difference Estimators

We introduce difference estimators for data stream computation, which pr...
research
12/07/2019

Flattened Exponential Histogram for Sliding Window Queries over Data Streams

The Basic Counting problem [1] is one of the most fundamental and critic...
research
02/22/2023

Differentially Private L_2-Heavy Hitters in the Sliding Window Model

The data management of large companies often prioritize more recent data...
research
04/05/2018

Optimal streaming and tracking distinct elements with high probability

The distinct elements problem is one of the fundamental problems in stre...
research
01/09/2014

Brazilian License Plate Detection Using Histogram of Oriented Gradients and Sliding Windows

Due to the increasing need for automatic traffic monitoring, vehicle l...
research
10/05/2018

Memento: Making Sliding Windows Efficient for Heavy Hitters

Cloud operators require real-time identification of Heavy Hitters (HH) a...
research
10/29/2018

Distinct Sampling on Streaming Data with Near-Duplicates

In this paper we study how to perform distinct sampling in the streaming...
