Trapezoidal Sketch: A Sketch Structure for Frequency Estimation of Data Streams

12/16/2019
by   Ning Li, et al.
0

The sketch is one of the typical and widely used data structures for estimating the frequencies of items in data streams. However, since the counter sizes in the traditional rectangular-structure sketch (rstructure sketch) are the same, it is hard to achieve small space usage, high capacity (i.e., the maximum counter size of the sketch), and high estimation accuracy simultaneously. Moreover, when considering the high skewness of data streams, this will become even worse [15]. Therefore, in this paper, we firstly propose the trapezoidal-structure sketch (t-structure sketch). In the t-structure sketch, different from the r-structure sketch, the counter sizes in different layers are different. Based on this innovation, the low space usage and high capacity can be achieved simultaneously in the t-structure sketch. Then, based on the basic t-structure sketch, we propose the space saving t-structure sketch and the capacity improvement t-structure sketch, and analyze the properties of these two t-structure sketches. Finally, considering the estimation accuracy in the original version of the t-structure sketch is slightly worse than that in the r-structure sketch, we propose the probabilistic based estimation error reducing algorithm to improve the estimation accuracy in t-structure sketch. Compared with the CM sketch, CU sketch, C sketch, and A sketch, the simulation results show that the performances on space usage, capacity, and estimation accuracy are improved successfully by t-structure sketch.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/26/2021

Differentially Private Frequency Moments Estimation with Polylogarithmic Space

We prove that 𝔽_p sketch, a well-celebrated streaming algorithm for freq...
research
12/16/2019

A new Frequency Estimation Sketch for Data Streams

In data stream applications, one of the critical issues is to estimate t...
research
11/07/2017

Finding Heavily-Weighted Features in Data Streams

We introduce a new sub-linear space data structure---the Weight-Median S...
research
12/09/2022

DUNE: Improving Accuracy for Sketch-INT Network Measurement Systems

In-band Network Telemetry (INT) and sketching algorithms are two promisi...
research
12/10/2020

Sketchy With a Chance of Adoption: Can Sketch-Based Telemetry Be Ready for Prime Time?

Sketching algorithms or sketches have emerged as a promising alternative...
research
05/29/2019

Recursive Sketches for Modular Deep Learning

We present a mechanism to compute a sketch (succinct summary) of how a c...
research
09/03/2018

GB-KMV: An Augmented KMV Sketch for Approximate Containment Similarity Search

In this paper, we study the problem of approximate containment similarit...

Please sign up or login with your details

Forgot password? Click here to reset