HyperLogLogLog: Cardinality Estimation With One Log More

05/23/2022
by Matti Karppa, et al.

We present HyperLogLogLog, a practical compression of the HyperLogLog sketch that reduces the sketch size from $O(m \log\log n)$ bits to $m \log_2\log_2\log_2 m + O(m + \log\log n)$ bits for estimating the number of distinct elements $n$ using $m$ registers. The algorithm works as a drop-in replacement that preserves all estimation properties of the HyperLogLog sketch. It is possible to convert back and forth between the compressed and uncompressed representations, and the compressed sketch remains mergeable in the compressed domain. The compressed sketch can be updated in amortized constant time, assuming $n$ is sufficiently larger than $m$. We provide a C++ implementation of the sketch and show, by experimental evaluation against well-known implementations by Google and Apache, that our implementation provides small sketches while maintaining competitive update and merge times. Concretely, we observed approximately a 40% reduction in the sketch size. Furthermore, we obtain as a corollary a theoretical algorithm that compresses the sketch down to $m \log_2\log_2\log_2\log_2 m + O(m \log\log\log m / \log\log m + \log\log n)$ bits.
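To make the mergeability and size claims concrete, the following is a minimal Python sketch of plain HyperLogLog registers, together with a rough bit count for a base-plus-offset layout. It is not the authors' C++ implementation or their exact encoding. The layout (one shared base value, a 3-bit offset per register, and out-of-range registers stored as index/value exceptions), the 6-bit register width, and all function names are assumptions chosen for illustration only.

```python
import hashlib


def index_and_rank(item: str, p: int) -> tuple[int, int]:
    """Split a 64-bit hash into a register index (top p bits) and a rank."""
    h = int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")
    index = h >> (64 - p)
    rest = h & ((1 << (64 - p)) - 1)
    # 1-based position of the leftmost 1-bit in the remaining 64 - p bits.
    rank = (64 - p) - rest.bit_length() + 1
    return index, rank


def build_registers(items, p: int) -> list[int]:
    """Plain HyperLogLog: each register keeps the maximum rank it has seen."""
    regs = [0] * (1 << p)
    for item in items:
        i, r = index_and_rank(item, p)
        if r > regs[i]:
            regs[i] = r
    return regs


def merge(a: list[int], b: list[int]) -> list[int]:
    """Merging two sketches is an element-wise maximum of their registers."""
    return [max(x, y) for x, y in zip(a, b)]


def offset_encoding_bits(regs, p: int, offset_bits: int = 3, value_bits: int = 6) -> int:
    """Bit cost of a hypothetical base + offset layout (illustrative assumption).

    Registers within [base, base + 2**offset_bits) cost offset_bits each;
    every other register is charged an extra (index, value) exception entry.
    The base is chosen to minimise the total.
    """
    span = 1 << offset_bits
    costs = []
    for base in range(max(regs) + 1):
        exceptions = sum(1 for r in regs if not base <= r < base + span)
        costs.append(value_bits + len(regs) * offset_bits + exceptions * (p + value_bits))
    return min(costs)


P = 8  # m = 2**P = 256 registers
regs_a = build_registers((f"user-{i}" for i in range(0, 12000)), P)
regs_b = build_registers((f"user-{i}" for i in range(8000, 20000)), P)
regs_union = build_registers((f"user-{i}" for i in range(20000)), P)

plain_bits = 6 * len(regs_union)  # 6 bits per register, uncompressed
compact_bits = offset_encoding_bits(regs_union, P)
print(plain_bits, compact_bits)
```

Under these assumptions, the merged registers of two overlapping streams equal the registers built directly from their union, which is the property that makes the sketch mergeable. The bit count only illustrates why registers that cluster around a common value can be stored with fewer bits each.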


