research ∙ 08/31/2023
UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting
Since its invention HyperLogLog has become the standard algorithm for ap...
research ∙ 07/16/2021
Estimation from Partially Sampled Distributed Traces
Sampling is often a necessary evil to reduce the processing and storage ...
research ∙ 01/01/2021
SetSketch: Filling the Gap between MinHash and HyperLogLog
MinHash and HyperLogLog are sketching algorithms that have become indisp...
research ∙ 11/02/2019
ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity
The probability Jaccard similarity was recently proposed as a natural ge...
research ∙ 02/11/2019
Computing Extremely Accurate Quantiles Using t-Digests
We present on-line algorithms for computing approximations of rank-based...
research ∙ 02/12/2018