Per-Flow Cardinality Estimation Based On Virtual LogLog Sketching

11/30/2018
by Zeyu Zhou, et al.

Flow cardinality estimation is the problem of estimating the number of distinct elements in a data flow, often under a stringent memory constraint. It has wide applications in network traffic measurement and in database systems. The virtual LogLog algorithm, recently proposed by Xiao, Chen, Chen and Ling, estimates the cardinalities of a large number of flows using compact memory. The purpose of this thesis is to explore two new perspectives on the estimation process of this algorithm. First, we propose and investigate a family of estimators that generalizes the original vHLL estimator, and we evaluate the performance of the vHLL estimator against the other estimators in this family. Second, we propose an alternative solution to the estimation problem by deriving a maximum-likelihood estimator. Empirical evidence from both perspectives suggests that the vHLL estimator is near-optimal for per-flow estimation, analogous to the near-optimality of the HLL estimator for single-flow estimation.
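
For readers unfamiliar with the single-flow HLL estimator that the abstract uses as its point of comparison, the following is a minimal sketch of a standard HyperLogLog estimate in Python. It is not the thesis's code and it does not implement the per-flow vHLL scheme. The function name `hll_estimate`, the parameter `b`, the choice of SHA-256 as the hash, and the small-range correction threshold are all assumptions made for illustration.

```python
import hashlib
import math


def hll_estimate(items, b=10):
    """Estimate the number of distinct items with m = 2**b registers.

    Illustrative single-flow HyperLogLog; names and parameters are
    hypothetical, not taken from the thesis.
    """
    m = 1 << b
    registers = [0] * m
    for item in items:
        # Derive a 64-bit hash value from the item (assumed hash choice).
        digest = hashlib.sha256(str(item).encode()).digest()
        h = int.from_bytes(digest[:8], "big")
        # The first b bits select a register.
        idx = h >> (64 - b)
        # The remaining 64 - b bits determine the rank: the position of
        # the leftmost 1-bit, counting from 1.
        rest = h & ((1 << (64 - b)) - 1)
        rank = (64 - b) - rest.bit_length() + 1
        registers[idx] = max(registers[idx], rank)

    # Bias-correction constant; this approximation is meant for m >= 128.
    alpha = 0.7213 / (1 + 1.079 / m)
    # Normalized harmonic mean of 2**register over all registers.
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)

    # Small-range correction: fall back to linear counting when many
    # registers are still empty.
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros > 0:
        return m * math.log(m / zeros)
    return raw
```

Because each register only keeps a maximum, repeated elements leave the sketch unchanged, so the estimate depends only on the set of distinct items. With `b=10` the sketch uses 1024 small registers, and the usual relative standard error of roughly 1.04/sqrt(m) is about 3%.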
