Cardinality Estimation in a Virtualized Network Device Using Online Machine Learning

03/13/2019
by Reuven Cohen, et al.

Cardinality estimation algorithms receive a stream of elements, with possible repetitions, and return the number of distinct elements in the stream. Such algorithms trade some accuracy in their output for lower memory and CPU consumption. In computer networks, cardinality estimation algorithms are mainly used for counting the number of distinct flows, and they fall into two categories: sketching algorithms and sampling algorithms. Sketching algorithms must process every packet, and they are therefore usually implemented in dedicated hardware. Sampling algorithms do not need to process every packet, but they are known for their inaccuracy. In this work we identify one of the major drawbacks of sampling-based cardinality estimation algorithms: their inability to adapt to changes in the flow size distribution. To address this problem, we propose a new sampling-based adaptive cardinality estimation framework, which uses online machine learning. We evaluate our framework on real traffic traces and show significantly better accuracy than the best known sampling-based algorithms, for the same fraction of processed packets.
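
To make the distinction between the two categories concrete, here is a minimal Python sketch. It is not the framework proposed in the paper. It assumes a textbook k-minimum-values (KMV) sketch as a stand-in for the sketching category, and a naive per-packet sampler as a stand-in for the sampling category. The names `kmv_estimate` and `sampled_distinct`, the parameter `k`, and the sampling probability `p` are all hypothetical choices made for illustration.

```python
import hashlib
import heapq
import random


def _hash01(item: str) -> float:
    """Map an item to a pseudo-uniform value in [0, 1) via SHA-1."""
    digest = hashlib.sha1(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def kmv_estimate(stream, k=256):
    """Sketching stand-in: looks at every element, keeps only k hash values.

    Tracks the k smallest distinct hash values; if the k-th smallest is v,
    the number of distinct elements is estimated as (k - 1) / v.
    """
    heap = []        # max-heap (negated values) of the k smallest hashes
    members = set()  # hash values currently stored in the heap
    for item in stream:
        h = _hash01(item)
        if h in members:
            continue  # repetition of an element already in the sketch
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # New value is smaller than the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        return float(len(heap))  # fewer than k distinct items: count is exact
    return (k - 1) / (-heap[0])


def sampled_distinct(stream, p, seed=0):
    """Sampling stand-in: processes each element only with probability p.

    Returns the number of distinct elements seen in the sample. This never
    exceeds the true cardinality, and how far below it falls depends on how
    many times each element repeats (the flow size distribution).
    """
    rng = random.Random(seed)
    seen = set()
    for item in stream:
        if rng.random() < p:
            seen.add(item)
    return len(seen)


if __name__ == "__main__":
    # Synthetic stream: 10,000 flows, each contributing 1 to 3 packets.
    stream = [f"flow-{i}" for i in range(10_000) for _ in range(1 + i % 3)]
    print("true cardinality:", 10_000)
    print("KMV estimate    :", round(kmv_estimate(stream)))
    print("distinct in 10% sample:", sampled_distinct(stream, 0.1))
```

The sampler's raw count has to be scaled up to estimate the true cardinality, and the right scaling depends on how packets are spread across flows. That dependence is the drawback the abstract describes for sampling-based algorithms.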

