Cardinality Estimators do not Preserve Privacy

08/17/2018
by Damien Desfontaines, et al.

Cardinality estimators like HyperLogLog are sketching algorithms that estimate the number of distinct elements in a large multiset. Their use in privacy-sensitive contexts raises the question of whether they leak private information. In particular, can they provide any privacy guarantees while preserving their strong aggregation properties? We formulate an abstract notion of cardinality estimators, which captures this aggregation requirement: one can merge sketches without losing precision. We propose an attacker model and a corresponding privacy definition, strictly weaker than differential privacy: we assume that the attacker has no prior knowledge of the data. We then show that if a cardinality estimator satisfies this definition, then it cannot have a reasonable level of accuracy. We prove similar results for weaker versions of our definition, and analyze the privacy of existing algorithms, showing that their average privacy loss is significant, even for multisets with large cardinalities. We conclude that sketches from cardinality estimators should be considered as sensitive as raw data, and propose risk mitigation strategies for their real-world applications.
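
To make the aggregation requirement and the resulting leak concrete, here is a minimal Python sketch of a HyperLogLog-style estimator. It is an illustration under stated assumptions, not the paper's formal construction or attacker model: the register count (`P = 10`), the use of SHA-256 as the hash, and the helper names (`add`, `merge`, `estimate`, `build`, `probably_contains`) are all chosen here for the example. It shows that merging two sketches gives exactly the sketch of the union, and that an observer holding a sketch can probe it by checking whether adding a candidate element changes anything.

```python
import hashlib
import math

P = 10          # number of index bits (assumed for this illustration)
M = 1 << P      # number of registers


def _hash64(item):
    """Deterministic 64-bit hash of a string (SHA-256 is an assumption here)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def empty_sketch():
    return [0] * M


def add(sketch, item):
    """Return a new sketch with `item` added; the input is not modified."""
    h = _hash64(item)
    idx = h >> (64 - P)                    # top P bits select a register
    rest = h & ((1 << (64 - P)) - 1)       # remaining 64 - P bits
    # 1-based position of the leftmost 1-bit in `rest` (64 - P + 1 if rest == 0)
    rank = (64 - P) - rest.bit_length() + 1
    out = list(sketch)
    out[idx] = max(out[idx], rank)
    return out


def merge(a, b):
    """Merging is a register-wise maximum, so no precision is lost."""
    return [max(x, y) for x, y in zip(a, b)]


def estimate(sketch):
    """Standard HyperLogLog estimate with the small-range correction."""
    alpha = 0.7213 / (1 + 1.079 / M)
    raw = alpha * M * M / sum(2.0 ** -r for r in sketch)
    zeros = sketch.count(0)
    if raw <= 2.5 * M and zeros > 0:
        return M * math.log(M / zeros)     # linear counting for small sets
    return raw


def build(items):
    sketch = empty_sketch()
    for item in items:
        sketch = add(sketch, item)
    return sketch


def probably_contains(sketch, item):
    """Probe: if adding `item` changes nothing, it may already be present.

    Elements that were added always pass this check; absent elements can
    also pass it (false positives), so the answer is only probabilistic.
    """
    return add(sketch, item) == sketch


if __name__ == "__main__":
    a = build(f"user{i}" for i in range(0, 3000))
    b = build(f"user{i}" for i in range(2000, 5000))
    union = build(f"user{i}" for i in range(0, 5000))
    print(merge(a, b) == union)            # merged sketch equals union sketch
    print(round(estimate(merge(a, b))))    # approximate distinct count
    print(probably_contains(a, "user42"))  # element that was added
```

The probe never rejects an element that was actually added, which is what makes it informative to an observer. How much it reveals depends on how full the sketch is: the smaller the underlying multiset, the fewer absent elements leave the registers unchanged, so a positive answer is stronger evidence of membership. That is the intuition behind treating sketches as being as sensitive as raw data.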


research
03/29/2022

(Nearly) All Cardinality Estimators Are Differentially Private

We consider privacy in the context of streaming algorithms for cardinali...
research
05/02/2019

Passive and active attackers in noiseless privacy

Differential privacy offers clear and strong quantitative guarantees for...
research
11/20/2020

HyperLogLog (HLL) Security: Inflating Cardinality Estimates

Counting the number of distinct elements on a set is needed in many appl...
research
06/25/2020

Identification and Formal Privacy Guarantees

Empirical economic research crucially relies on highly sensitive individ...
research
08/22/2022

Simpler and Better Cardinality Estimators for HyperLogLog and PCSA

Cardinality Estimation (aka Distinct Elements) is a classic problem in s...
research
02/04/2023

Sketch-Flip-Merge: Mergeable Sketches for Private Distinct Counting

Data sketching is a critical tool for distinct counting, enabling multis...
research
08/17/2020

Cardinality estimation using Gumbel distribution

Cardinality estimation is the task of approximating the number of distin...
