The Sparse Vector Technique, Revisited

by Haim Kaplan, et al.

We revisit one of the most basic and widely applicable techniques in the literature of differential privacy: the sparse vector technique [Dwork et al., STOC 2009]. Loosely speaking, this technique allows us to privately test whether the value of a given query is close to what we expect it to be (w.r.t. the input database), where we are allowed to test an unbounded number of queries as long as their values are indeed close to what we expected. The first time this is not the case, the process halts. We present a modification to the sparse vector technique that allows for a more fine-tuned privacy analysis. As a result, in some cases we are able to continue testing queries even after the first time the value of a query did not meet our expectations. We demonstrate our technique by applying it to the shifting-heavy-hitters problem: on every time step, each of n users gets a new input, and the task is to privately identify all the current heavy-hitters. That is, on time step i, the goal is to identify all data elements x such that many of the users have x as their current input. We present an algorithm for this problem with improved error guarantees over what can be obtained using existing techniques. Specifically, the error of our algorithm depends on the maximal number of times that a single user holds a heavy-hitter as input, rather than the total number of times in which a heavy-hitter exists.
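For orientation, the classical sparse vector technique described above (often called AboveThreshold) can be sketched as follows. This is a minimal illustration of the standard textbook variant, not the modified algorithm this paper proposes. It assumes each query has sensitivity 1 and that the query values have already been computed on the database; the names `laplace` and `above_threshold` and the noise scales 2/epsilon and 4/epsilon follow the common textbook presentation and are assumptions of this sketch.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def above_threshold(query_values, threshold, epsilon, rng=None):
    """Return the index of the first query whose noisy value reaches the
    noisy threshold, or None if no query does.

    Assumes every query has sensitivity 1 (hypothetical setup for this sketch).
    """
    rng = rng or random.Random()
    # The threshold is perturbed once, before any query is examined.
    noisy_threshold = threshold + laplace(2.0 / epsilon, rng)
    for i, value in enumerate(query_values):
        # Each query receives fresh noise; queries that stay below the
        # threshold can be tested indefinitely in this classical variant.
        if value + laplace(4.0 / epsilon, rng) >= noisy_threshold:
            return i  # the process halts at the first "unexpected" query
    return None
```

In this classical form the loop stops at the first query that crosses the threshold, which is the behavior the abstract says the proposed modification relaxes in some cases.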



Private Heavy Hitters and Range Queries in the Shuffled Model

An exciting new development in differential privacy is the shuffled mode...

Local Differential Privacy for Evolving Data

There are now several large scale deployments of differential privacy us...

Linear Queries Estimation with Local Differential Privacy

We study the problem of estimating a set of d linear queries with respec...

Multistage Knapsack

Many systems have to be maintained while the underlying constraints, cos...

An Optimal Algorithm for Online Unconstrained Submodular Maximization

We consider a basic problem at the interface of two fundamental fields: ...

Heavy Hitters over Interval Queries

Heavy hitters and frequency measurements are fundamental in many network...

Calibrating Noise to Variance in Adaptive Data Analysis

Datasets are often used multiple times and each successive analysis may ...