Differentially Private Stream Processing at Scale

03/31/2023
by   Bing Zhang, et al.

We design, to the best of our knowledge, the first differentially private (DP) stream processing system at scale. Our system, Differential Privacy SQL Pipelines (DP-SQLP), is built using a streaming framework similar to Spark Streaming, on top of the Spanner database and the F1 query engine from Google. In designing DP-SQLP we make both algorithmic and systems advances: we (i) design a novel DP key selection algorithm that operates on an unbounded set of possible keys and scales to one billion keys that users have contributed, (ii) design a preemptive execution scheme for DP key selection that avoids enumerating all keys at each triggering time, and (iii) use algorithmic techniques from DP continual observation to release a continual DP histogram of user contributions to different keys over the length of the stream. We empirically demonstrate the system's efficacy, obtaining at least a 16× reduction in error over the meaningful baselines we consider.
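
To make the idea of DP continual observation in item (iii) concrete, here is a minimal sketch of the classic binary tree mechanism for continually releasing a running count for a single key. It is an illustration only: the abstract does not say which continual-observation technique DP-SQLP uses. The class name `BinaryTreeCounter`, its parameters, and the assumption that each stream item contributes a value in [0, 1] are all hypothetical. Each item falls into at most `levels` partial sums, so Laplace noise of scale `levels / epsilon` on each partial sum gives epsilon-DP at the level of single items. User-level guarantees would additionally require bounding each user's contributions.

```python
import random


class BinaryTreeCounter:
    """Illustrative binary tree mechanism for a DP running count.

    Assumptions (not from the source): each item is in [0, 1], the
    stream has at most `horizon` items, and privacy is per item.
    """

    def __init__(self, horizon, epsilon, rng=None):
        if horizon < 1 or epsilon <= 0:
            raise ValueError("horizon must be >= 1 and epsilon > 0")
        self.horizon = horizon
        # floor(log2(horizon)) + 1 tree levels.
        self.levels = horizon.bit_length()
        # Each item appears in at most `levels` partial sums.
        self.scale = self.levels / epsilon
        self.rng = rng or random.Random()
        self.exact = [0.0] * self.levels  # exact partial sums per level
        self.noisy = [0.0] * self.levels  # noisy partial sums per level
        self.t = 0

    def _laplace(self):
        # The difference of two i.i.d. exponentials is Laplace-distributed.
        rate = 1.0 / self.scale
        return self.rng.expovariate(rate) - self.rng.expovariate(rate)

    def update(self, x):
        """Consume one item and return the noisy prefix sum so far."""
        if self.t >= self.horizon:
            raise RuntimeError("stream is longer than the declared horizon")
        self.t += 1
        t = self.t
        i = (t & -t).bit_length() - 1  # index of the lowest set bit of t
        # Merge all lower levels plus the new item into level i.
        self.exact[i] = sum(self.exact[:i]) + x
        for j in range(i):
            self.exact[j] = 0.0
            self.noisy[j] = 0.0
        self.noisy[i] = self.exact[i] + self._laplace()
        # The prefix sum is the sum of noisy nodes on the set bits of t.
        return sum(self.noisy[j] for j in range(self.levels) if (t >> j) & 1)


if __name__ == "__main__":
    counter = BinaryTreeCounter(horizon=8, epsilon=1.0, rng=random.Random(0))
    for x in [1, 0, 1, 1, 0, 1, 1, 1]:
        print(round(counter.update(x), 2))
```

Each released count sums at most `levels` noisy nodes, so the error grows polylogarithmically in the stream length, rather than linearly as it would if fresh noise were accumulated at every step. A histogram over many keys would run one such counter per selected key. How keys are selected privately is a separate problem that this sketch does not address.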

research
03/31/2021

Differentially Private Histograms under Continual Observation: Streaming Selection into the Unknown

We generalize the continuous observation privacy setting from Dwork et a...
research
12/20/2022

Continual Mean Estimation Under User-Level Privacy

We consider the problem of continually releasing an estimate of the popu...
research
07/14/2023

Differentially Private Clustering in Data Streams

The streaming model is an abstraction of computing over massive data str...
research
01/13/2023

Differentially Private Continual Releases of Streaming Frequency Moment Estimations

The streaming model of computation is a popular approach for working wit...
research
10/25/2020

Differentially Private Weighted Sampling

Common datasets have the form of elements with keys (e.g., transactions ...
research
05/24/2020

Continuous Release of Data Streams under both Centralized and Local Differential Privacy

In this paper, we study the problem of publishing a stream of real-value...
research
06/16/2022

On Private Online Convex Optimization: Optimal Algorithms in ℓ_p-Geometry and High Dimensional Contextual Bandits

Differentially private (DP) stochastic convex optimization (SCO) is ubiq...
