Differentially Private Weighted Sampling

10/25/2020
by Edith Cohen, et al.

Common datasets consist of elements with keys (e.g., transactions and products), and the goal is to perform analytics on the aggregated form of key and frequency pairs. A weighted sample of keys by (a function of) frequency is a highly versatile summary that provides a sparse set of representative keys and supports approximate evaluation of query statistics. We propose private weighted sampling (PWS): a method that ensures element-level differential privacy while retaining, to the extent possible, the utility of the corresponding non-private weighted sample. PWS maximizes the reporting probabilities of keys and also improves over the state of the art for the well-studied special case of private histograms, where no sampling is performed. We empirically demonstrate significant performance gains compared with prior baselines: a 20%-300% increase in key reporting for common Zipfian frequency distributions and, in estimation tasks, accuracy at frequencies that are 2 to 8 times lower. Moreover, PWS is applied as a simple post-processing of a non-private sample, without requiring the original data. This allows for seamless integration with existing implementations of non-private schemes and retains the efficiency of schemes designed for resource-constrained settings such as massive distributed or streamed data. We believe that, due to its practicality and performance, PWS may become a method of choice in applications where privacy is desired.
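
To make the notion of a weighted sample concrete, here is a minimal Python sketch of a non-private weighted sample of keys by frequency, together with an inverse-probability estimate of a query statistic. It is not the PWS method and adds no privacy protection; it only illustrates the kind of non-private sample that the abstract says PWS post-processes. The function names, the threshold parameter `tau`, and the choice of independent (Poisson) sampling with probability min(1, frequency / tau) are assumptions made for illustration and do not come from the paper.

```python
import random


def poisson_weighted_sample(freqs, tau, rng):
    """Non-private weighted sample (illustrative assumption, not PWS).

    Each key is included independently with probability
    min(1, frequency / tau), so frequent keys are more likely to appear.
    Returns {key: (frequency, inclusion_probability)} for sampled keys.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    sample = {}
    for key, freq in freqs.items():
        p = min(1.0, freq / tau)
        if rng.random() < p:
            sample[key] = (freq, p)
    return sample


def estimate_sum(sample, predicate=lambda key: True):
    """Inverse-probability estimate of the total frequency of the keys
    that satisfy `predicate`, computed from the sample alone."""
    return sum(freq / p for key, (freq, p) in sample.items() if predicate(key))


if __name__ == "__main__":
    # Hypothetical aggregated data: key -> frequency.
    freqs = {"a": 100, "b": 40, "c": 3, "d": 1}
    sample = poisson_weighted_sample(freqs, tau=20, rng=random.Random(0))
    print(sorted(sample))
    print(estimate_sum(sample))
```

In this sketch, keys with frequency at least `tau` are always sampled and contribute their exact frequency, while each sampled low-frequency key contributes `tau` to the estimate, which keeps the estimate unbiased over the sampling randomness.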

07/04/2019

Sampling Sketches for Concave Sublinear Functions of Frequencies

We consider massive distributed datasets that consist of elements modele...
05/17/2022

Improved Utility Analysis of Private CountSketch

Sketching is an important tool for dealing with high-dimensional vectors...
11/28/2019

PCKV: Locally Differentially Private Correlated Key-Value Data Collection with Optimized Utility

Data collection under local differential privacy (LDP) has been mostly s...
09/28/2020

On the Round Complexity of the Shuffle Model

The shuffle model of differential privacy was proposed as a viable model...
07/24/2020

Controlling Privacy Loss in Survey Sampling (Working Paper)

Social science and economics research is often based on data collected i...
08/06/2018

Differential Private Stream Processing of Energy Consumption

A number of applications benefit from continuously releasing streams of ...