Impact of Sampling on Locally Differentially Private Data Collection

06/02/2022
by Sayan Biswas, et al.
With the recent growth in data collection, threats against individuals' private information have surged. Techniques for optimizing privacy-preserving data analysis have been a focus of research in recent years. In this paper, we analyse the impact of sampling on the utility of standard frequency estimation techniques, which are at the core of large-scale data analysis, for locally differentially private data release under a pure protocol. We study a distributed data-sharing environment in which values are reported by various nodes to a central server, e.g., cross-device Federated Learning. We show that if random sampling of the nodes is introduced to reduce communication cost, the standard existing estimators fail to remain unbiased. We propose a new unbiased estimator for the setting in which each node is sampled with a certain probability, and use it to compute various statistical summaries of the data. As a step towards further generalisation, we propose a way of sampling each node with a personalized sampling probability, which leads to some interesting open questions. We analyse the accuracy of our proposed estimators on synthetic datasets to gain insight into the trade-off between communication cost, privacy, and utility.
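To make the bias claim concrete, the following is a minimal sketch and not the estimator proposed in the paper, which the abstract does not specify. It assumes k-ary randomized response as one example of a pure protocol, a single uniform sampling probability `s` for every node, and a server that knows both the total number of nodes `n` and `s`. The function names (`krr_report`, `estimate_counts`, `simulate`) are hypothetical. Under these assumptions the expected number of reports equal to `v` is `s * (n_v * p + (n - n_v) * q)`, so dividing the observed count by `s` before applying the usual correction restores unbiasedness, while applying the usual correction directly does not.

```python
import math
import random
from collections import Counter


def krr_report(value, domain, epsilon, rng):
    """k-ary randomized response: keep the true value with probability p,
    otherwise report one of the other values uniformly at random."""
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    return rng.choice([v for v in domain if v != value])


def estimate_counts(reports, domain, epsilon, n, s=1.0):
    """Estimate how many of the n nodes hold each value.

    With s=1.0 this is the usual k-RR count estimator. With s<1 the
    observed counts are first rescaled by 1/s (an assumed correction,
    not taken from the paper)."""
    k = len(domain)
    e = math.exp(epsilon)
    p = e / (e + k - 1)
    q = 1.0 / (e + k - 1)
    observed = Counter(reports)
    return {v: (observed[v] / s - n * q) / (p - q) for v in domain}


def simulate(true_values, domain, epsilon, s, seed=0):
    """Sample each node with probability s, collect k-RR reports, and
    return (naive, corrected) count estimates."""
    rng = random.Random(seed)
    reports = [
        krr_report(v, domain, epsilon, rng)
        for v in true_values
        if rng.random() < s  # node participates with probability s
    ]
    n = len(true_values)
    naive = estimate_counts(reports, domain, epsilon, n, s=1.0)
    corrected = estimate_counts(reports, domain, epsilon, n, s=s)
    return naive, corrected


if __name__ == "__main__":
    domain = [0, 1, 2]
    true_values = [0] * 6000 + [1] * 3000 + [2] * 1000
    naive, corrected = simulate(true_values, domain, epsilon=2.0, s=0.5)
    print("naive:    ", naive)
    print("corrected:", corrected)
```

In this sketch the naive estimates fall well below the true counts because only about half of the nodes report, whereas the rescaled estimates fluctuate around them. Lowering `s` reduces communication but increases the variance of the corrected estimate, which is the kind of trade-off the abstract refers to.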

research
07/15/2023

On the Utility Gain of Iterative Bayesian Update for Locally Differentially Private Mechanisms

This paper investigates the utility gain of using Iterative Bayesian Upd...
research
11/10/2020

Compression Boosts Differentially Private Federated Learning

Federated Learning allows distributed entities to train a common model c...
research
06/20/2022

SoteriaFL: A Unified Framework for Private Federated Learning with Communication Compression

To enable large-scale machine learning in bandwidth-hungry environments ...
research
05/20/2023

Can Public Large Language Models Help Private Cross-device Federated Learning?

We study (differentially) private federated learning (FL) of language mo...
research
11/03/2022

Single SMPC Invocation DPHelmet: Differentially Private Distributed Learning on a Large Scale

Distributing machine learning predictors enables the collection of large...
research
02/24/2021

Lossless Compression of Efficient Private Local Randomizers

Locally Differentially Private (LDP) Reports are commonly used for colle...
research
05/24/2023

Private and Collaborative Kaplan-Meier Estimators

Kaplan-Meier estimators capture the survival behavior of a cohort. They ...
