Statistical Inference in the Differential Privacy Model

08/11/2021
by Huanyu Zhang, et al.

In modern data analysis, we often run algorithms on datasets that are sensitive in nature. However, classical machine learning and statistical algorithms were not designed with these risks in mind, and it has been demonstrated that they may reveal personal information. These concerns disincentivize individuals from providing their data, or worse, encourage them to intentionally provide fake data. To assuage these concerns, we impose the constraint of differential privacy, considered by many to be the gold standard of data privacy, on statistical inference. This thesis aims to quantify the cost of ensuring differential privacy, i.e., to understand how much additional data is required to perform data analysis under this constraint. Despite the maturity of the literature on differential privacy, some of the most fundamental settings remain inadequately understood. In particular, we make progress on the following problems:

∙ What is the sample complexity of differentially private (DP) hypothesis testing?
∙ Can we privately estimate distribution properties at a negligible cost?
∙ What are the fundamental limits of private distribution estimation?
∙ How can we design algorithms to privately estimate random graphs?
∙ What is the trade-off between sample complexity and interactivity in private hypothesis selection?
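As a concrete illustration of the kind of cost the abstract refers to, the sketch below (an illustrative example, not a method from the thesis) applies the standard Laplace mechanism to mean estimation: clipping each record to a bounded range limits any one individual's influence on the output, and Laplace noise calibrated to that sensitivity makes the released mean ε-differentially private, at the price of extra error on the order of 1/(nε). The function name and parameters are hypothetical.

```python
import numpy as np

def private_mean(data, epsilon, lower=0.0, upper=1.0):
    """Estimate the mean of `data` under epsilon-differential privacy
    via the Laplace mechanism. Records are clipped to [lower, upper],
    so changing one record moves the mean by at most (upper - lower)/n;
    Laplace noise scaled to sensitivity/epsilon hides that change."""
    n = len(data)
    clipped = np.clip(data, lower, upper)
    sensitivity = (upper - lower) / n
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return clipped.mean() + noise

# With n = 10,000 samples from Uniform(0, 1), the added noise
# (scale 1/(n * epsilon) = 1e-4) is small next to the sampling error.
rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=10_000)
est = private_mean(samples, epsilon=1.0)
```

The privacy/accuracy trade-off is visible directly in the noise scale: halving ε doubles the noise, and quantifying how much extra data compensates for that noise is exactly the sample-complexity question the thesis studies.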

