
Robust and Differentially Private Mean Estimation

by Xiyang Liu, et al.

Differential privacy has emerged as a standard requirement in a variety of applications, ranging from the U.S. Census to data collected by commercial devices, initiating an extensive line of research on accurately and privately releasing statistics of a database. An increasing number of such databases consist of data from multiple sources, not all of which can be trusted. This leaves existing private analyses vulnerable to attacks by an adversary who injects corrupted data. Despite the significance of designing algorithms that simultaneously guarantee privacy and robustness (to a fraction of the data being corrupted), even the simplest questions remain open. For the canonical problem of estimating the mean from i.i.d. samples, we introduce the first efficient algorithm that achieves both privacy and robustness for a wide range of distributions. The algorithm achieves optimal accuracy, matching the known lower bounds for robustness, but its sample complexity is a factor of d^{1/2} away from known lower bounds. We further show that this gap is due to computational efficiency: we introduce the first family of algorithms that closes the gap but takes exponential time. The innovation lies in exploiting resilience (a key property in robust estimation) to adaptively bound the sensitivity and improve privacy.
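To make the role of sensitivity concrete, here is a minimal sketch of a standard baseline, not the algorithm from the paper: each sample is clipped to an L2 ball of a fixed radius, the clipped samples are averaged, and Gaussian noise calibrated to the resulting sensitivity is added. The function names, the fixed `radius` parameter, and the use of the classical Gaussian mechanism (which assumes 0 < epsilon < 1) are all assumptions made for illustration. The abstract describes bounding the sensitivity adaptively via resilience, which this fixed-radius sketch does not attempt.

```python
import math
import random


def clip_to_ball(x, radius):
    """Rescale x so that its L2 norm is at most `radius`."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    scale = radius / norm
    return [v * scale for v in x]


def private_clipped_mean(samples, radius, epsilon, delta, rng=None):
    """Illustrative (epsilon, delta)-DP mean via clipping plus Gaussian noise.

    Hypothetical helper, not the paper's method. Assumes a fixed, publicly
    known clipping radius and the classical Gaussian mechanism calibration.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("requires 0 < epsilon < 1 and 0 < delta < 1")
    rng = rng or random.Random()
    n = len(samples)
    d = len(samples[0])

    # Clipping bounds how far any single (possibly corrupted) sample can
    # move the average.
    clipped = [clip_to_ball(x, radius) for x in samples]
    mean = [sum(x[j] for x in clipped) / n for j in range(d)]

    # Replacing one sample changes the clipped mean by at most 2 * radius / n
    # in L2 norm.
    sensitivity = 2.0 * radius / n
    sigma = sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon

    # Independent Gaussian noise in each coordinate.
    return [m + rng.gauss(0.0, sigma) for m in mean]
```

A call such as `private_clipped_mean(data, radius=1.0, epsilon=0.5, delta=1e-5)` returns a noisy d-dimensional estimate. With a fixed radius, the noise scale depends on a worst-case bound on the data, which is the kind of cost an adaptively bounded sensitivity is meant to reduce.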




Private Mean Estimation of Heavy-Tailed Distributions

We give new upper and lower bounds on the minimax sample complexity of d...

Privacy Induces Robustness: Information-Computation Gaps and Sparse Mean Estimation

We establish a simple connection between robust and differentially-priva...

Privacy-preserving parametric inference: a case for robust statistics

Differential privacy is a cryptographically-motivated approach to privac...

Locally Private Mean Estimation: Z-test and Tight Confidence Intervals

This work provides tight upper- and lower-bounds for the problem of mean...

Locally Differentially Private Analysis of Graph Statistics

Differentially private analysis of graphs is widely used for releasing s...

Differential privacy and robust statistics in high dimensions

We introduce a universal framework for characterizing the statistical ef...

Robust Learning from Untrusted Sources

Modern machine learning methods often require more data for training tha...