Stability is Stable: Connections between Replicability, Privacy, and Adaptive Generalization

03/22/2023
by Mark Bun et al.

The notion of replicable algorithms was introduced by Impagliazzo et al. [STOC '22] to describe randomized algorithms that are stable under the resampling of their inputs. More precisely, a replicable algorithm gives the same output with high probability when its randomness is fixed and it is run on a new i.i.d. sample drawn from the same distribution. Using replicable algorithms for data analysis can facilitate the verification of published results by ensuring that an analysis yields the same results with high probability even when it is performed on a new data set. In this work, we establish new connections and separations between replicability and standard notions of algorithmic stability. In particular, we give sample-efficient algorithmic reductions between perfect generalization, approximate differential privacy, and replicability for a broad class of statistical problems. Conversely, we show that any such equivalence must break down computationally: there exist statistical problems that are easy under differential privacy but that cannot be solved replicably without breaking public-key cryptography. Furthermore, these results are tight: our reductions are statistically optimal, and we show that any computational separation between DP and replicability must imply the existence of one-way functions. Our statistical reductions give a new algorithmic framework for translating between notions of stability, which we instantiate to answer several open questions in replicability and privacy. These include sample-efficient replicable algorithms for various PAC learning, distribution estimation, and distribution testing problems; algorithmic amplification of δ in approximate DP; conversions from item-level to user-level privacy; and the existence of private agnostic-to-realizable learning reductions under structured distributions.
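To make the definition of replicability concrete, the sketch below is a minimal toy illustration (not a construction from the paper; the function name replicable_mean, the grid width alpha, and the shared_seed parameter are all assumptions) of the standard random-offset rounding idea for statistical queries: the algorithm fixes its internal randomness via a shared seed, draws a random offset for a grid of width alpha, and snaps the empirical mean to the nearest grid point, so two runs on fresh i.i.d. samples almost always return the identical value.

import numpy as np


def replicable_mean(sample, shared_seed, alpha=0.1):
    """Round the empirical mean to a randomly offset grid of width alpha.

    With the same shared_seed (the algorithm's fixed internal randomness),
    two runs on independent i.i.d. samples have empirical means within
    roughly O(1/sqrt(n)) of each other, so with high probability over the
    random offset they snap to the same grid point and the outputs match.
    """
    rng = np.random.default_rng(shared_seed)  # fixed, shared randomness
    offset = rng.uniform(0.0, alpha)          # random offset of the grid
    empirical = float(np.mean(sample))
    # Nearest point of the grid {offset + k * alpha : k an integer}.
    return offset + alpha * round((empirical - offset) / alpha)


# Two fresh i.i.d. samples from the same distribution ...
data_rng = np.random.default_rng(1)
sample_a = data_rng.normal(loc=0.5, scale=1.0, size=100_000)
sample_b = data_rng.normal(loc=0.5, scale=1.0, size=100_000)

# ... give the same output once the algorithm's randomness is fixed.
print(replicable_mean(sample_a, shared_seed=42))
print(replicable_mean(sample_b, shared_seed=42))

In this toy setting the outputs differ only when the true empirical mean lands near a grid boundary, an event whose probability shrinks as the grid width grows relative to the sampling error; this accuracy-versus-replicability trade-off is the kind of cost the paper's reductions quantify.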

Related research

11/02/2019: Relations among different privacy notions
We present a comprehensive view of the relations among several privacy n...

04/07/2023: Replicability and stability in learning
Replicability is essential in science as it allows us to validate and ve...

08/11/2021: Statistical Inference in the Differential Privacy Model
In modern settings of data analysis, we may be running our algorithms on...

02/09/2019: Passing Tests without Memorizing: Two Models for Fooling Discriminators
We introduce two mathematical frameworks for foolability in the context ...

04/07/2021: Optimal Algorithms for Differentially Private Stochastic Monotone Variational Inequalities and Saddle-Point Problems
In this work, we conduct the first systematic study of stochastic variat...

11/02/2019: Adaptive Statistical Learning with Bayesian Differential Privacy
In statistical learning, a dataset is often partitioned into two parts: ...

10/14/2017: Learners that Leak Little Information
We study learning algorithms that are restricted to revealing little inf...
