Adaptive Data Analysis in a Balanced Adversarial Model

05/24/2023
by   Kobbi Nissim, et al.
In adaptive data analysis, a mechanism gets n i.i.d. samples from an unknown distribution D, and is required to provide accurate estimates for a sequence of adaptively chosen statistical queries with respect to D. Hardt and Ullman (FOCS 2014) and Steinke and Ullman (COLT 2015) showed that, in general, it is computationally hard to answer more than Θ(n^2) adaptive queries, assuming the existence of one-way functions. However, these negative results rely strongly on an adversarial model that gives the adversarial analyst a significant advantage over the mechanism: the analyst, who chooses the adaptive queries, also chooses the underlying distribution D. This imbalance raises questions about the applicability of the obtained hardness results, since an analyst who has complete knowledge of the underlying distribution D would have little need, if any, to issue statistical queries to a mechanism that holds only a finite number of samples from D. We consider more restricted adversaries, called balanced, where each such adversary consists of two separate algorithms: the sampler, which chooses the distribution and provides the samples to the mechanism, and the analyst, which chooses the adaptive queries but has no prior knowledge of the underlying distribution. We improve the quality of previous lower bounds by revisiting them using an efficient balanced adversary, under standard public-key cryptography assumptions. We show that these stronger hardness assumptions are unavoidable in the sense that any computationally bounded balanced adversary that has the structure of all known attacks implies the existence of public-key cryptography.
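The information flow in the balanced model can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the distribution D is a single biased bit, the mechanism simply returns empirical means, and the names `sampler`, `mechanism`, `next_query`, and `play_game` are hypothetical. The only point it shows is the separation of roles: the sampler knows D, the mechanism sees only the samples, and the analyst sees only the answers to its earlier queries.

```python
import random
from typing import Callable, List, Optional, Tuple

# A statistical query maps one sample point to a value in [0, 1].
Query = Callable[[int], float]


def sampler(n: int, rng: random.Random) -> Tuple[float, List[int]]:
    """Hypothetical sampler: picks D = Bernoulli(p) and draws n i.i.d. samples."""
    p = rng.random()
    return p, [1 if rng.random() < p else 0 for _ in range(n)]


def mechanism(samples: List[int], q: Query) -> float:
    """Toy mechanism (an assumption): answers with the empirical mean of q."""
    return sum(q(x) for x in samples) / len(samples)


def true_value(p: float, q: Query) -> float:
    """Expectation of q under Bernoulli(p); used only to score the game."""
    return p * q(1) + (1 - p) * q(0)


def next_query(prev_answer: Optional[float]) -> Query:
    """Toy analyst: chooses the next query from the previous answer only.

    It never receives p or the samples, which is what makes it 'balanced'.
    """
    if prev_answer is None or prev_answer >= 0.5:
        return lambda x: float(x)
    return lambda x: 1.0 - float(x)


def play_game(n: int, k: int, seed: int = 0) -> float:
    """Runs k adaptive rounds and returns the largest answer error observed."""
    rng = random.Random(seed)
    p, samples = sampler(n, rng)
    prev: Optional[float] = None
    worst = 0.0
    for _ in range(k):
        q = next_query(prev)
        prev = mechanism(samples, q)
        worst = max(worst, abs(prev - true_value(p, q)))
    return worst
```

Calling `play_game(1000, 5)` returns the worst-case deviation between the mechanism's answers and the true expectations over five rounds. The toy analyst here is not an attack; the hardness results concern far more elaborate analysts built from cryptographic primitives.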

research
06/15/2017

Generalization for Adaptively-chosen Estimators via Stable Median

Datasets are often reused to perform multiple statistical analyses in an...
research
02/11/2023

On Differential Privacy and Adaptive Data Analysis with Bounded Space

We study the space complexity of the two related fields of differential ...
research
03/12/2018

The Everlasting Database: Statistical Validity at a Fair Price

The problem of handling adaptivity in data analysis, intentional or not,...
research
11/07/2021

Dynamic Algorithms Against an Adaptive Adversary: Generic Constructions and Lower Bounds

A dynamic algorithm against an adaptive adversary is required to be corr...
research
02/13/2016

A Minimax Theory for Adaptive Data Analysis

In adaptive data analysis, the user makes a sequence of queries on the d...
research
09/12/2019

On the Hardness of Robust Classification

It is becoming increasingly important to understand the vulnerability of...
research
02/17/2023

Subsampling Suffices for Adaptive Data Analysis

Ensuring that analyses performed on a dataset are representative of the ...
