A New Approach to Adaptive Data Analysis and Learning via Maximal Leakage

03/05/2019
by Amedeo Roberto Esposito, et al.

There is increasing concern that most currently published research findings are false. The main cause seems to lie in a fundamental disconnect between theory and practice in data analysis: theory typically relies on statistical independence, whereas practice is an inherently adaptive process in which new hypotheses are formulated based on the outcomes of previous analyses. A recent line of work tries to mitigate these issues by enforcing constraints, such as differential privacy, that compose adaptively while degrading gracefully and thus provide statistical guarantees even in adaptive contexts. We introduce a new approach based on Maximal Leakage, an information-theoretic measure of information leakage. Our main result relates the probability of an event in the adaptive setting to its probability in the non-adaptive setting. The bound we derive generalizes the bounds used in non-adaptive scenarios (e.g., McDiarmid's inequality for c-sensitive functions, false discovery error control via significance level, etc.) and allows us to replicate or even improve, in certain regimes, the results obtained using Max-Information or Differential Privacy. In contrast with the line of work started by Dwork et al., our results do not rely on Differential Privacy but are, in principle, applicable to every algorithm with bounded leakage, including differentially private algorithms and those with a short description length.
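To make the idea of bounding an adaptive probability by a non-adaptive one concrete, here is a minimal Python sketch. It rests on assumptions the abstract does not state. For a finite channel from X to Y in which every input has positive probability, it uses the standard closed form for maximal leakage, L(X→Y) = log Σ_y max_x P(y|x). It also assumes the main bound takes the multiplicative form P(E) ≤ exp(L(X→Y)) · max_y P_X(E_y), where E_y is the event with the adaptive output fixed to y. The function names `maximal_leakage`, `hoeffding_tail`, and `adaptive_bound` are hypothetical, and Hoeffding's inequality stands in here for a generic non-adaptive tail bound.

```python
import math


def maximal_leakage(channel):
    """Maximal leakage (in nats) of a finite channel.

    channel[x][y] is P(Y = y | X = x). Assumes every input x has
    positive probability, so the leakage depends only on the channel:
    L(X -> Y) = log( sum over y of max over x of P(y | x) ).
    """
    n_outputs = len(channel[0])
    column_maxima = (max(row[y] for row in channel) for y in range(n_outputs))
    return math.log(sum(column_maxima))


def hoeffding_tail(n_samples, deviation):
    """Non-adaptive two-sided Hoeffding bound for the mean of n_samples
    values in [0, 1] deviating from its expectation by more than deviation."""
    return 2.0 * math.exp(-2.0 * n_samples * deviation ** 2)


def adaptive_bound(non_adaptive_bound, leakage_nats):
    """Assumed form of the adaptive bound: inflate the worst-case
    non-adaptive probability by exp(leakage), capped at 1."""
    return min(1.0, math.exp(leakage_nats) * non_adaptive_bound)


if __name__ == "__main__":
    # Binary symmetric channel with crossover probability 0.1.
    bsc = [[0.9, 0.1], [0.1, 0.9]]
    leak = maximal_leakage(bsc)
    base = hoeffding_tail(n_samples=1000, deviation=0.1)
    print("leakage (nats):", leak)
    print("non-adaptive bound:", base)
    print("adaptive bound:", adaptive_bound(base, leak))
```

Under this closed form, a channel whose output ignores its input has zero leakage, so the adaptive bound equals the non-adaptive one. A noiseless channel on k inputs has leakage log k, which inflates the bound by a factor of k, the same factor a union bound over k candidate analyses would give.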


