Differentially Private Learning Does Not Bound Membership Inference

by Thomas Humphries, et al.

Training machine learning models on privacy-sensitive data has become a popular practice, driving innovation in ever-expanding fields. This has opened the door to a series of new attacks, such as Membership Inference Attacks (MIAs), that exploit vulnerabilities in ML models in order to expose the privacy of individual training samples. A growing body of literature holds up Differential Privacy (DP) as an effective defense against such attacks, and companies like Google and Amazon include this privacy notion in their machine-learning-as-a-service products. However, little scrutiny has been given to how underlying correlations within the datasets used for training these models can impact the privacy guarantees provided by DP. In this work, we challenge prior findings that suggest DP provides a strong defense against MIAs. We provide theoretical and experimental evidence for cases where the theoretical bounds of DP are violated by MIAs using the same attacks described in prior work. We show this hypothetically with artificial, pathological datasets as well as with real-world datasets carefully split to create a distinction between member and non-member samples. Our findings suggest that certain properties of datasets, such as bias or data correlation, play a critical role in determining the effectiveness of DP as a privacy-preserving mechanism against MIAs. Further, it may be virtually impossible for a data analyst to determine whether a given dataset is resilient against these MIAs.
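To make "theoretical bounds of DP" concrete, the following is a minimal sketch and is not taken from the paper. It assumes the standard hypothesis-testing view of (ε, δ)-DP with independently sampled records, under which any membership test satisfies TPR − FPR ≤ (e^ε − 1 + 2δ)/(e^ε + 1). The function names `mia_advantage_bound` and `exceeds_bound` are hypothetical, as are the example attack rates. An attack measured on correlated or biased data can exceed this value, which is the kind of violation the abstract describes.

```python
import math


def mia_advantage_bound(epsilon: float, delta: float = 0.0) -> float:
    """Upper bound on (TPR - FPR) for a membership test against an
    (epsilon, delta)-DP mechanism.

    Assumption (not from the source text): records are independent, so
    the usual hypothesis-testing constraints of DP apply directly.
    """
    if epsilon < 0.0 or not 0.0 <= delta <= 1.0:
        raise ValueError("need epsilon >= 0 and 0 <= delta <= 1")
    # (e^eps - 1 + 2*delta) / (e^eps + 1), rewritten with tanh so that
    # large epsilon does not overflow math.exp.
    t = math.tanh(epsilon / 2.0)
    return t + delta * (1.0 - t)


def exceeds_bound(tpr: float, fpr: float, epsilon: float, delta: float = 0.0) -> bool:
    """True if an observed attack advantage is larger than the bound
    computed under the independence assumption."""
    return (tpr - fpr) > mia_advantage_bound(epsilon, delta)


if __name__ == "__main__":
    # Hypothetical attack rates, for illustration only.
    print(mia_advantage_bound(1.0, 1e-5))
    print(exceeds_bound(tpr=0.9, fpr=0.1, epsilon=1.0, delta=1e-5))
```

An observed advantage above the bound does not mean the DP mechanism is broken. It indicates that the independence assumption behind the bound does not hold for the member and non-member split being tested.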





Bounding Membership Inference

Differential Privacy (DP) is the de facto standard for reasoning about t...

Differentially Private Data Generative Models

Deep neural networks (DNNs) have recently been widely adopted in various...

Privacy for All: Demystify Vulnerability Disparity of Differential Privacy against Membership Inference Attack

Machine learning algorithms, when applied to sensitive data, pose a pote...

DTGAN: Differential Private Training for Tabular GANs

Tabular generative adversarial networks (TGAN) have recently emerged to ...

DP-UTIL: Comprehensive Utility Analysis of Differential Privacy in Machine Learning

Differential Privacy (DP) has emerged as a rigorous formalism to reason ...

Fairness and Cost Constrained Privacy-Aware Record Linkage

Record linkage algorithms match and link records from different database...

Feature Space Hijacking Attacks against Differentially Private Split Learning

Split learning and differential privacy are technologies with growing po...