Stolen Memories: Leveraging Model Memorization for Calibrated White-Box Membership Inference

by Klas Leino, et al.

Membership inference (MI) attacks exploit a learned model's lack of generalization to infer whether a given sample was in the model's training set. Known MI attacks generally work by casting the attacker's goal as a supervised learning problem, training an attack model from predictions generated by the target model, or by others like it. However, we find that these attacks do not often provide a meaningful basis for confidently inferring training set membership, as the attack models are not well-calibrated. Moreover, these attacks do not significantly outperform a trivial attack that predicts that a point is a member if and only if the model correctly predicts its label. In this work, we present well-calibrated MI attacks that allow the attacker to accurately control the minimum confidence with which positive membership inferences are made. Our attacks take advantage of white-box information about the target model and leverage new insights about how overfitting occurs in deep neural networks; namely, we show how a model's idiosyncratic use of features can provide evidence for membership. Experiments on seven real-world datasets show that our attacks support calibration for high-confidence inferences, while outperforming previous MI attacks in terms of accuracy. Finally, we show that our attacks achieve non-trivial advantage on some models with low generalization error, including those trained with small-epsilon differential privacy; for large epsilon (epsilon = 16, as reported in some industrial settings), the attack performs comparably to unprotected models.
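The trivial baseline mentioned in the abstract, which guesses "member" if and only if the target model predicts the point's label correctly, can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy labels below are hypothetical, chosen only to show the decision rule and how its accuracy would be measured against known membership.

```python
def baseline_membership_guess(predicted_labels, true_labels):
    """Guess 'member' exactly when the target model's prediction is correct."""
    return [p == t for p, t in zip(predicted_labels, true_labels)]


def attack_accuracy(guesses, is_member):
    """Fraction of points whose membership guess matches the ground truth."""
    correct = sum(g == m for g, m in zip(guesses, is_member))
    return correct / len(is_member)


# Hypothetical toy data: the first three points are training members,
# the last three are held-out non-members.
predicted = [0, 1, 1, 0, 2, 2]   # target model's predicted labels
true = [0, 1, 1, 1, 2, 0]        # ground-truth labels
is_member = [True, True, True, False, False, False]

guesses = baseline_membership_guess(predicted, true)
accuracy = attack_accuracy(guesses, is_member)
print(guesses, accuracy)
```

Because the rule only sees whether the prediction is correct, it gives the attacker no way to tune how confident a positive inference is, which is the gap the calibrated attacks described in the abstract aim to address.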






