
Stolen Memories: Leveraging Model Memorization for Calibrated White-Box Membership Inference

by Klas Leino, et al.

Membership inference (MI) attacks exploit a learned model's lack of generalization to infer whether a given sample was in the model's training set. Known MI attacks generally work by casting the attacker's goal as a supervised learning problem: an attack model is trained on predictions generated by the target model or by models similar to it. However, we find that these attacks often fail to provide a meaningful basis for confidently inferring training set membership, as the attack models are not well calibrated. Moreover, they do not significantly outperform a trivial attack that predicts a point is a member if and only if the model correctly predicts its label. In this work we present well-calibrated MI attacks that allow the attacker to accurately control the minimum confidence with which positive membership inferences are made. Our attacks take advantage of white-box information about the target model and leverage new insights about how overfitting occurs in deep neural networks; namely, we show how a model's idiosyncratic use of features can provide evidence of membership. Experiments on seven real-world datasets show that our attacks support calibration for high-confidence inferences while outperforming previous MI attacks in accuracy. Finally, we show that our attacks achieve a non-trivial advantage on some models with low generalization error, including those trained with small-epsilon differential privacy; for large epsilon (epsilon = 16, as reported in some industrial settings), the attack performs comparably to its performance on unprotected models.
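The two attack styles the abstract contrasts can be sketched in a few lines. Below is a minimal, illustrative Python sketch: `trivial_attack` is the baseline that guesses "member" exactly when the model classifies a point correctly, and `calibrated_attack` shows the thresholding idea behind controlling the minimum confidence of positive inferences. The function names, the score values, and the threshold are hypothetical, not the paper's actual implementation.

```python
def trivial_attack(predicted_labels, true_labels):
    # Baseline MI attack: infer "member" iff the model's prediction
    # matches the true label for that point.
    return [p == y for p, y in zip(predicted_labels, true_labels)]

def calibrated_attack(membership_scores, threshold):
    # Calibrated variant: make a positive membership inference only when
    # the attack's confidence score clears a chosen minimum threshold,
    # so the attacker controls the confidence of positive calls.
    return [s >= threshold for s in membership_scores]

# Toy example (all values illustrative):
preds  = [0, 1, 1, 0]
labels = [0, 1, 0, 0]
print(trivial_attack(preds, labels))            # [True, True, False, True]

scores = [0.95, 0.60, 0.30, 0.85]
print(calibrated_attack(scores, 0.8))           # [True, False, False, True]
```

Raising the threshold trades recall for precision: with a well-calibrated score, a threshold of 0.8 means positive inferences are made only when the attack is at least 80% confident.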

