Membership Inference Attacks and Defenses in Supervised Learning via Generalization Gap

by   Jiacheng Li, et al.

This work studies membership inference (MI) attack against classifiers, where the attacker's goal is to determine whether a data instance was used for training the classifier. While it is known that overfitting makes classifiers susceptible to MI attacks, we showcase a simple numerical relationship between the generalization gap—the difference between training and test accuracies—and the classifier's vulnerability to MI attacks—as measured by an MI attack's accuracy gain over a random guess. We then propose to close the gap by matching the training and validation accuracies during training, by means of a new set regularizer using the Maximum Mean Discrepancy between the softmax output empirical distributions of the training and validation sets. Our experimental results show that combining this approach with another simple defense (mix-up training) significantly improves state-of-the-art defense against MI attacks, with minimal impact on testing accuracy.


Defending Model Inversion and Membership Inference Attacks via Prediction Purification

Neural networks are susceptible to data inference attacks such as the mo...

Adv-DWF: Defending Against Deep-Learning-Based Website Fingerprinting Attacks with Adversarial Traces

Website Fingerprinting (WF) is a type of traffic analysis attack that en...

Killing Three Birds with one Gaussian Process: Analyzing Attack Vectors on Classification

The wide usage of Machine Learning (ML) has lead to research on the atta...

Quantifying Membership Inference Vulnerability via Generalization Gap and Other Model Metrics

We demonstrate how a target model's generalization gap leads directly to...

A BIC based Mixture Model Defense against Data Poisoning Attacks on Classifiers

Data Poisoning (DP) is an effective attack that causes trained classifie...

NeuGuard: Lightweight Neuron-Guided Defense against Membership Inference Attacks

Membership inference attacks (MIAs) against machine learning models can ...

A Mixture Model Based Defense for Data Poisoning Attacks Against Naive Bayes Spam Filters

Naive Bayes spam filters are highly susceptible to data poisoning attack...

Please sign up or login with your details

Forgot password? Click here to reset