Privacy Risks of Securing Machine Learning Models against Adversarial Examples

by   Liwei Song, et al.
National University of Singapore
Princeton University

The arms race between attacks and defenses for machine learning models has come to a forefront in recent years, in both the security community and the privacy community. However, one big limitation of previous research is that the security domain and the privacy domain have typically been considered separately. It is thus unclear whether the defense methods in one domain will have any unexpected impact on the other domain. In this paper, we take a step towards resolving this limitation by combining the two domains. In particular, we measure the success of membership inference attacks against six state-of-the-art adversarial defense methods that mitigate adversarial examples (i.e., evasion attacks). Membership inference attacks aim to infer an individual's participation in the target model's training set and are known to be correlated with target model's overfitting and sensitivity with regard to training data. Meanwhile, adversarial defense methods aim to enhance the robustness of target models by ensuring that model predictions are unchanged for a small area around each training sample. Thus, adversarial defenses typically have a more fine-grained reliance on the training set and make the target model more vulnerable to membership inference attacks. To perform the membership inference attacks, we leverage the conventional inference method based on prediction confidence and propose two new inference methods that exploit structural properties of adversarially robust defenses. Our experimental evaluation demonstrates that compared with the natural training (undefended) approach, adversarial defense methods can indeed increase the target model's risk against membership inference attacks. When applying adversarial defenses to train the robust models, the membership inference advantage increases by up to 4.5 times compared to the naturally undefended models.


page 1

page 2

page 3

page 4


Label-Only Membership Inference Attacks

Membership inference attacks are one of the simplest forms of privacy le...

NeuGuard: Lightweight Neuron-Guided Defense against Membership Inference Attacks

Membership inference attacks (MIAs) against machine learning models can ...

On the Robustness of Ensemble-Based Machine Learning Against Data Poisoning

Machine learning is becoming ubiquitous. From financial to medicine, mac...

Segmentations-Leak: Membership Inference Attacks and Defenses in Semantic Image Segmentation

Today's success of state of the art methods for semantic segmentation is...

Leveraging Adversarial Examples to Quantify Membership Information Leakage

The use of personal data for training machine learning systems comes wit...

Membership Inference Attacks via Adversarial Examples

The raise of machine learning and deep learning led to significant impro...

Canary in a Coalmine: Better Membership Inference with Ensembled Adversarial Queries

As industrial applications are increasingly automated by machine learnin...

Please sign up or login with your details

Forgot password? Click here to reset