The Space of Transferable Adversarial Examples

04/11/2017
by Florian Tramèr et al.

Adversarial examples are maliciously perturbed inputs designed to mislead machine learning (ML) models at test-time. They often transfer: the same adversarial example fools more than one model. In this work, we propose novel methods for estimating the previously unknown dimensionality of the space of adversarial inputs. We find that adversarial examples span a contiguous subspace of large (~25) dimensionality. Adversarial subspaces with higher dimensionality are more likely to intersect. We find that for two different models, a significant fraction of their subspaces is shared, thus enabling transferability. In the first quantitative analysis of the similarity of different models' decision boundaries, we show that these boundaries are actually close in arbitrary directions, whether adversarial or benign. We conclude by formally studying the limits of transferability. We derive (1) sufficient conditions on the data distribution that imply transferability for simple model classes and (2) examples of scenarios in which transfer does not occur. These findings indicate that it may be possible to design defenses against transfer-based attacks, even for models that are vulnerable to direct attacks.
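
To make the abstract's dimensionality measurement concrete, here is a minimal, self-contained sketch of one way to probe it: construct k orthonormal directions that are all aligned with the loss gradient (for k orthonormal unit vectors, the per-vector alignment with a unit gradient can be at most 1/sqrt(k), and the construction below achieves that bound), perturb the input along each, and count how many flip the prediction. The toy logistic model, the margin, epsilon, and the k values are illustrative assumptions, not the paper's experimental setup.

```python
# Sketch: estimate how many orthogonal adversarial directions exist at an
# input by building k orthonormal, gradient-aligned directions and counting
# how many of them flip the model's prediction. NumPy only.
import numpy as np

rng = np.random.default_rng(0)

def aligned_orthonormal_directions(g, k, rng):
    """Return k orthonormal rows r_i, each with <r_i, g> = ||g|| / sqrt(k)."""
    d = g.size
    z = g / np.linalg.norm(g)
    # Orthonormal basis B (d x k) of a subspace containing z, with B[:, 0] = z.
    B, _ = np.linalg.qr(np.hstack([z[:, None], rng.standard_normal((d, k - 1))]))
    B[:, 0] *= np.sign(B[:, 0] @ z)  # undo QR's possible sign flip
    # Orthogonal k x k matrix M whose first column is the constant 1/sqrt(k);
    # each of its (orthonormal) rows then has first coordinate 1/sqrt(k).
    M, _ = np.linalg.qr(np.hstack([np.ones((k, 1)) / np.sqrt(k),
                                   rng.standard_normal((k, k - 1))]))
    M[:, 0] *= np.sign(M[0, 0])
    return M @ B.T  # rows are the k directions

# Toy model (an assumption for illustration): binary logistic regression with
# random weights; x is placed at logit margin +2, confidently classified y = 1.
d = 100
w = rng.standard_normal(d)
x = rng.standard_normal(d)
x -= (w @ x - 2.0) * w / (w @ w)
y = 1

sigma = 1.0 / (1.0 + np.exp(-(w @ x)))
grad = (sigma - y) * w  # gradient of the cross-entropy loss w.r.t. x

eps = 1.0
for k in (1, 5, 10, 25, 50):
    R = aligned_orthonormal_directions(grad, k, rng)
    flips = sum(int(w @ (x + eps * r) > 0) != y for r in R)
    print(f"k = {k:2d}: {flips}/{k} orthogonal adversarial directions succeed")
```

For this linear toy model every aligned direction shifts the logit by the same amount, eps * ||w|| / sqrt(k), so successes drop off sharply once that falls below the margin; for a neural network the analogous count decays more gradually, and the largest k at which most directions still succeed serves as the dimensionality estimate the abstract refers to.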
