Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples

05/24/2016
by Nicolas Papernot, et al.

Many machine learning models are vulnerable to adversarial examples: inputs that are specially crafted to cause a machine learning model to produce an incorrect output. Adversarial examples that affect one model often affect another model, even if the two models have different architectures or were trained on different training sets, so long as both models were trained to perform the same task. An attacker may therefore train their own substitute model, craft adversarial examples against the substitute, and transfer them to a victim model, with very little information about the victim. Recent work has further developed a technique that uses the victim model as an oracle to label a synthetic training set for the substitute, so the attacker need not even collect a training set to mount the attack. We extend these recent techniques using reservoir sampling to greatly enhance the efficiency of the training procedure for the substitute model. We introduce new transferability attacks between previously unexplored (substitute, victim) pairs of machine learning model classes, most notably SVMs and decision trees. We demonstrate our attacks on two commercial machine learning classification systems from Amazon (96.19% misclassification rate) and Google (88.94%) using only 800 queries of the victim model, thereby showing that existing machine learning approaches are in general vulnerable to systematic black-box attacks regardless of their structure.
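The two ingredients of the attack described above, training a substitute against the victim-as-oracle and using reservoir sampling to bound the cost of each augmentation round, can be made concrete with a short sketch. The code below is illustrative only: `oracle` (a callable returning the victim's labels for a batch of inputs, assumed to cover at least two classes), `train_substitute`, and the random-noise augmentation are assumptions made for the sake of a runnable example; the paper itself steers augmentation with the substitute's Jacobian. The reservoir update is standard Algorithm R.

```python
import random

import numpy as np
from sklearn.linear_model import LogisticRegression


def reservoir_update(reservoir, item, seen, k, rng):
    """Algorithm R: admit the `seen`-th stream item into a size-k
    reservoir with probability k / seen, keeping a uniform sample."""
    if len(reservoir) < k:
        reservoir.append(item)
    else:
        j = rng.randint(0, seen - 1)
        if j < k:
            reservoir[j] = item


def train_substitute(oracle, seeds, rounds=5, k=200, step=0.1, seed=0):
    """Fit a substitute model by querying a black-box `oracle` (a callable
    mapping a batch of inputs to labels) on a reservoir of synthetic
    points, so the query budget per round stays fixed at k instead of
    growing with every augmentation round."""
    rng = random.Random(seed)
    npr = np.random.default_rng(seed)
    reservoir = [np.asarray(x, dtype=float) for x in seeds][:k]
    seen = len(reservoir)
    substitute = LogisticRegression(max_iter=1000)

    for _ in range(rounds):
        X = np.vstack(reservoir)
        substitute.fit(X, oracle(X))   # imitate the victim's labels

        # New synthetic points near the current set. The paper steers
        # this step with the substitute's Jacobian; signed random noise
        # is a deliberate simplification here.
        for x in X + step * np.sign(npr.standard_normal(X.shape)):
            seen += 1
            reservoir_update(reservoir, x, seen, k, rng)

    X = np.vstack(reservoir)
    substitute.fit(X, oracle(X))       # final fit on the sampled set
    return substitute
```

Once trained, the substitute can be attacked with any white-box method (for example, the fast gradient sign method when the substitute is differentiable), and the resulting adversarial examples transferred to the victim.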


Related Research

08/17/2017 · Machine Learning as an Adversarial Service: Learning Black-Box Adversarial Examples
Neural networks are known to be vulnerable to adversarial examples, inpu...

11/17/2020 · Generating universal language adversarial examples by understanding and enhancing the transferability across neural models
Deep neural network models are vulnerable to adversarial attacks. In man...

11/20/2018 · Intermediate Level Adversarial Attack for Enhanced Transferability
Neural networks are vulnerable to adversarial examples, malicious inputs...

03/28/2020 · DaST: Data-free Substitute Training for Adversarial Attacks
Machine learning models are vulnerable to adversarial examples. For the ...

12/04/2020 · Practical No-box Adversarial Attacks against DNNs
The study of adversarial vulnerabilities of deep neural networks (DNNs) ...

06/12/2023 · When Vision Fails: Text Attacks Against ViT and OCR
While text-based machine learning models that operate on visual inputs o...

06/18/2021 · Bad Characters: Imperceptible NLP Attacks
Several years of research have shown that machine-learning systems are v...
