Indiscriminate Poisoning Attacks Are Shortcuts

11/01/2021
by Da Yu, et al.

Indiscriminate data poisoning attacks, which add imperceptible perturbations to training data to maximize the test error of trained models, have become a trendy topic because they are thought to be capable of preventing unauthorized use of data. In this work, we investigate why these perturbations work in principle. We find that the perturbations of advanced poisoning attacks are almost linearly separable when assigned the target labels of the corresponding samples, and hence can act as shortcuts for the learning objective. This important population property has not been revealed before. Moreover, we verify that linear separability is indeed the workhorse of these poisoning attacks: we synthesize linearly separable data as perturbations and show that such synthetic perturbations are as powerful as the deliberately crafted attacks. Our finding suggests that the shortcut learning problem is more serious than previously believed, as deep learning relies heavily on shortcuts even when they are of imperceptible scale and mixed with the normal features. It also suggests that pre-trained feature extractors can effectively disable these poisoning attacks.
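To illustrate the linear-separability claim, the following minimal sketch (not the authors' code) synthesizes class-wise perturbation patterns under an assumed L-infinity budget eps and then checks that a linear classifier separates the perturbations when they are paired with the assigned labels; all names, shapes, and constants are illustrative assumptions.

```python
# Minimal sketch: synthesize class-wise "shortcut" perturbations and measure
# how linearly separable they are under the assigned labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
num_classes, dim, per_class, eps = 10, 3 * 32 * 32, 200, 8 / 255

# One fixed random sign pattern per class, scaled to an imperceptible budget.
class_patterns = eps * np.sign(rng.uniform(-1.0, 1.0, size=(num_classes, dim)))

# Assign each synthetic sample its class pattern plus small jitter.
labels = np.repeat(np.arange(num_classes), per_class)
perturbations = class_patterns[labels]
perturbations += 0.1 * eps * rng.normal(size=perturbations.shape)

# Fit a linear classifier on the perturbations alone. High accuracy here means
# the perturbations, paired with their labels, form a linear shortcut that a
# network could latch onto instead of the real image features.
clf = LogisticRegression(max_iter=1000).fit(perturbations, labels)
print("linear separability (train accuracy):", clf.score(perturbations, labels))
```

In this toy setting the perturbations are separable by construction; the paper's observation is that the perturbations produced by advanced poisoning attacks behave the same way when measured like this.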

Related research

03/18/2021  TOP: Backdoor Detection in Neural Networks via Transferability of Perturbation
07/17/2019  Adversarial Security Attacks and Perturbations on Machine Learning and Deep Learning Methods
09/25/2019  Mixup Inference: Better Exploiting Mixup to Defend Adversarial Attacks
12/01/2021  Adv-4-Adv: Thwarting Changing Adversarial Perturbations via Adversarial Domain Adaptation
02/11/2022  Using Random Perturbations to Mitigate Adversarial Attacks on Sentiment Analysis Models
10/18/2022  Not All Poisons are Created Equal: Robust Training against Data Poisoning
05/30/2023  What Can We Learn from Unlearnable Datasets?
