Not All Poisons are Created Equal: Robust Training against Data Poisoning

10/18/2022
by Yu Yang, et al.

Data poisoning causes misclassification of test-time target examples by injecting maliciously crafted samples into the training data. Existing defenses are often effective only against a specific type of targeted attack, significantly degrade generalization performance, or are prohibitively expensive for standard deep learning pipelines. In this work, we propose an efficient defense mechanism that significantly reduces the success rate of various data poisoning attacks and provides theoretical guarantees for the performance of the model. Targeted attacks work by adding bounded perturbations to a randomly selected subset of training data so that the poisons match the targets' gradient or representation. We show that: (i) under bounded perturbations, only a small number of poisons can be optimized to have a gradient close enough to that of the target to make the attack successful; (ii) such effective poisons move away from their original class and become isolated in gradient space; (iii) dropping examples in low-density gradient regions during training successfully eliminates the effective poisons and guarantees training dynamics similar to those of training on the full data. Our extensive experiments show that our method significantly decreases the success rate of state-of-the-art targeted attacks, including Gradient Matching and Bullseye Polytope, and easily scales to large datasets.
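The idea in point (iii) can be illustrated with a short sketch. The snippet below is a minimal, hypothetical illustration, not the authors' released implementation: assuming a PyTorch classifier, it approximates each example's gradient by the cross-entropy gradient with respect to the logits (a common low-dimensional proxy), estimates density in that gradient space via the distance to the k-th nearest neighbor, and flags the most isolated fraction of examples for removal. The function names and the hyperparameters k and drop_frac are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn.functional as F

def last_layer_grad_features(model, loader, device="cpu"):
    """Proxy for per-example gradients: the cross-entropy gradient w.r.t. the
    logits, i.e. softmax(logits) - one_hot(label). Returns (num_examples, num_classes).
    This is an illustrative approximation, not the paper's exact feature choice."""
    model.eval()
    feats = []
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            logits = model(x)  # assumes the model returns raw class logits
            p = F.softmax(logits, dim=1)
            onehot = F.one_hot(y, num_classes=logits.size(1)).float()
            feats.append((p - onehot).cpu())
    return torch.cat(feats)

def low_density_indices(grad_feats, k=5, drop_frac=0.1):
    """Flag the drop_frac fraction of examples whose k-th nearest neighbor in
    gradient space is farthest away, i.e. the most isolated (low-density) points.
    Uses a full pairwise distance matrix, so it is only meant for modest dataset sizes."""
    d = torch.cdist(grad_feats, grad_feats)                 # pairwise Euclidean distances
    knn_dist = d.topk(k + 1, largest=False).values[:, -1]   # distance to k-th neighbor (excludes self at distance 0)
    n_drop = int(drop_frac * len(grad_feats))
    return knn_dist.topk(n_drop).indices                    # indices of suspected effective poisons
```

In a full training loop, one would recompute these indices periodically (for example every few epochs) and exclude the flagged examples from the data sampler, so that training on the remaining examples stays close to training on the full dataset.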


Related research

08/14/2022
Friendly Noise against Adversarial Noise: A Powerful Defense against Data Poisoning Attacks
A powerful category of data poisoning attacks modify a subset of trainin...

09/15/2023
HINT: Healthy Influential-Noise based Training to Defend against Data Poisoning Attacks
While numerous defense methods have been proposed to prohibit potential ...

09/04/2020
Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching
Data Poisoning attacks involve an attacker modifying training data to ma...

07/05/2020
Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks
Recent studies have shown that DNNs can be compromised by backdoor attac...

10/08/2021
Dyn-Backdoor: Backdoor Attack on Dynamic Link Prediction
Dynamic link prediction (DLP) makes graph prediction based on historical...

06/30/2020
Model-Targeted Poisoning Attacks: Provable Convergence and Certified Bounds
Machine learning systems that rely on training data collected from untru...

11/01/2021
Indiscriminate Poisoning Attacks Are Shortcuts
Indiscriminate data poisoning attacks, which add imperceptible perturbat...
