Deep Partition Aggregation: Provable Defense against General Poisoning Attacks

06/26/2020
by Alexander Levine, et al.

Adversarial poisoning attacks distort training data in order to corrupt the test-time behavior of a classifier. A provable defense provides a certificate for each test sample: a lower bound on the magnitude of any adversarial distortion of the training set that could corrupt that sample's classification. We propose two provable defenses against poisoning attacks: (i) Deep Partition Aggregation (DPA), a certified defense against a general poisoning threat model, defined as the insertion or deletion of a bounded number of samples in the training set (by implication, this threat model also includes arbitrary distortions of a bounded number of images and/or labels); and (ii) Semi-Supervised DPA (SS-DPA), a certified defense against label-flipping poisoning attacks. DPA is an ensemble method in which base models are trained on partitions of the training set determined by a hash function. It is related to subset aggregation, a well-studied ensemble method in classical machine learning, and can also be viewed as an extension of randomized ablation (Levine & Feizi, 2020a), a certified defense against sparse evasion attacks, to the poisoning domain. Our label-flipping defense, SS-DPA, uses a semi-supervised learning algorithm as its base classifier: each base classifier is trained on the entire unlabeled training set together with the labels of a single partition. SS-DPA outperforms the existing certified defense against label-flipping attacks (Rosenfeld et al., 2020), certifying >= 50% of test images against 675 label flips on MNIST (vs. < 200 label flips with the existing defense) and against 83 label flips on CIFAR-10. Against general poisoning attacks, for which no prior certified defense exists, DPA certifies >= 50% of test images against over 500 poison image insertions on MNIST and against nine insertions on CIFAR-10. These results establish new state-of-the-art provable defenses against poisoning attacks.
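To make the partition-and-vote mechanism concrete, below is a minimal Python sketch of the DPA pipeline as the abstract describes it: hash each training sample into one of k partitions, train one base classifier per partition, and classify by majority vote. The hash function, the placeholder train_base_classifier routine, and the simplified certificate (roughly half the gap between the top two vote counts, ignoring the paper's tie-breaking rule) are illustrative assumptions, not the paper's exact procedure.

import hashlib
from collections import Counter

def partition_index(sample_bytes, num_partitions):
    # Deterministic assignment of a training sample to a partition via a hash
    # of its contents (assumes the sample can be serialized to bytes).
    digest = hashlib.sha256(sample_bytes).hexdigest()
    return int(digest, 16) % num_partitions

def train_dpa(train_set, num_partitions, train_base_classifier):
    # Split the training set into disjoint partitions and train one base
    # classifier per partition; each training sample influences exactly one model.
    partitions = [[] for _ in range(num_partitions)]
    for x, y in train_set:
        partitions[partition_index(bytes(x), num_partitions)].append((x, y))
    return [train_base_classifier(part) for part in partitions]

def predict_with_certificate(classifiers, x):
    # Majority-vote prediction. Inserting or deleting r training samples can
    # change at most r base classifiers, so the prediction is provably stable
    # against any poisoning of size up to `radius`.
    votes = Counter(clf(x) for clf in classifiers)
    ranked = votes.most_common()
    top_class, top_count = ranked[0]
    runner_up_count = ranked[1][1] if len(ranked) > 1 else 0
    # Conservative certificate: omits the tie-breaking refinement used in the paper.
    radius = max((top_count - runner_up_count - 1) // 2, 0)
    return top_class, radius

Usage would look like classifiers = train_dpa(train_set, 50, train_base_classifier) followed by label, radius = predict_with_certificate(classifiers, x). The soundness argument is simple: because each inserted or deleted sample lands in exactly one partition, an attack of size r can flip at most r votes, and a vote gap larger than 2r therefore cannot be overturned. SS-DPA follows the same template but lets every base classifier see the whole unlabeled training set, using only its own partition's labels.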


