Intrinsic Certified Robustness of Bagging against Data Poisoning Attacks

08/11/2020
by Jinyuan Jia, et al.
In a data poisoning attack, an attacker modifies, deletes, and/or inserts some training examples to corrupt the learnt machine learning model. Bootstrap Aggregating (bagging) is a well-known ensemble learning method that trains multiple base models on random subsamples of a training dataset using a base learning algorithm and predicts the label of a testing example by majority vote among the base models. We prove the intrinsic certified robustness of bagging against data poisoning attacks. Specifically, we show that bagging with an arbitrary base learning algorithm provably predicts the same label for a testing example when the number of modified, deleted, and/or inserted training examples is bounded by a threshold. Moreover, we show that our derived threshold is tight if no assumptions on the base learning algorithm are made. We empirically evaluate our method on MNIST and CIFAR10. For instance, our method achieves a certified accuracy of 70.8% on MNIST when an attacker arbitrarily modifies, deletes, and/or inserts 100 training examples.
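The bagging procedure described in the abstract (train base models on random subsamples, then take a majority vote) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration, not the authors' implementation: the toy base learner `train_nearest_mean`, the 1-D features, the choice to draw each subsample with replacement, the parameters `k` and `n_models`, and the tie-breaking rule. The sketch does not compute the certified threshold from the paper.

```python
import random
from collections import Counter


def train_nearest_mean(subsample):
    """Hypothetical base learner: remembers the mean 1-D feature per label."""
    sums, counts = {}, Counter()
    for x, y in subsample:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] += 1
    means = {y: sums[y] / counts[y] for y in sums}
    # Predict the label whose mean is closest; ties go to the smaller label.
    return lambda x: min(means, key=lambda y: (abs(x - means[y]), y))


def bagging_predict(train_set, x, k=5, n_models=51, seed=0,
                    base_learner=train_nearest_mean):
    """Train n_models base models on random subsamples and majority-vote on x.

    Assumption: each subsample holds k examples drawn uniformly at random
    with replacement from train_set.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        subsample = [rng.choice(train_set) for _ in range(k)]
        model = base_learner(subsample)
        votes[model(x)] += 1
    # Majority vote; ties go to the smaller label (an illustrative choice).
    label = min(votes, key=lambda y: (-votes[y], y))
    return label, votes


if __name__ == "__main__":
    # Toy training set of (feature, label) pairs.
    train_set = [(0.0, 0), (0.1, 0), (0.2, 0), (1.0, 1), (1.1, 1), (1.2, 1)]
    label, votes = bagging_predict(train_set, 0.05)
    print("predicted label:", label, "votes:", dict(votes))
```

Because each base model sees only `k` examples, a small number of poisoned training examples reaches only a fraction of the subsamples, which is the intuition behind why the vote can remain unchanged under bounded poisoning.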

research
12/07/2020

Certified Robustness of Nearest Neighbors against Data Poisoning Attacks

Data poisoning attacks aim to corrupt a machine learning model via modif...
research
02/03/2021

Provably Secure Federated Learning against Malicious Clients

Federated learning enables clients to collaboratively learn a shared glo...
research
10/29/2019

Distribution Density, Tails, and Outliers in Machine Learning: Metrics and Applications

We develop techniques to quantify the degree to which a given (training ...
research
09/02/2021

Excess Capacity and Backdoor Poisoning

A backdoor data poisoning attack is an adversarial attack wherein the at...
research
06/26/2020

Deep Partition Aggregation: Provable Defense against General Poisoning Attacks

Adversarial poisoning attacks distort training data in order to corrupt ...
research
07/28/2022

Efficient Model Finetuning for Text Classification via Data Filtering

As model finetuning is central to the modern NLP, we set to maximize its...
research
05/26/2022

On Collective Robustness of Bagging Against Data Poisoning

Bootstrap aggregating (bagging) is an effective ensemble protocol, which...
