How to Sift Out a Clean Data Subset in the Presence of Data Poisoning?

10/12/2022
by   Yi Zeng, et al.

Given the volume of data needed to train modern machine learning models, external suppliers are increasingly used. However, incorporating external data poses data poisoning risks, wherein attackers manipulate their data to degrade model utility or integrity. Most poisoning defenses presume access to a set of clean data (or base set). While this assumption has been taken for granted, the fast-growing body of research on stealthy poisoning attacks raises a question: can defenders really identify a clean subset within a contaminated dataset to support defenses? This paper starts by examining the impact of poisoned samples on defenses when they are mistakenly mixed into the base set. We analyze five defenses and find that their performance deteriorates dramatically with less than 1% poisoned points in the base set. These findings suggest that sifting out a base set with high precision is key to these defenses' performance. Motivated by these observations, we study how precise existing automated tools and human inspection are at identifying clean data in the presence of data poisoning. Unfortunately, neither effort achieves the precision needed. Worse yet, many of the outcomes are worse than random selection. In addition to uncovering the challenge, we propose a practical countermeasure, Meta-Sift. Our method is based on the insight that poisoned samples crafted by existing attacks are shifted away from the clean data distribution. Hence, training on the clean portion of a dataset and testing on the corrupted portion will result in high prediction loss. Leveraging this insight, we formulate a bilevel optimization to identify clean data and further introduce a suite of techniques to improve efficiency and precision. Our evaluation shows that Meta-Sift can sift a clean base set with 100% precision under a wide range of poisoning attacks. The selected base set is large enough to give rise to successful defenses.
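The insight above lends itself to a simple illustration: a model fitted on a mostly-clean split of the data tends to assign noticeably higher loss to poisoned samples it has not seen. The sketch below captures that idea only; it is not the paper's Meta-Sift method (which formulates a bilevel optimization plus additional efficiency and precision techniques). The helper name sift_low_loss_subset, the logistic-regression scorer, and the feature/label arrays are illustrative assumptions.

# Minimal sketch of the "train on one split, score the other by loss" insight.
# NOT the authors' Meta-Sift implementation; the model choice and names are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def sift_low_loss_subset(features, labels, base_set_size, n_rounds=10, seed=0):
    """Score each sample by its average held-out cross-entropy over random splits,
    then return the indices of the `base_set_size` lowest-loss samples.
    Assumes integer labels 0..K-1 with every class present in each training half."""
    rng = np.random.default_rng(seed)
    n = len(labels)
    loss_sum = np.zeros(n)
    loss_cnt = np.zeros(n)
    for _ in range(n_rounds):
        held_out = rng.random(n) < 0.5                  # random half is held out
        model = LogisticRegression(max_iter=1000)
        model.fit(features[~held_out], labels[~held_out])
        probs = model.predict_proba(features[held_out])
        # per-sample cross-entropy of the held-out half under the fitted model
        per_sample = -np.log(probs[np.arange(held_out.sum()), labels[held_out]] + 1e-12)
        loss_sum[held_out] += per_sample
        loss_cnt[held_out] += 1
    # samples never held out get an infinite score so they are not selected
    scores = np.where(loss_cnt > 0, loss_sum / np.maximum(loss_cnt, 1), np.inf)
    return np.argsort(scores)[:base_set_size]           # lowest loss = cleanest candidates

Meta-Sift itself replaces this random-split heuristic with the bilevel optimization described in the abstract, which searches for the clean subset directly rather than relying on random halves.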

