Silent Killer: Optimizing Backdoor Trigger Yields a Stealthy and Powerful Data Poisoning Attack

01/05/2023
by   Tzvi Lederer, et al.

We propose a stealthy and powerful backdoor attack on neural networks based on data poisoning (DP). In contrast to previous attacks, both the poison and the trigger in our method are stealthy. We are able to change the model's classification of samples from a source class to a target class chosen by the attacker. We do so using a small number of poisoned training samples with nearly imperceptible perturbations, without changing their labels. At inference time, we add a stealthy perturbation to the attacked samples as a trigger. This perturbation is crafted as a universal adversarial perturbation (UAP), and the poison is crafted using gradient alignment coupled with this trigger. Our method requires far less crafting time than previous methods and needs only a trained surrogate model, without additional retraining. Our attack achieves state-of-the-art results in terms of attack success rate while maintaining high accuracy on clean samples.
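To illustrate the gradient-alignment step, the following is a minimal NumPy sketch on a toy logistic-regression "surrogate". It is not the paper's implementation: the model, data, trigger, bound `eps`, and all variable names are illustrative assumptions. The idea it demonstrates is the one the abstract names: perturb a few clean-labeled poison samples so that the training gradient they induce aligns (in cosine similarity) with the adversarial gradient that pushes triggered source-class samples toward the target class.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5  # toy feature dimension

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_theta(w, X, y):
    # Gradient of the mean cross-entropy loss w.r.t. the weights w
    return ((sigmoid(X @ w) - y)[:, None] * X).mean(axis=0)

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

# Stand-in for a trained surrogate model's parameters
w = rng.normal(size=d)

# Source-class samples plus a fixed small trigger (stand-in for a UAP)
X_src = rng.normal(size=(8, d))
trigger = 0.1 * rng.normal(size=d)
y_target = np.ones(8)  # attacker's chosen target class

# Adversarial gradient: pushes triggered source samples toward the target class
g_adv = grad_theta(w, X_src + trigger, y_target)

# Poison samples keep their true labels; only their features are perturbed
X_poi = rng.normal(size=(4, d))
y_poi = np.zeros(4)

def alignment(delta):
    # Cosine similarity between the poisoned-batch gradient and g_adv
    g_poi = grad_theta(w, X_poi + delta.reshape(4, d), y_poi)
    return cos(g_poi, g_adv)

# Maximize alignment by finite-difference gradient ascent on delta,
# clipped to an L-inf ball of radius eps to keep the poison near-imperceptible
eps, lr, h = 0.2, 0.5, 1e-5
delta = np.zeros(4 * d)
for _ in range(200):
    g = np.zeros_like(delta)
    for i in range(delta.size):
        e = np.zeros_like(delta)
        e[i] = h
        g[i] = (alignment(delta + e) - alignment(delta - e)) / (2 * h)
    delta = np.clip(delta + lr * g, -eps, eps)

print("alignment before:", round(float(alignment(np.zeros_like(delta))), 3))
print("alignment after: ", round(float(alignment(delta)), 3))
```

In the full attack the same objective is optimized with autodiff over a deep surrogate network, and the trigger itself is first crafted as a UAP; the toy finite-difference loop above only makes the alignment objective concrete.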


