Defending Neural Backdoors via Generative Distribution Modeling

10/10/2019
by   Ximing Qiao, et al.

Neural backdoor attacks are emerging as a severe security threat to deep learning, while the capability of existing defense methods remains limited, especially against complex backdoor triggers. Here, we explore the space formed by the pixel values of all possible backdoor triggers. The original trigger used by an attacker to build the backdoored model represents only a single point in this space; it is then generalized into a distribution of valid triggers, all of which can influence the backdoored model. Thus, previous methods that model only one point of the trigger distribution are not sufficient. Capturing the entire trigger distribution, e.g., via generative modeling, is key to effective defense. However, existing generative modeling techniques for image generation are not applicable to the backdoor scenario because the trigger distribution is completely unknown. In this work, we propose the max-entropy staircase approximator (MESA), an algorithm for high-dimensional sampling-free generative modeling, and use it to recover the trigger distribution. We also develop a defense technique to remove the triggers from the backdoored model. Our experiments on the CIFAR-10 dataset demonstrate the effectiveness of MESA in modeling the trigger distribution and the robustness of the proposed defense method.
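The abstract's central observation, that valid triggers form a distribution rather than a single point, can be illustrated with a toy sketch. This is not the paper's MESA algorithm; it is a hypothetical NumPy-only stand-in for a backdoored classifier, which fires the target label whenever an input patch is sufficiently similar to a secret trigger. Sampling perturbed copies of the original trigger shows that a whole neighborhood of patches also activates the backdoor:

```python
import numpy as np

# Toy stand-in (NOT the paper's model): a "backdoored" predictor that
# outputs the attacker's target label whenever a 3x3 patch has high
# cosine similarity with a secret trigger planted by the attacker.
rng = np.random.default_rng(0)
secret_trigger = rng.random((3, 3))  # the attacker's original trigger

def backdoor_fires(patch):
    """Return True if the patch activates the (toy) backdoor."""
    sim = float(np.dot(patch.ravel(), secret_trigger.ravel()))
    sim /= np.linalg.norm(patch) * np.linalg.norm(secret_trigger) + 1e-12
    return sim > 0.9  # high similarity flips the prediction

# Perturb the original trigger and count how often the backdoor still fires:
# many nearby patches remain valid triggers, i.e. they form a distribution.
hits = sum(
    backdoor_fires(np.clip(secret_trigger + 0.05 * rng.standard_normal((3, 3)), 0, 1))
    for _ in range(1000)
)
print(f"{hits}/1000 perturbed triggers still activate the backdoor")
```

A defense that reverse-engineers only the single original trigger would miss most of these variants, which is why the paper argues for modeling the full distribution.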

