Adversarial Purification through Representation Disentanglement

10/15/2021
by Tao Bai, et al.

Deep learning models are vulnerable to adversarial examples and make incomprehensible mistakes, which poses a threat to their real-world deployment. Combined with the idea of adversarial training, preprocessing-based defenses are popular and convenient to use because of their task independence and good generalizability. Current defense methods, especially purification, tend to remove "noise" by learning and recovering the natural images. However, unlike random noise, adversarial patterns are much more easily overfitted during model training due to their strong correlation with the images. In this work, we propose a novel adversarial purification scheme that disentangles natural images from adversarial perturbations as a preprocessing defense. Extensive experiments show that our defense generalizes well and provides significant protection against unseen strong adversarial attacks. It reduces the success rate of state-of-the-art ensemble attacks from 61.7% to 14.9% on average, outperforming a number of existing methods. Notably, our defense restores the perturbed images perfectly and does not hurt the clean accuracy of backbone models, which is highly desirable in practice.
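The abstract leaves the network details unspecified, so the following PyTorch sketch only illustrates the general idea: split an adversarial input into a natural-image component and a perturbation component, then hand only the former to the downstream classifier. The `Disentangler` module, its two decoder heads, and the composition loss below are assumptions made for illustration, not the paper's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Disentangler(nn.Module):
    """Hypothetical purifier: a shared encoder with two decoder heads that
    split an adversarial input x_adv into a natural-image estimate x_nat
    and a perturbation estimate delta, so that x_nat + delta ~ x_adv."""

    def __init__(self, channels: int = 3, width: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
        )
        # Two heads: one reconstructs the clean image, the other the perturbation.
        self.image_head = nn.Conv2d(width, channels, 3, padding=1)
        self.pert_head = nn.Conv2d(width, channels, 3, padding=1)

    def forward(self, x_adv: torch.Tensor):
        h = self.encoder(x_adv)
        x_nat = torch.sigmoid(self.image_head(h))  # clean-image estimate in [0, 1]
        delta = torch.tanh(self.pert_head(h))      # bounded perturbation estimate
        return x_nat, delta

def disentanglement_loss(x_adv, x_nat_pred, delta_pred, x_nat_true):
    """Illustrative training objective (assumed, not taken from the paper):
    reconstruct the clean image, and require the two estimated components
    to re-compose the adversarial input."""
    recon = F.mse_loss(x_nat_pred, x_nat_true)
    compose = F.mse_loss(x_nat_pred + delta_pred, x_adv)
    return recon + compose

# At inference time the purifier is a drop-in preprocessing step:
# the downstream classifier only ever sees the purified image.
purifier = Disentangler()
x_adv = torch.rand(1, 3, 32, 32)   # stand-in adversarial input
x_purified, _ = purifier(x_adv)
```

Because purification happens entirely before classification, a defense of this kind is task-independent: the same purifier can sit in front of any backbone model without retraining it, which matches the abstract's claim that clean accuracy is preserved.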

