Diffusion Models for Adversarial Purification

05/16/2022
by Weili Nie, et al.

Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model. These methods make no assumptions about the form of the attack or the classification model, and can thus defend pre-existing classifiers against unseen threats. However, their performance currently falls behind that of adversarial training methods. In this work, we propose DiffPure, which uses diffusion models for adversarial purification: given an adversarial example, we first diffuse it with a small amount of noise following a forward diffusion process, and then recover the clean image through a reverse generative process. To evaluate our method against strong adaptive attacks in an efficient and scalable way, we propose using the adjoint method to compute full gradients of the reverse generative process. Extensive experiments on three image datasets (CIFAR-10, ImageNet, and CelebA-HQ) with three classifier architectures (ResNet, WideResNet, and ViT) demonstrate that our method achieves state-of-the-art results, outperforming current adversarial training and adversarial purification methods, often by a large margin. Project page: https://diffpure.github.io.
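
To make the two-stage procedure concrete, here is a minimal PyTorch sketch of diffusion-based purification. It uses a standard discrete DDPM formulation in place of the paper's continuous SDE; the `denoiser` noise-prediction network, the linear beta schedule, and the choice of `t_star` are illustrative assumptions rather than the paper's exact configuration, and the adjoint-based gradient computation used for adaptive-attack evaluation is omitted.

```python
import torch

def diffpure(x_adv, denoiser, t_star=100, T=1000, beta_min=1e-4, beta_max=0.02):
    """Diffusion-based purification sketch: noise the input, then denoise it.

    x_adv:    adversarial image batch in [-1, 1], shape (B, C, H, W)
    denoiser: pretrained noise-prediction network eps_theta(x_t, t)
              (hypothetical interface standing in for any DDPM backbone)
    t_star:   timestep the input is diffused to before denoising
    """
    # Linear beta schedule and cumulative alpha products (standard DDPM).
    betas = torch.linspace(beta_min, beta_max, T, device=x_adv.device)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    # Forward diffusion: jump to timestep t_star in closed form,
    #   x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * noise,
    # which drowns out the adversarial perturbation with Gaussian noise.
    a_bar = alpha_bars[t_star - 1]
    x = a_bar.sqrt() * x_adv + (1.0 - a_bar).sqrt() * torch.randn_like(x_adv)

    # Reverse generative process: ancestral sampling from t_star back to 0
    # recovers a clean image close to the noised input.
    for t in reversed(range(1, t_star + 1)):
        beta_t, alpha_t, a_bar_t = betas[t - 1], alphas[t - 1], alpha_bars[t - 1]
        t_batch = torch.full((x.shape[0],), t, device=x.device, dtype=torch.long)
        eps = denoiser(x, t_batch)  # predicted noise at timestep t
        mean = (x - beta_t / (1.0 - a_bar_t).sqrt() * eps) / alpha_t.sqrt()
        x = mean + beta_t.sqrt() * torch.randn_like(x) if t > 1 else mean
    return x  # purified image, handed to the downstream classifier
```

The purified output is classified as usual, so a pre-existing classifier can be plugged in unchanged. The key design choice is `t_star`: it must be large enough to wash out the adversarial perturbation but small enough to preserve the image's semantic content, which is why only a small amount of noise is added in the forward pass.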

Related research

05/24/2023 · Robust Classification via a Single Diffusion Model
Recently, diffusion models have been successfully applied to improving a...

03/27/2023 · Classifier Robustness Enhancement Via Test-Time Transformation
It has been recently discovered that adversarially trained classifiers e...

05/25/2023 · CARSO: Counter-Adversarial Recall of Synthetic Observations
In this paper, we propose a novel adversarial defence mechanism for imag...

09/19/2023 · Language Guided Adversarial Purification
Adversarial purification using generative models demonstrates strong adv...

07/31/2023 · Universal Adversarial Defense in Remote Sensing Based on Pre-trained Denoising Diffusion Models
Deep neural networks (DNNs) have achieved tremendous success in many rem...

01/12/2022 · Adversarially Robust Classification by Conditional Generative Model Inversion
Most adversarial attack defense methods rely on obfuscating gradients. T...

08/03/2018 · DeepCloak: Adversarial Crafting As a Defensive Measure to Cloak Processes
Over the past decade, side-channels have proven to be significant and pr...
