Diffusion-Based Adversarial Sample Generation for Improved Stealthiness and Controllability

05/25/2023
by   Haotian Xue, et al.

Neural networks are known to be susceptible to adversarial samples: small variations of natural examples crafted to deliberately mislead the models. While adversarial samples can be easily generated with gradient-based techniques in both digital and physical scenarios, they often differ greatly from the actual data distribution of natural images, resulting in a trade-off between attack strength and stealthiness. In this paper, we propose a novel framework dubbed Diffusion-Based Projected Gradient Descent (Diff-PGD) for generating realistic adversarial samples. By exploiting a gradient guided by a diffusion model, Diff-PGD ensures that adversarial samples remain close to the original data distribution while maintaining their effectiveness. Moreover, our framework can be easily customized for specific tasks such as digital attacks, physical-world attacks, and style-based attacks. Compared with existing methods for generating natural-style adversarial samples, our framework separates the optimization of the adversarial loss from other surrogate losses (e.g., content/smoothness/style loss), making it more stable and controllable. Finally, we demonstrate that samples generated with Diff-PGD have better transferability and anti-purification power than those from traditional gradient-based methods. Code will be released at https://github.com/xavihart/Diff-PGD
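To make the core idea concrete, below is a minimal, hypothetical sketch of a PGD loop whose gradient is routed through a diffusion-based purification step, so the attack is optimized against a sample that has been pulled back toward the natural image manifold. This is not the authors' released code: the `purify` callable is an assumed stand-in for an SDEdit-style routine (add a small amount of noise, then denoise with a pretrained diffusion model), and it must be differentiable for gradients to flow through it.

```python
# Sketch of a diffusion-guided PGD attack (Diff-PGD-style), assuming a
# differentiable `purify` function supplied by the user. Not the paper's code.
import torch
import torch.nn.functional as F


def diff_pgd(x, y, classifier, purify, eps=8 / 255, alpha=2 / 255, steps=10):
    """Craft adversarial examples for an image batch `x` in [0, 1] with labels `y`.

    The adversarial loss is computed on purify(x_adv), so the perturbation is
    optimized to survive projection back toward the data distribution.
    """
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        x_pur = purify(x_adv)                      # diffusion-guided projection toward natural images
        loss = F.cross_entropy(classifier(x_pur), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                      # ascend the adversarial loss
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)    # L_inf projection onto the eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)
    # Return the purified adversarial sample, which stays close to the data manifold.
    return purify(x_adv).detach()
```

In this sketch the surrogate losses (content, smoothness, style) never enter the update; only the adversarial loss is optimized, while realism is enforced implicitly by the purification step in the gradient path.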


