Language Guided Adversarial Purification

09/19/2023
by Himanshu Singh, et al.

Adversarial purification using generative models demonstrates strong adversarial defense performance. These methods are classifier- and attack-agnostic, making them versatile but often computationally intensive. Recent strides in diffusion and score networks have improved image generation and, by extension, adversarial purification. Another highly efficient class of adversarial defense methods, known as adversarial training, requires specific knowledge of attack vectors and must be trained extensively on adversarial examples. To overcome these limitations, we introduce a new framework, Language Guided Adversarial Purification (LGAP), which utilizes pre-trained diffusion models and caption generators to defend against adversarial attacks. Given an input image, our method first generates a caption, which is then used to guide the adversarial purification process through a diffusion network. Our approach has been evaluated against strong adversarial attacks, proving its effectiveness in enhancing adversarial robustness. Our results indicate that LGAP outperforms most existing adversarial defense techniques without requiring specialized network training. This underscores the generalizability of models trained on large datasets, highlighting a promising direction for further research.
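The abstract describes a two-stage pipeline: caption the input image, then use that caption to guide a diffusion-based regeneration of the image. Below is a minimal sketch of that idea, assuming BLIP as the caption generator and a Stable Diffusion img2img pipeline as the purifier; the model checkpoints, the `strength` and `guidance_scale` values, and the `purify` helper are illustrative assumptions, not the authors' exact configuration.

```python
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration
from diffusers import StableDiffusionImg2ImgPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Off-the-shelf caption generator (assumption: BLIP base; the abstract only
# says "caption generator").
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
captioner = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base"
).to(device)

# Off-the-shelf text-conditioned diffusion model used as the purifier
# (assumption: Stable Diffusion v1.5 img2img).
purifier = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5"
).to(device)


def purify(adv_image: Image.Image) -> Image.Image:
    """Caption the (possibly adversarial) image, then regenerate it under
    that caption so the perturbation is washed out by the diffusion prior."""
    # Step 1: generate a caption from the input image.
    inputs = processor(images=adv_image, return_tensors="pt").to(device)
    ids = captioner.generate(**inputs, max_new_tokens=30)
    caption = processor.decode(ids[0], skip_special_tokens=True)

    # Step 2: partially noise the image, then denoise it with the caption as
    # guidance. `strength` (how far to noise) and `guidance_scale` are
    # assumed hyperparameters, not values from the paper.
    return purifier(
        prompt=caption, image=adv_image, strength=0.3, guidance_scale=7.5
    ).images[0]


# The purified image is handed to any downstream classifier unchanged, which
# is what keeps the defense classifier- and attack-agnostic.
```

Because both components are pre-trained and frozen, no adversarial examples or classifier gradients are needed at any point, consistent with the abstract's claim that LGAP requires no specialized network training.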

