On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection

06/27/2023
by Songyang Gao, et al.

Detecting adversarial samples that are carefully crafted to fool a model is a critical step toward security-sensitive applications. However, existing adversarial detection methods require access to sufficient training data, which raises notable concerns about privacy leakage and limits their generalizability. In this work, we show that adversarial samples generated by attack algorithms are strongly correlated with a specific vector in the high-dimensional input space. Such vectors, known as Universal Adversarial Perturbations (UAPs), can be computed without the original training data. Building on this observation, we propose a data-agnostic adversarial detection framework that induces different responses to UAPs from normal and adversarial samples. Experimental results show that our method achieves competitive detection performance on various text classification tasks while keeping inference time on par with standard inference.
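The core test described in the abstract admits a compact sketch: perturb an input along a precomputed UAP and measure how much the classifier's prediction shifts, on the premise that adversarial inputs, which already lie along the UAP direction, respond differently than clean ones. The sketch below is a minimal, hypothetical PyTorch rendering of that idea, not the authors' exact method; the function names, the KL-divergence response measure, the additive embedding-space perturbation, and the threshold calibration are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def uap_response_score(model: torch.nn.Module,
                       embeds: torch.Tensor,
                       uap: torch.Tensor,
                       eps: float = 1.0) -> torch.Tensor:
    """Response of a classifier to a shift along a precomputed UAP.

    Assumes `model` maps input embeddings of shape (batch, seq, dim)
    to logits of shape (batch, num_classes), and `uap` is broadcastable
    to `embeds` (e.g., shape (dim,)). The KL-divergence measure and the
    additive perturbation are illustrative choices, not the paper's
    exact detection statistic.
    """
    logits_clean = model(embeds)             # predictions on the raw input
    logits_pert = model(embeds + eps * uap)  # predictions after the UAP shift
    # KL(clean || perturbed): how far the UAP moves the prediction.
    return F.kl_div(
        F.log_softmax(logits_pert, dim=-1),
        F.softmax(logits_clean, dim=-1),
        reduction="none",
    ).sum(dim=-1)

def detect_adversarial(model: torch.nn.Module,
                       embeds: torch.Tensor,
                       uap: torch.Tensor,
                       tau: float) -> torch.Tensor:
    """Flag inputs whose UAP response exceeds a threshold `tau`.

    `tau` (a hypothetical hyperparameter) can be calibrated on clean
    held-out inputs alone, keeping the pipeline free of adversarial
    examples and of the original training data.
    """
    return uap_response_score(model, embeds, uap) > tau
```

Because the score needs only two forward passes per input, detection in this sketch runs in roughly the same time as ordinary inference, consistent with the efficiency claim in the abstract.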

Related research

- 01/24/2018 · Generalizable Data-free Objective for Crafting Universal Adversarial Perturbations
  Machine learning models are susceptible to adversarial perturbations: sm...

- 07/18/2017 · Fast Feature Fool: A data independent approach to universal adversarial perturbations
  State-of-the-art object recognition Convolutional Neural Networks (CNNs)...

- 09/14/2021 · Improving Gradient-based Adversarial Training for Text Classification by Contrastive Learning and Auto-Encoder
  Recent work has proposed several efficient approaches for generating gra...

- 09/12/2023 · Quality-Agnostic Deepfake Detection with Intra-model Collaborative Learning
  Deepfake has recently raised a plethora of societal concerns over its po...

- 06/21/2023 · Universal adversarial perturbations for multiple classification tasks with quantum classifiers
  Quantum adversarial machine learning is an emerging field that studies t...

- 11/30/2022 · Towards Interpreting Vulnerability of Multi-Instance Learning via Customized and Universal Adversarial Perturbations
  Multi-instance learning (MIL) is a great paradigm for dealing with compl...

- 04/01/2021 · Normal vs. Adversarial: Salience-based Analysis of Adversarial Samples for Relation Extraction
  Recent neural-based relation extraction approaches, though achieving pro...
