Two Souls in an Adversarial Image: Towards Universal Adversarial Example Detection using Multi-view Inconsistency

09/25/2021
by   Sohaib Kiani, et al.

In evasion attacks against deep neural networks (DNN), the attacker generates adversarial instances that are visually indistinguishable from benign samples and sends them to the target DNN to trigger misclassifications. In this paper, we propose a novel multi-view adversarial image detector, namely Argos, based on a key observation. That is, there exist two "souls" in an adversarial instance, i.e., the visually unchanged content, which corresponds to the true label, and the added invisible perturbation, which corresponds to the misclassified label. Such inconsistencies can be further amplified through an autoregressive generative approach that generates images with seed pixels selected from the original image, a selected label, and pixel distributions learned from the training data. The generated images (i.e., the "views") will deviate significantly from the original one if the label is adversarial, demonstrating the inconsistencies that Argos expects to detect. To this end, Argos first amplifies the discrepancies between the visual content of an image and its attack-induced misclassified label using a set of regeneration mechanisms, and then identifies an image as adversarial if the reproduced views deviate beyond a preset degree. Our experimental results show that Argos significantly outperforms two representative adversarial detectors in both detection accuracy and robustness against six well-known adversarial attacks. Code is available at: https://github.com/sohaib730/Argos-Adversarial_Detection
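The thresholded consistency check described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `regenerate_view` is a hypothetical placeholder for Argos's autoregressive regeneration step, and the deviation metric and threshold are simplified assumptions.

```python
import numpy as np

def detect_adversarial(image, predicted_label, regenerate_view,
                       threshold, n_views=4):
    """Flag `image` as adversarial if views regenerated under its
    predicted label deviate from the original beyond `threshold`.

    `regenerate_view(image, label)` stands in for the paper's
    autoregressive regeneration mechanism (seed pixels from the
    original image plus a target label); here it is simply any
    callable returning an image of the same shape.
    """
    deviations = []
    for _ in range(n_views):
        view = regenerate_view(image, predicted_label)
        # Mean-squared deviation between regenerated view and original.
        deviations.append(np.mean((view - image) ** 2))
    score = float(np.mean(deviations))
    # Benign images regenerate consistently under their true label;
    # adversarial labels contradict the visual content, so the
    # regenerated views drift and the score exceeds the threshold.
    return score > threshold, score
```

For example, with a toy regenerator that is faithful only for the true label, a mismatched (adversarial) label produces a much larger deviation score and trips the detector.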

Related research

Turn Fake into Real: Adversarial Head Turn Attacks Against Deepfake Detection (09/03/2023)
Malicious use of deepfakes leads to serious public concerns and reduces ...

A New Kind of Adversarial Example (08/04/2022)
Almost all adversarial attacks are formulated to add an imperceptible pe...

Adversarial Attacks for Multi-view Deep Models (06/19/2020)
Recent work has highlighted the vulnerability of many deep machine learn...

Adversarial Framing for Image and Video Classification (12/11/2018)
Neural networks are prone to adversarial attacks. In general, such attac...

Black-Box Attack against GAN-Generated Image Detector with Contrastive Perturbation (11/07/2022)
Visually realistic GAN-generated facial images raise obvious concerns on...

Adversarial Ensemble Training by Jointly Learning Label Dependencies and Member Models (06/29/2022)
Training an ensemble of different sub-models has empirically proven to b...

Fashion-Guided Adversarial Attack on Person Segmentation (04/17/2021)
This paper presents the first adversarial example based method for attac...
