Erase and Restore: Simple, Accurate and Resilient Detection of L_2 Adversarial Examples

01/01/2020
by Fei Zuo, et al.

By adding carefully crafted perturbations to input images, adversarial examples (AEs) can be generated to mislead neural-network-based image classifiers. L_2 adversarial perturbations by Carlini and Wagner (CW) are regarded as among the most effective attacks. While many countermeasures against AEs have been proposed, detection of adaptive CW L_2 AEs has remained very inaccurate. Our observation is that the deliberately altered pixels in an L_2 AE exert their malicious influence collectively. By randomly erasing some pixels from an L_2 AE and then restoring it with an inpainting technique, the AE tends to receive different classification results before and after these steps, whereas a benign sample does not show this symptom. Based on this, we propose a novel AE detection technique, Erase and Restore (E&R), that exploits this limitation of L_2 attacks. On two popular image datasets, CIFAR-10 and ImageNet, our experiments show that the proposed technique detects over 98% of the AEs and has a very low false positive rate on benign images. Moreover, our approach demonstrates strong resilience to adaptive attacks. While adding noise and inpainting have each been well studied, by combining them we deliver a simple, accurate, and resilient detection technique against adaptive L_2 AEs.

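To make the erase-and-restore idea concrete, the following is a minimal sketch of the detection loop described in the abstract: randomly erase a fraction of pixels, restore them with an off-the-shelf inpainting method, and flag the input if its predicted label flips. The classifier interface, erase fraction, number of trials, disagreement threshold, and the use of OpenCV's Telea inpainting are illustrative assumptions for this sketch, not the paper's exact configuration.

import numpy as np
import cv2

def erase_and_restore_flag(image, classifier, erase_fraction=0.2, num_trials=10):
    """Flag `image` as a likely L_2 adversarial example.

    Assumptions (not taken from the paper): `image` is an HxWx3 uint8 array,
    `classifier` maps such an array to a class label, and the parameters
    below are placeholder values.
    """
    h, w = image.shape[:2]
    original_label = classifier(image)
    disagreements = 0

    for _ in range(num_trials):
        # Randomly choose pixels to erase and record them in a binary mask.
        mask = (np.random.rand(h, w) < erase_fraction).astype(np.uint8)
        erased = image.copy()
        erased[mask.astype(bool)] = 0

        # Restore the erased pixels with a standard inpainting method
        # (OpenCV's Telea algorithm, used here as a stand-in).
        restored = cv2.inpaint(erased, mask, inpaintRadius=3,
                               flags=cv2.INPAINT_TELEA)

        # An L_2 AE tends to lose its adversarial effect after this step;
        # a benign image tends to keep its original label.
        if classifier(restored) != original_label:
            disagreements += 1

    # Flag the input if the label flips in a majority of trials
    # (the threshold is an assumed choice for illustration).
    return disagreements / num_trials > 0.5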