Removing Adversarial Noise in Class Activation Feature Space

04/19/2021
by Dawei Zhou, et al.

Deep neural networks (DNNs) are vulnerable to adversarial noise. Preprocessing-based defenses can largely remove adversarial noise by processing inputs. However, they are typically affected by the error amplification effect, especially in the face of continuously evolving attacks. To solve this problem, in this paper we propose to remove adversarial noise by implementing a self-supervised adversarial training mechanism in a class activation feature space. Specifically, we first maximize the disruptions to the class activation features of natural examples to craft adversarial examples. Then, we train a denoising model to minimize the distances between the adversarial examples and the natural examples in the class activation feature space. Empirical evaluations demonstrate that our method significantly enhances adversarial robustness compared with previous state-of-the-art approaches, especially against unseen adversarial attacks and adaptive attacks.
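As a rough illustration of the training procedure described in the abstract (not the authors' implementation), the following PyTorch-style sketch assumes a fixed pretrained classifier whose intermediate class activation features are exposed by a hypothetical class_activation_features helper, plus a trainable denoiser network; all function names and hyperparameters are assumptions for illustration. Adversarial examples are crafted with a PGD-style inner loop that maximizes the feature-space disruption, and the denoiser is then updated to minimize the feature-space distance between denoised adversarial examples and natural examples.

    import torch
    import torch.nn.functional as F

    def class_activation_features(classifier, x):
        # Hypothetical helper: returns the class activation features of x,
        # e.g. the feature maps feeding the classifier's final layer.
        return classifier.features(x)

    def craft_adversarial(classifier, x_nat, eps=8/255, step=2/255, n_steps=10):
        # Step 1: maximize disruption to the class activation features of
        # natural examples (PGD-style ascent in input space).
        x_adv = x_nat.clone().detach() + eps * torch.empty_like(x_nat).uniform_(-1, 1)
        feat_nat = class_activation_features(classifier, x_nat).detach()
        for _ in range(n_steps):
            x_adv.requires_grad_(True)
            feat_adv = class_activation_features(classifier, x_adv)
            disruption = F.mse_loss(feat_adv, feat_nat)          # feature-space disruption
            grad = torch.autograd.grad(disruption, x_adv)[0]
            x_adv = x_adv.detach() + step * grad.sign()          # ascend to maximize disruption
            x_adv = torch.clamp(x_nat + torch.clamp(x_adv - x_nat, -eps, eps), 0, 1)
        return x_adv.detach()

    def denoiser_step(denoiser, classifier, optimizer, x_nat):
        # Step 2: train the denoising model to pull adversarial examples back
        # toward the natural examples in the class activation feature space.
        # The classifier is assumed frozen; only denoiser parameters are updated.
        x_adv = craft_adversarial(classifier, x_nat)
        feat_nat = class_activation_features(classifier, x_nat).detach()
        feat_den = class_activation_features(classifier, denoiser(x_adv))
        loss = F.mse_loss(feat_den, feat_nat)                    # feature-space distance
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

In this reading, the denoiser plays the role of the preprocessing defense, and because its objective is defined directly on class activation features rather than pixels, errors left in the input are penalized only insofar as they disturb the features the classifier actually uses.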


Related Research

06/09/2021 · Towards Defending against Adversarial Examples via Attack-Invariant Features
Deep neural networks (DNNs) are vulnerable to adversarial noise. Their a...

06/08/2020 · A Self-supervised Approach for Adversarial Robustness
Adversarial examples can cause catastrophic mistakes in Deep Neural Netw...

09/21/2021 · Modelling Adversarial Noise for Adversarial Defense
Deep neural networks have been demonstrated to be vulnerable to adversar...

07/25/2022 · Improving Adversarial Robustness via Mutual Information Estimation
Deep neural networks (DNNs) are found to be vulnerable to adversarial no...

06/10/2021 · Improving White-box Robustness of Pre-processing Defenses via Joint Adversarial Training
Deep neural networks (DNNs) are vulnerable to adversarial noise. A range...

12/17/2020 · A Hierarchical Feature Constraint to Camouflage Medical Adversarial Attacks
Deep neural networks (DNNs) for medical images are extremely vulnerable ...

06/26/2020 · A Unified Framework for Analyzing and Detecting Malicious Examples of DNN Models
Deep Neural Networks are well known to be vulnerable to adversarial atta...
