Towards Defending against Adversarial Examples via Attack-Invariant Features

06/09/2021
by Dawei Zhou et al.

Deep neural networks (DNNs) are vulnerable to adversarial noise. Their adversarial robustness can be improved by training on adversarial examples. However, since attacks evolve continuously, models trained on seen types of adversarial examples generally do not generalize well to unseen types. To address this problem, we propose to remove adversarial noise by learning attack-invariant features that generalize across attacks while preserving semantic classification information. Specifically, we introduce an adversarial feature learning mechanism to disentangle invariant features from adversarial noise, and we propose a normalization term in the encoded space of the attack-invariant features to address the bias between seen and unseen types of attacks. Empirical evaluations demonstrate that our method provides better protection than previous state-of-the-art approaches, especially against unseen types of attacks and adaptive attacks.
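The abstract names two components: an adversarial feature learning mechanism that disentangles attack-invariant features from adversarial noise, and a normalization term applied in the encoded feature space to reduce the bias between seen and unseen attacks. The PyTorch sketch below illustrates one plausible way these pieces could fit together; the network architecture, the loss terms, and the weights `lam_inv` and `lam_norm` are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the attack-invariant feature idea described above.
# An encoder maps natural/adversarial image pairs into a shared feature
# space, a classifier head enforces that the features keep semantic
# classification information, and a normalization penalty keeps encoded
# adversarial features on a common scale. All names and weights are
# illustrative assumptions, not the paper's code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class InvariantEncoder(nn.Module):
    def __init__(self, feat_dim=128, num_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        z = self.backbone(x)          # encoded (attack-invariant) features
        return z, self.classifier(z)  # features and class logits

def training_loss(model, x_nat, x_adv, y, lam_inv=1.0, lam_norm=0.1):
    z_nat, logits_nat = model(x_nat)
    z_adv, logits_adv = model(x_adv)
    # Semantic classification information must survive encoding.
    cls = F.cross_entropy(logits_nat, y) + F.cross_entropy(logits_adv, y)
    # Invariance: adversarial features should match natural features.
    inv = F.mse_loss(z_adv, z_nat)
    # Normalization term in the encoded space: pull feature norms toward 1
    # so statistics of the seen attack do not bias the feature scale.
    norm = ((z_adv.norm(dim=1) - 1.0) ** 2).mean()
    return cls + lam_inv * inv + lam_norm * norm
```

In training, `x_adv` would be generated from `x_nat` by a seen attack (e.g., PGD); the normalization penalty keeps encoded adversarial features at a common scale, so the invariance loss is not dominated by the norm statistics of that particular attack when an unseen attack is encountered at test time.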

Related research

04/19/2021 · Removing Adversarial Noise in Class Activation Feature Space
Deep neural networks (DNNs) are vulnerable to adversarial noise. Preproc...

06/08/2020 · A Self-supervised Approach for Adversarial Robustness
Adversarial examples can cause catastrophic mistakes in Deep Neural Netw...

12/31/2022 · Tracing the Origin of Adversarial Attack for Forensic Investigation and Deterrence
Deep neural networks are vulnerable to adversarial attacks. In this pape...

03/11/2023 · Improving the Robustness of Deep Convolutional Neural Networks Through Feature Learning
Deep convolutional neural network (DCNN for short) models are vulnerable...

10/28/2022 · Improving Hyperspectral Adversarial Robustness using Ensemble Networks in the Presences of Multiple Attacks
Semantic segmentation of hyperspectral images (HSI) has seen great strid...

07/21/2022 · Careful What You Wish For: on the Extraction of Adversarially Trained Models
Recent attacks on Machine Learning (ML) models such as evasion attacks w...

11/18/2016 · LOTS about Attacking Deep Features
Deep neural networks provide state-of-the-art performance on various tas...
