Feature-Filter: Detecting Adversarial Examples through Filtering off Recessive Features

07/19/2021
by Hui Liu et al.

Deep neural networks (DNNs) are under threat from adversarial example attacks: an adversary can change a DNN's output by adding small, well-designed perturbations to its inputs. Adversarial example detection is therefore fundamental to robust DNN-based services. Adversarial examples expose a gap between human and DNN image recognition. From a human-centric perspective, image features can be divided into dominant features, which are comprehensible to humans, and recessive features, which are incomprehensible to humans yet exploited by DNNs. In this paper, we show that imperceptible adversarial examples are the product of recessive features misleading neural networks, and that an adversarial attack is essentially a method of enriching these recessive features in the image. The imperceptibility of adversarial examples indicates that the perturbations enrich recessive features while hardly affecting dominant features. Consequently, adversarial examples are sensitive to filtering off recessive features, while benign examples are immune to such an operation. Building on this idea, we propose a label-only adversarial detection approach called feature-filter. Feature-filter uses the discrete cosine transform to approximately separate recessive features from dominant features and produces a mutant image from which recessive features have been filtered off. By comparing only the DNN's prediction labels on an input and its mutant, feature-filter can detect imperceptible adversarial examples in real time with high accuracy and few false positives.
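The detection recipe described in the abstract can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a NumPy image array, uses SciPy's dctn/idctn as the DCT, and treats high-frequency coefficients as a proxy for recessive features. The keep_ratio cutoff and the model_predict callable are hypothetical stand-ins for the paper's actual frequency threshold and classifier.

```python
import numpy as np
from scipy.fft import dctn, idctn


def filter_recessive_features(image, keep_ratio=0.25):
    """Low-pass filter an image in the DCT domain.

    Zeroes out high-frequency DCT coefficients (a proxy for the
    paper's "recessive features") and reconstructs the image from
    the remaining low-frequency ("dominant") coefficients.
    keep_ratio is an illustrative assumption, not a value from the paper.
    """
    coeffs = dctn(image, norm="ortho", axes=(0, 1))
    h, w = image.shape[:2]
    # Keep only the top-left (low-frequency) block of coefficients.
    mask = np.zeros_like(coeffs)
    mask[: int(h * keep_ratio), : int(w * keep_ratio)] = 1.0
    return idctn(coeffs * mask, norm="ortho", axes=(0, 1))


def feature_filter_detect(model_predict, image):
    """Flag the image as adversarial if the model's hard label changes
    after recessive features are filtered off.

    model_predict is a hypothetical callable mapping an image to a label.
    """
    mutant = filter_recessive_features(image)
    return model_predict(image) != model_predict(mutant)
```

A benign image would typically keep its label after this low-pass step, whereas an imperceptibly perturbed one would tend to flip, which is the label-only signal the detector relies on.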


