Transferable Perturbations of Deep Feature Distributions

04/27/2020
by Nathan Inkawhich, et al.

Almost all current adversarial attacks on CNN classifiers rely on information derived from the output layer of the network. This work presents a new adversarial attack based on modeling and exploiting class-wise and layer-wise deep feature distributions. We achieve state-of-the-art targeted black-box transfer-based attack results on undefended ImageNet models. Further, we place a priority on the explainability and interpretability of the attack process. Our methodology affords an analysis of how adversarial attacks change the intermediate feature distributions of CNNs, as well as a measure of layer-wise and class-wise feature distributional separability/entanglement. We also conceptualize a transition from task/data-specific to model-specific features within a CNN architecture that directly impacts the transferability of adversarial examples.
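The core procedure suggested by the abstract can be sketched concretely. The PyTorch snippet below is a minimal, illustrative rendition of a feature-distribution attack, not the authors' released code: it assumes a small auxiliary probe has already been trained to score whether the features at a chosen intermediate layer belong to the target class, and it then runs a PGD-style loop that perturbs the input so those features score highly under that probe. The function name, probe interface, layer name, and hyperparameters are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def feature_distribution_attack(model, probe, x, layer_name,
                                eps=16 / 255, step=2 / 255, iters=10):
    """Sketch of a feature-distribution attack (hypothetical helper).

    model      -- whitebox surrogate CNN (torch.nn.Module).
    probe      -- auxiliary head assumed pre-trained to emit a logit for
                  p(target class | f_l(x)); one probe per (layer, class).
    layer_name -- name of the intermediate module whose output is f_l(x).
    eps/step/iters -- illustrative L-infinity PGD hyperparameters.
    """
    feats = {}

    def hook(_module, _inputs, output):
        feats["f"] = output  # capture the layer-l feature map each forward

    handle = dict(model.named_modules())[layer_name].register_forward_hook(hook)

    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        model(x_adv)                       # forward pass fills feats["f"]
        logit = probe(feats["f"])          # score features under target class
        # Minimizing this BCE maximizes p(target class | f_l(x_adv)).
        loss = F.binary_cross_entropy_with_logits(logit,
                                                  torch.ones_like(logit))
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv - step * grad.sign()        # descend the loss
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)             # stay a valid image
        x_adv = x_adv.detach()

    handle.remove()
    return x_adv
```

The complete method described in the paper is richer than this sketch, since it models feature distributions class-wise and layer-wise across the network; the snippet captures only the single-layer, single-class core of the idea.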

