Improving Adversarial Transferability via Neuron Attribution-Based Attacks

by Jianping Zhang, et al.

Deep neural networks (DNNs) are known to be vulnerable to adversarial examples. It is thus imperative to devise effective attack algorithms that expose the deficiencies of DNNs before they are deployed in security-sensitive applications. To tackle the black-box setting, where the target model's details are unknown, feature-level transfer-based attacks perturb the intermediate feature outputs of a local model and then use the resulting adversarial samples directly against the target model. Because features transfer across models, feature-level attacks have shown promise in synthesizing more transferable adversarial samples. However, existing feature-level attacks generally rely on inaccurate estimates of neuron importance, which degrades their transferability. To overcome this pitfall, we propose the Neuron Attribution-based Attack (NAA), which conducts feature-level attacks with more accurate neuron importance estimates. Specifically, we first completely attribute a model's output to each neuron in a middle layer. We then derive an approximation scheme for neuron attribution that greatly reduces the computational overhead. Finally, we weight neurons by their attribution results and launch feature-level attacks. Extensive experiments confirm the superiority of our approach over state-of-the-art baselines.
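The three steps in the abstract (attribute the output to middle-layer neurons, weight the neurons, attack the features) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the toy two-layer model (`W1`, `w2`), the all-zero baseline, the straight-line path with a Riemann sum, and the simple weighted-sum loss are all hypothetical choices, and the abstract's approximation scheme is not reproduced here.

```python
import numpy as np

# Hypothetical toy "local model": x -> y = relu(W1 @ x) -> F = tanh(w2 @ y).
# The middle-layer activations y play the role of the attacked features.
W1 = np.array([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]])
w2 = np.array([0.8, -0.5, 0.3])

def features(x):
    """Middle-layer activations of the toy model."""
    return np.maximum(W1 @ x, 0.0)

def output(y):
    """Scalar model output computed from the middle-layer activations."""
    return np.tanh(w2 @ y)

def grad_output_wrt_features(y):
    """Analytic gradient dF/dy for the tanh head."""
    return (1.0 - np.tanh(w2 @ y) ** 2) * w2

def neuron_attribution(x, baseline, n_steps=50):
    """Path-integral attribution of F to each middle-layer neuron.

    Assumption: straight-line path from `baseline` to `x`, discretized
    with a right-endpoint Riemann sum over n_steps points.
    """
    attr = np.zeros(W1.shape[0])
    y_prev = features(baseline)
    for m in range(1, n_steps + 1):
        x_m = baseline + (m / n_steps) * (x - baseline)
        y_m = features(x_m)
        attr += grad_output_wrt_features(y_m) * (y_m - y_prev)
        y_prev = y_m
    return attr

def weighted_feature_loss(x, weights):
    """Attribution-weighted sum of features (a simplified attack objective)."""
    return float(weights @ features(x))

def attack_step(x, weights, eps=0.1):
    """One signed-gradient step on the input that lowers the weighted loss."""
    active = (W1 @ x > 0).astype(float)   # ReLU derivative at x
    grad_x = W1.T @ (weights * active)    # d(loss)/dx with weights held fixed
    return x - eps * np.sign(grad_x)

x = np.array([1.0, 0.5])
baseline = np.zeros_like(x)               # assumed all-zero baseline input
attr = neuron_attribution(x, baseline)
x_adv = attack_step(x, attr)
```

In this sketch the attributions sum approximately to `F(x) - F(baseline)`, which is one way to read "completely attribute". A single step lowers the weighted feature loss on the local model; whether the perturbation transfers to a different target model is not something a toy like this can show.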




