NIP: Neuron-level Inverse Perturbation Against Adversarial Attacks

12/24/2021
by Ruoxi Chen, et al.

Although deep learning models have achieved unprecedented success, their vulnerability to adversarial attacks has attracted increasing attention, especially when they are deployed in security-critical domains. To address this challenge, numerous defense strategies, both reactive and proactive, have been proposed to improve robustness. Viewed from the image feature space, some of them fail to achieve satisfactory results because features shift under attack; moreover, the features learned by models are not directly tied to the classification result. In contrast, we approach defense from inside the model and investigate neuron behaviors before and after attacks. We observe that attacks mislead the model by dramatically changing the neurons that contribute most and least to the correct label. Motivated by this, we introduce the concept of neuron influence and divide neurons into front, middle, and tail parts. Based on this, we propose neuron-level inverse perturbation (NIP), the first neuron-level reactive defense method against adversarial attacks. By strengthening front neurons and weakening those in the tail part, NIP eliminates nearly all adversarial perturbations while maintaining high benign accuracy. It also adapts to perturbations of different sizes, especially larger ones. Comprehensive experiments on three datasets and six models show that NIP outperforms state-of-the-art baselines against eleven adversarial attacks. We further provide interpretable evidence via neuron activation and visualization for better understanding.
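To make the neuron-influence idea concrete, the sketch below ranks the channels of one layer by an activation-times-gradient score and then boosts the top-ranked ("front") channels while damping the bottom-ranked ("tail") ones at inference. This is a minimal, hypothetical illustration: the layer choice, the influence score, the front/tail split ratio, and the boost/damp factors are all assumptions for exposition, not the paper's actual NIP algorithm, which computes a neuron-level inverse perturbation.

```python
# Hypothetical sketch of neuron-influence ranking and re-weighting (not the paper's NIP method).
import torch
import torchvision.models as models

model = models.resnet18(weights=None)   # stand-in classifier; the paper evaluates several models
model.eval()
layer = model.layer4                    # example layer whose channels we treat as "neurons"

def compute_neuron_influence(x, label):
    """Score each channel by activation * gradient of the true-class logit
    (a common attribution heuristic; the paper's influence definition may differ)."""
    acts = {}
    def hook(_, __, out):
        out.retain_grad()
        acts["a"] = out
    h = layer.register_forward_hook(hook)
    logits = model(x)
    logits[0, label].backward()
    h.remove()
    a = acts["a"]
    # Per-channel influence: mean over batch and spatial positions of activation * gradient.
    return (a * a.grad).mean(dim=(0, 2, 3)).detach()

def reweight_hook(front_idx, tail_idx, boost=1.5, damp=0.5):
    """Strengthen high-influence ('front') channels and weaken low-influence ('tail') ones."""
    def hook(_, __, out):
        out = out.clone()
        out[:, front_idx] *= boost
        out[:, tail_idx] *= damp
        return out
    return hook

# Usage: rank channels on a reference input, then re-weight them for subsequent inputs.
x_ref = torch.randn(1, 3, 224, 224, requires_grad=True)   # placeholder image
influence = compute_neuron_influence(x_ref, label=0)
order = influence.argsort(descending=True)
k = max(1, len(order) // 10)                               # front/tail split ratio is a guess
handle = layer.register_forward_hook(reweight_hook(order[:k], order[-k:]))

with torch.no_grad():
    defended_logits = model(torch.randn(1, 3, 224, 224))
handle.remove()
```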


Related research

07/13/2022 · Perturbation Inactivation Based Adversarial Defense for Face Recognition
Deep learning-based face recognition models are vulnerable to adversaria...

03/06/2023 · Visual Analytics of Neuron Vulnerability to Adversarial Attacks on Convolutional Neural Networks
Adversarial attacks on a convolutional neural network (CNN) – injecting ...

12/01/2018 · FineFool: Fine Object Contour Attack via Attention
Machine learning models have been shown vulnerable to adversarial attack...

02/23/2020 · Neuron Shapley: Discovering the Responsible Neurons
We develop Neuron Shapley as a new framework to quantify the contributio...

05/14/2021 · Salient Feature Extractor for Adversarial Defense on Deep Neural Networks
Recent years have witnessed unprecedented success achieved by deep learn...

06/16/2020 · On sparse connectivity, adversarial robustness, and a novel model of the artificial neuron
Deep neural networks have achieved human-level accuracy on almost all pe...

02/15/2021 · And/or trade-off in artificial neurons: impact on adversarial robustness
Since its discovery in 2013, the phenomenon of adversarial examples has ...
