One Neuron to Fool Them All

03/20/2020
by Anshuman Suri, et al.

Despite extensive research on adversarial examples, the root causes of model susceptibility are not well understood. Instead of looking at attack-specific robustness, we propose a notion that evaluates the sensitivity of individual neurons in terms of how robust the model's output is to direct perturbations of that neuron's output. Analyzing models from this perspective reveals distinctive characteristics of both standard and adversarially trained robust models, and leads to several curious results. In our experiments on CIFAR-10 and ImageNet, we find that attacks using a loss function that targets just a single sensitive neuron find adversarial examples nearly as effectively as ones that target the full model. We analyze the properties of these sensitive neurons to propose a regularization term that can help a model achieve robustness to a variety of different perturbation constraints while maintaining accuracy on natural data distributions. Code for all our experiments is available at https://github.com/iamgroot42/sauron.
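
To make the single-neuron attack idea concrete, here is a minimal sketch, not the code from the linked repository: a PGD-style L-inf attack whose loss is the activation of one chosen neuron instead of the usual cross-entropy over the model's output. The function name, the forward-hook mechanism, and the hyperparameter defaults are illustrative assumptions.

```python
import torch

def single_neuron_attack(model, x, layer, neuron_idx,
                         eps=8 / 255, alpha=2 / 255, steps=10):
    """Perturb x within an L-inf ball of radius eps to maximize one neuron's activation."""
    activation = {}

    def hook(module, inputs, output):
        # Capture the chosen layer's output on each forward pass.
        activation["out"] = output

    handle = layer.register_forward_hook(hook)
    x_adv = x.clone().detach()

    for _ in range(steps):
        x_adv.requires_grad_(True)
        model(x_adv)
        # Attack loss: mean activation of a single neuron (channel), not the full-model loss.
        loss = activation["out"][:, neuron_idx].mean()
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()          # ascent step on the neuron's activation
            x_adv = x + (x_adv - x).clamp(-eps, eps)     # project back into the L-inf ball
            x_adv = x_adv.clamp(0.0, 1.0)                # keep a valid image

    handle.remove()
    return x_adv.detach()
```

If the chosen neuron is one the paper would call "sensitive", examples produced this way should flip the model's prediction nearly as often as attacks that optimize the full-model loss.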

Related research

09/16/2019
Interpreting and Improving Adversarial Robustness with Neuron Sensitivity
Deep neural networks (DNNs) are vulnerable to adversarial examples where...

10/20/2022
Multitasking Models are Robust to Structural Failure: A Neural Model for Bilingual Cognitive Reserve
We find a surprising connection between multitask learning and robustnes...

10/09/2022
Pruning Adversarially Robust Neural Networks without Adversarial Examples
Adversarial pruning compresses models while preserving robustness. Curre...

04/22/2020
Adversarial examples and where to find them
Adversarial robustness of trained models has attracted considerable atte...

10/05/2020
Second-Order NLP Adversarial Examples
Adversarial example generation methods in NLP rely on models like langua...

06/12/2020
D-square-B: Deep Distribution Bound for Natural-looking Adversarial Attack
We propose a novel technique that can generate natural-looking adversari...

06/06/2021
A Primer on Multi-Neuron Relaxation-based Adversarial Robustness Certification
The existence of adversarial examples poses a real danger when deep neur...
