Robust saliency maps with decoy-enhanced saliency score

02/03/2020
by   Yang Lu, et al.
7

Saliency methods help to make deep neural network predictions more interpretable by identifying particular features, such as pixels in an image, that contribute most strongly to the network's prediction. Unfortunately, recent evidence suggests that many saliency methods perform poorly when gradients are saturated or in the presence of strong inter-feature dependence or noise injected by an adversarial attack. In this work, we propose to infer robust saliency scores by integrating the saliency scores of a set of decoys with a novel decoy-enhanced saliency score, in which the decoys are generated by either solving an optimization problem or blurring the original input. We theoretically analyze that our method compensates for gradient saturation and considers joint activation patterns of pixels. We also apply our method to three different CNNs—VGGNet, AlexNet, and ResNet trained on ImageNet data set. The empirical results show both qualitatively and quantitatively that our method outperforms raw scores produced by three existing saliency methods, even in the presence of adversarial attacks.

READ FULL TEXT

page 1

page 2

page 3

page 4

page 5

page 6

page 8

page 9

research
11/21/2020

Backdoor Attacks on the DNN Interpretation System

Interpretability is crucial to understand the inner workings of deep neu...
research
06/10/2021

Deep neural network loses attention to adversarial images

Adversarial algorithms have shown to be effective against neural network...
research
06/15/2021

Explaining decision of model from its prediction

This document summarizes different visual explanations methods such as C...
research
06/29/2020

Scaling Symbolic Methods using Gradients for Neural Model Explanation

Symbolic techniques based on Satisfiability Modulo Theory (SMT) solvers ...
research
05/27/2019

A Simple Saliency Method That Passes the Sanity Checks

There is great interest in *saliency methods* (also called *attribution ...
research
10/19/2019

NormGrad: Finding the Pixels that Matter for Training

The different families of saliency methods, either based on contrastive ...
research
09/29/2020

Trustworthy Convolutional Neural Networks: A Gradient Penalized-based Approach

Convolutional neural networks (CNNs) are commonly used for image classif...

Please sign up or login with your details

Forgot password? Click here to reset