Verifying Attention Robustness of Deep Neural Networks against Semantic Perturbations

07/13/2022
by Satoshi Munakata, et al.

It is known that deep neural networks (DNNs) classify an input image by paying particular attention to certain pixels; a graphical representation of the magnitude of attention to each pixel is called a saliency-map. Saliency-maps are used to check the validity of the basis for a classification decision; e.g., the basis is not valid if a DNN pays more attention to the background than to the subject of an image. Semantic perturbations can significantly change the saliency-map. In this work, we propose the first verification method for attention robustness, i.e., the local robustness of the changes in the saliency-map against combinations of semantic perturbations. Specifically, our method determines the range of perturbation parameters (e.g., the amount of brightness change) within which the difference between the actual saliency-map change and the expected saliency-map change stays below a given threshold. Our method is based on activation-region traversal, focusing on the outermost robust boundary for scalability to larger DNNs. Experimental results demonstrate that our method can show the extent to which DNNs classify on the same basis regardless of semantic perturbations, and we report on the performance of activation-region traversal and the factors that affect it.
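As a rough, illustrative sketch only (not the paper's activation-region-traversal verification, which derives guaranteed parameter ranges), the snippet below shows the kind of quantity being bounded: a gradient-based saliency map is computed for an image and for a brightness-perturbed copy, and their normalized difference is compared against a threshold. It assumes a PyTorch image classifier; the names saliency_map, attention_difference, and empirical_robust_range are hypothetical.

```python
import torch

def saliency_map(model, x, target_class):
    """Gradient-based saliency: |d(class score)/d(pixel)|, summed over channels."""
    x = x.clone().detach().requires_grad_(True)
    score = model(x)[0, target_class]
    score.backward()
    return x.grad.abs().sum(dim=1)  # shape (1, H, W)

def attention_difference(model, x, target_class, brightness):
    """L1 difference between the saliency maps of the original image and a
    brightness-shifted copy (the 'expected' saliency-map change here is zero)."""
    s0 = saliency_map(model, x, target_class)
    s1 = saliency_map(model, (x + brightness).clamp(0.0, 1.0), target_class)
    s0 = s0 / (s0.sum() + 1e-12)  # normalize so the overall gradient scale cancels
    s1 = s1 / (s1.sum() + 1e-12)
    return (s0 - s1).abs().sum().item()

def empirical_robust_range(model, x, target_class, threshold=0.5, steps=20):
    """Largest sampled brightness offset whose saliency-map difference stays
    below the threshold; a sampled estimate, not a verified bound."""
    max_ok = 0.0
    for i in range(steps + 1):
        b = 0.5 * i / steps  # brightness offsets in [0, 0.5]
        if attention_difference(model, x, target_class, b) <= threshold:
            max_ok = b
        else:
            break
    return max_ok
```

A call such as empirical_robust_range(model, image, label) only samples the perturbation parameter; the paper's contribution is verifying such a range exactly by traversing the DNN's activation regions up to the outermost robust boundary.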


Related research

01/26/2021
Evaluating Input Perturbation Methods for Interpreting CNNs and Saliency Map Comparison
Input perturbation methods occlude parts of an input to a function and m...

01/27/2023
OccRob: Efficient SMT-Based Occlusion Robustness Verification of Deep Neural Networks
Occlusion is a prevalent and easily realizable semantic perturbation to ...

11/21/2020
Backdoor Attacks on the DNN Interpretation System
Interpretability is crucial to understand the inner workings of deep neu...

07/22/2022
Training Certifiably Robust Neural Networks Against Semantic Perturbations
Semantic image perturbations, such as scaling and rotation, have been sh...

07/08/2022
Abs-CAM: A Gradient Optimization Interpretable Approach for Explanation of Convolutional Neural Networks
The black-box nature of Deep Neural Networks (DNNs) severely hinders its...

01/21/2022
Conceptor Learning for Class Activation Mapping
Class Activation Mapping (CAM) has been widely adopted to generate salie...

05/10/2019
On the Connection Between Adversarial Robustness and Saliency Map Interpretability
Recent studies on the adversarial vulnerability of neural networks have ...
