ViT-CX: Causal Explanation of Vision Transformers

11/06/2022
by Weiyan Xie, et al.

Despite the popularity of Vision Transformers (ViTs) and eXplainable AI (XAI), only a few explanation methods have been proposed for ViTs thus far. They rely on the attention weights that the classification token pays to patch embeddings, and they often produce unsatisfactory saliency maps. In this paper, we propose a novel method for explaining ViTs called ViT-CX. It is based on the patch embeddings themselves, rather than the attention paid to them, and on their causal impacts on the model output. ViT-CX can be applied to different ViT models. Empirical results show that, compared with previous methods, ViT-CX produces more meaningful saliency maps and does a better job of revealing all the important evidence behind a prediction. It is also significantly more faithful to the model, as measured by deletion AUC and insertion AUC.
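
The abstract describes the method only at a high level. As a rough illustration of the two ingredients it mentions, the PyTorch sketch below scores image patches by the causal effect of occluding them on the model output, and then computes a deletion-style faithfulness score for the resulting map. This is a generic occlusion sketch under simplifying assumptions, not the ViT-CX algorithm from the paper; the function names, `patch_size`, `baseline`, and the step schedule are illustrative placeholders, and `model` stands for any image classifier returning class logits.

```python
# Illustrative only: a generic occlusion-based "causal impact" saliency for a
# ViT-style classifier, followed by a deletion-style faithfulness score.
# This is NOT the ViT-CX algorithm itself; names and defaults are assumptions.

import torch
import torch.nn.functional as F


@torch.no_grad()
def patch_occlusion_saliency(model, image, target_class, patch_size=16, baseline=0.0):
    """Score each patch by how much masking it lowers the target-class probability.

    model        : callable mapping a (1, 3, H, W) tensor to class logits
    image        : (1, 3, H, W) tensor, with H and W divisible by patch_size
    target_class : index of the class being explained
    """
    _, _, H, W = image.shape
    gh, gw = H // patch_size, W // patch_size

    base = torch.softmax(model(image), dim=-1)[0, target_class].item()
    saliency = torch.zeros(gh, gw)

    for i in range(gh):
        for j in range(gw):
            masked = image.clone()
            ys, xs = i * patch_size, j * patch_size
            masked[:, :, ys:ys + patch_size, xs:xs + patch_size] = baseline
            prob = torch.softmax(model(masked), dim=-1)[0, target_class].item()
            saliency[i, j] = base - prob  # larger drop => more important patch

    # Upsample patch-level scores to a pixel-level saliency map.
    return F.interpolate(saliency[None, None], size=(H, W),
                         mode="bilinear", align_corners=False)[0, 0]


@torch.no_grad()
def deletion_auc(model, image, saliency, target_class, steps=50, baseline=0.0):
    """Delete pixels in decreasing order of saliency and average the class
    probability along the way (a crude stand-in for the deletion AUC; the
    insertion AUC is the mirror image, starting from a blank image and adding
    the most salient pixels first, where higher is better)."""
    _, _, H, W = image.shape
    order = saliency.flatten().argsort(descending=True)  # most salient first
    work = image.clone()
    per_step = max(1, order.numel() // steps)

    probs = [torch.softmax(model(work), dim=-1)[0, target_class].item()]
    for k in range(0, order.numel(), per_step):
        idx = order[k:k + per_step]
        work.view(1, 3, -1)[..., idx] = baseline  # remove these pixels in all channels
        probs.append(torch.softmax(model(work), dim=-1)[0, target_class].item())
    return sum(probs) / len(probs)  # lower is better for deletion
```

The two functions only illustrate the masking-and-measure idea and the deletion/insertion-style metrics that the abstract refers to; they can be run with any classifier that accepts a (1, 3, H, W) tensor and returns logits.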

Related research

10/12/2020
The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?
There is a recent surge of interest in using attention as explanation of...

05/05/2023
Human Attention-Guided Explainable Artificial Intelligence for Computer Vision Models
We examined whether embedding human attention knowledge into saliency-ba...

08/22/2019
Saliency Methods for Explaining Adversarial Attacks
In this work, we aim to explain the classifications of adversary images ...

04/01/2023
Vision Transformers with Mixed-Resolution Tokenization
Vision Transformer models process input images by dividing them into a s...

07/11/2021
One Map Does Not Fit All: Evaluating Saliency Map Explanation on Multi-Modal Medical Images
Being able to explain the prediction to clinical end-users is a necessit...

03/26/2021
Building Reliable Explanations of Unreliable Neural Networks: Locally Smoothing Perspective of Model Interpretation
We present a novel method for reliably explaining the predictions of neu...

05/06/2022
GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers
There has been a growing interest in interpreting the underlying dynamic...
