Learning to Estimate Shapley Values with Vision Transformers

06/10/2022
by Ian Covert, et al.

Transformers have become a default architecture in computer vision, but understanding what drives their predictions remains a challenging problem. Current explanation approaches rely on attention values or input gradients, but these give a limited understanding of a model's dependencies. Shapley values offer a theoretically sound alternative, but their computational cost makes them impractical for large, high-dimensional models. In this work, we aim to make Shapley values practical for vision transformers (ViTs). To do so, we first leverage an attention masking approach to evaluate ViTs with partial information, and we then develop a procedure for generating Shapley value explanations via a separate, learned explainer model. Our experiments compare Shapley values to many baseline methods (e.g., attention rollout, GradCAM, LRP), and we find that our approach provides more accurate explanations than any existing method for ViTs.
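The abstract's two ingredients, evaluating a ViT on partial inputs via attention masking and attributing a prediction to image patches with Shapley values, can be sketched concretely. The sketch below is an illustration under stated assumptions, not the paper's method: a toy single-block self-attention model stands in for a real ViT, and Shapley values are estimated by Monte Carlo permutation sampling rather than by the paper's learned explainer. All names (ToyViT, shapley_mc, the patch dimensions) are hypothetical.

```python
# Illustrative sketch only: a toy one-block "ViT" and a Monte Carlo
# permutation estimate of per-patch Shapley values. The paper instead
# trains a separate explainer model to amortize this computation.

import torch
import torch.nn as nn

class ToyViT(nn.Module):
    """One self-attention block over patch tokens with a CLS-style head."""
    def __init__(self, n_patches=16, dim=32, n_classes=10):
        super().__init__()
        self.embed = nn.Linear(3 * 8 * 8, dim)   # flattened 8x8 RGB patches
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, patches, keep):
        # patches: (B, P, 3*8*8); keep: (B, P) boolean, True = patch visible.
        x = self.embed(patches)
        x = torch.cat([self.cls.expand(x.size(0), -1, -1), x], dim=1)
        # Attention masking: hidden patches are excluded as attention keys,
        # so the model is evaluated with partial information.
        pad = torch.cat([torch.ones_like(keep[:, :1]), keep], dim=1)  # CLS stays visible
        out, _ = self.attn(x, x, x, key_padding_mask=~pad)
        return self.head(out[:, 0])              # predict from the CLS token

def shapley_mc(model, patches, target, n_samples=64):
    """Monte Carlo permutation estimate of per-patch Shapley values."""
    B, P, _ = patches.shape
    assert B == 1, "sketch handles one image at a time"
    values = torch.zeros(P)
    with torch.no_grad():
        for _ in range(n_samples):
            perm = torch.randperm(P)
            keep = torch.zeros(1, P, dtype=torch.bool)
            prev = model(patches, keep).softmax(-1)[0, target]
            for i in perm:                        # reveal patches one by one
                keep[0, i] = True
                curr = model(patches, keep).softmax(-1)[0, target]
                values[i] += curr - prev          # marginal contribution
                prev = curr
    return values / n_samples

model = ToyViT()
patches = torch.randn(1, 16, 3 * 8 * 8)           # stand-in for a real image
print(shapley_mc(model, patches, target=3))
```

Even on this toy model, the sampling loop needs hundreds of masked forward passes per image, which is the computational cost the abstract refers to; the paper's learned explainer produces the explanation in a single forward pass instead.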


Related research

01/20/2023 · Holistically Explainable Vision Transformers
Transformers increasingly dominate the machine learning landscape across...

04/12/2023 · Towards Evaluating Explanations of Vision Transformers for Medical Imaging
As deep learning models increasingly find applications in critical domai...

06/01/2022 · Fair Comparison between Efficient Attentions
Transformers have been successfully used in various fields and are becom...

02/15/2022 · XAI for Transformers: Better Explanations through Conservative Propagation
Transformers have become an important workhorse of machine learning, wit...

06/07/2022 · Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse
Transformers have achieved remarkable success in several domains, rangin...

12/13/2022 · What do Vision Transformers Learn? A Visual Exploration
Vision transformers (ViTs) are quickly becoming the de-facto architectur...

08/30/2023 · Learning Diverse Features in Vision Transformers for Improved Generalization
Deep learning models often rely only on a small set of features even whe...
