Rethinking Attention-Model Explainability through Faithfulness Violation Test

by Yibing Liu, et al.

Attention mechanisms dominate the explainability of deep models. They produce probability distributions over the input, which are widely regarded as feature-importance indicators. However, in this paper, we find one critical limitation in attention explanations: weakness in identifying the polarity of feature impact. This can be misleading: features with higher attention weights may not faithfully contribute to model predictions; instead, they can impose suppression effects. With this finding, we reflect on the explainability of current attention-based techniques, such as Attention⊙Gradient and LRP-based attention explanations. We first propose an actionable diagnostic methodology (henceforth, the faithfulness violation test) to measure the consistency between explanation weights and the impact polarity. Through extensive experiments, we then show that most tested explanation methods are unexpectedly hindered by the faithfulness violation issue, especially raw attention. Empirical analyses of the factors affecting violation issues further provide useful observations for adopting explanation methods in attention models.
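To make the idea of a polarity check concrete, here is a minimal sketch of one way such a test could be set up. It is not the authors' implementation: the toy logistic model, the zero-masking used to estimate impact, the choice to inspect only the top-weighted feature, and all names (`toy_model`, `impact`, `is_violation`, `violation_rate`) are assumptions made for illustration.

```python
import math

def sign(value, eps=1e-12):
    """Return +1, -1, or 0 depending on the sign of value."""
    if value > eps:
        return 1
    if value < -eps:
        return -1
    return 0

def toy_model(features, coefs):
    """Hypothetical stand-in for a trained model: logistic score in (0, 1)."""
    z = sum(c * x for c, x in zip(coefs, features))
    return 1.0 / (1.0 + math.exp(-z))

def impact(features, coefs, index):
    """Assumed impact estimate: output change when one feature is zero-masked.

    Positive means the feature supports the prediction; negative means
    it suppresses the prediction.
    """
    masked = list(features)
    masked[index] = 0.0
    return toy_model(features, coefs) - toy_model(masked, coefs)

def is_violation(features, coefs, weights):
    """Flag a violation when the top-weighted feature's explanation sign
    disagrees with the sign of its estimated impact."""
    top = max(range(len(weights)), key=lambda i: abs(weights[i]))
    return sign(weights[top]) != sign(impact(features, coefs, top))

def violation_rate(samples, coefs):
    """Fraction of (features, weights) pairs that are flagged."""
    flags = [is_violation(x, coefs, w) for x, w in samples]
    return sum(flags) / len(flags)

# Illustrative data only: the second feature suppresses the prediction.
coefs = [2.0, -3.0, 0.5]
samples = [
    ([1.0, 1.0, 1.0], [0.2, 0.7, 0.1]),  # top weight on the suppressing feature
    ([1.0, 1.0, 1.0], [0.6, 0.3, 0.1]),  # top weight on a supporting feature
]
rate = violation_rate(samples, coefs)
print(f"violation rate: {rate:.2f}")
```

Because raw attention weights are non-negative, a check of this kind flags them whenever the most-attended feature actually suppresses the prediction, which is the failure mode the abstract describes.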

