Evaluating self-attention interpretability through human-grounded experimental protocol

03/27/2023
by Milan Bhan, et al.

Attention mechanisms have played a crucial role in the development of complex architectures such as Transformers in natural language processing. However, Transformers remain hard to interpret and are often regarded as black boxes. This paper assesses how the attention coefficients of Transformers can provide interpretability. A new attention-based interpretability method called CLaSsification-Attention (CLS-A) is proposed. CLS-A computes an interpretability score for each word from the distribution of attention coefficients in the part of the Transformer architecture dedicated to the classification task. A human-grounded experiment is conducted to evaluate CLS-A and compare it with other interpretability methods. The experimental protocol rests on the capacity of an interpretability method to provide explanations in line with human reasoning, and its design includes measuring the reaction times and correct-response rates of human subjects. CLS-A performs comparably to the usual interpretability methods in terms of average participant reaction time and accuracy. Its lower computational cost compared with other interpretability methods, and its availability by design within the classifier, make it particularly attractive. The data analysis also highlights the link between a classifier's prediction probability score and the adequacy of the explanation. Finally, our work confirms the relevance of CLS-A and shows to what extent self-attention carries rich information for explaining Transformer classifiers.
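The abstract does not spell out the exact computation, but a plausible reading is that CLS-A aggregates the attention paid by the classification-specific token (e.g., the [CLS] token in BERT-style classifiers) to each input token. The sketch below illustrates that idea with the Hugging Face transformers library; the model name, the choice of averaging over heads and layers, and the use of the [CLS] row of the attention matrices are illustrative assumptions, not the authors' exact procedure.

```python
# Hedged sketch: per-token scores from [CLS]-token attention in a Transformer
# classifier. Assumptions (not taken from the paper): BERT-style model with the
# [CLS] token at position 0, scores obtained by averaging its attention over
# all heads and layers.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "textattack/bert-base-uncased-SST-2"  # hypothetical choice of classifier
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, output_attentions=True)
model.eval()

text = "The movie was surprisingly good."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: tuple over layers, each tensor of shape [batch, heads, seq, seq]
attn = torch.stack(outputs.attentions)   # [layers, batch, heads, seq, seq]
cls_attn = attn[:, 0, :, 0, :]           # attention from [CLS] (position 0) to every token
scores = cls_attn.mean(dim=(0, 1))       # average over layers and heads -> [seq]

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for tok, score in zip(tokens, scores.tolist()):
    print(f"{tok:>12s}  {score:.4f}")
```

Because these attention maps are produced anyway during the classifier's forward pass, a score of this kind is available essentially for free, which is consistent with the low computational cost and by-design availability that the abstract contrasts with other interpretability methods.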


