Evaluating Attribution Methods using White-Box LSTMs

10/16/2020
by Yiding Hao, et al.

Interpretability methods for neural networks are difficult to evaluate because we do not understand the black-box models typically used to test them. This paper proposes a framework in which interpretability methods are evaluated using manually constructed networks, which we call white-box networks, whose behavior is understood a priori. We evaluate five methods for producing attribution heatmaps by applying them to white-box LSTM classifiers for tasks based on formal languages. Although our white-box classifiers solve their tasks perfectly and transparently, we find that all five attribution methods fail to produce the expected model explanations.
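To make the evaluation idea concrete, here is a minimal sketch in Python. It is not the paper's LSTM construction: the two hand-built recurrent scorers, the toy tasks, the `NEUTRAL` padding symbol, and all function names are assumptions made for illustration. Occlusion is used only because it is a simple, familiar attribution method; the abstract does not say which five methods were evaluated. Because each scorer is written by hand, the expected heatmap is known in advance and can be compared with what the attribution method returns.

```python
from typing import Callable, List

NEUTRAL = "_"  # hypothetical padding symbol that neither scorer reacts to


def counter_score(tokens: List[str]) -> float:
    """Hand-built recurrent scorer: +1 for each 'a', -1 for each 'b'."""
    state = 0.0
    for tok in tokens:
        if tok == "a":
            state += 1.0
        elif tok == "b":
            state -= 1.0
    return state


def contains_a_score(tokens: List[str]) -> float:
    """Hand-built saturating scorer: 1.0 once any 'a' has been seen."""
    state = 0.0
    for tok in tokens:
        state = max(state, 1.0 if tok == "a" else 0.0)
    return state


def occlusion(score_fn: Callable[[List[str]], float],
              tokens: List[str]) -> List[float]:
    """Attribute to each position the score drop caused by occluding it."""
    base = score_fn(tokens)
    attributions = []
    for i in range(len(tokens)):
        occluded = tokens[:i] + [NEUTRAL] + tokens[i + 1:]
        attributions.append(base - score_fn(occluded))
    return attributions


# Linear counter: the heatmap matches the known contribution of each token.
print(occlusion(counter_score, list("aab_")))  # [1.0, 1.0, -1.0, 0.0]

# Saturating detector: both 'a' tokens matter, yet each receives zero.
print(occlusion(contains_a_score, list("aa")))  # [0.0, 0.0]
```

In this toy setting the second call shows one way an attribution method can disagree with a model whose behavior is fully understood: removing either `a` leaves the output unchanged, so occlusion assigns no relevance to tokens that the scorer demonstrably relies on.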


research
06/13/2022

Making Sense of Dependence: Efficient Black-box Explanations Using Dependence Measure

This paper presents a new efficient black-box attribution method based o...

research
08/18/2023

On Gradient-like Explanation under a Black-box Setting: When Black-box Explanations Become as Good as White-box

Attribution methods shed light on the explainability of data-driven appr...

research
02/23/2022

Training Characteristic Functions with Reinforcement Learning: XAI-methods play Connect Four

One of the goals of Explainable AI (XAI) is to determine which input com...

research
07/06/2023

A Vulnerability of Attribution Methods Using Pre-Softmax Scores

We discuss a vulnerability involving a category of attribution methods u...

research
12/22/2022

Impossibility Theorems for Feature Attribution

Despite a sea of interpretability methods that can produce plausible exp...

research
07/03/2023

Fixing confirmation bias in feature attribution methods via semantic match

Feature attribution methods have become a staple method to disentangle t...

research
05/10/2022

White-box Testing of NLP models with Mask Neuron Coverage

Recent literature has seen growing interest in using black-box strategie...
