"Will You Find These Shortcuts?" A Protocol for Evaluating the Faithfulness of Input Salience Methods for Text Classification

11/14/2021
by Jasmijn Bastings, et al.

Feature attribution methods, also known as input salience methods, assign an importance score to each input feature. They are abundant, but may produce surprisingly different results for the same model on the same input. While differences are expected if disparate definitions of importance are assumed, most methods claim to provide faithful attributions and to point at the features most relevant for a model's prediction. Existing work on faithfulness evaluation is not conclusive and does not provide a clear answer as to how different methods are to be compared. Focusing on text classification and the model debugging scenario, our main contribution is a protocol for faithfulness evaluation that makes use of partially synthetic data to obtain ground truth for feature importance ranking. Following the protocol, we conduct an in-depth analysis of four standard salience method classes on a range of datasets and shortcuts for BERT and LSTM models, and demonstrate that some of the most popular method configurations provide poor results even for the simplest shortcuts. We recommend following the protocol for each new task and model combination to find the best method for identifying shortcuts.
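
To make the idea of ground truth from partially synthetic data concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the shortcut token "zeroa", the helper names inject_shortcut and precision_at_k, and the toy salience scores are all hypothetical. The sketch inserts a known token into an example, so the position a model trained on such data should rely on is known, and then checks whether a salience ranking places that position at the top.

```python
import random


def inject_shortcut(tokens, shortcut_token, rng):
    """Insert a (hypothetical) shortcut token at a random position.

    Returns the new token list and the index of the inserted token,
    which serves as the ground-truth important position.
    """
    pos = rng.randrange(len(tokens) + 1)
    return tokens[:pos] + [shortcut_token] + tokens[pos:], pos


def precision_at_k(salience, truth_positions):
    """Fraction of the top-k salient positions that are ground truth,
    with k set to the number of ground-truth positions."""
    k = len(truth_positions)
    ranked = sorted(range(len(salience)), key=lambda i: salience[i], reverse=True)
    return len(set(ranked[:k]) & set(truth_positions)) / k


rng = random.Random(0)
tokens, pos = inject_shortcut("the movie was fine".split(), "zeroa", rng)

# Toy salience scores standing in for the output of a real salience method.
good = [1.0 if i == pos else 0.1 for i in range(len(tokens))]  # finds the shortcut
bad = [0.1 if i == pos else 1.0 for i in range(len(tokens))]   # misses the shortcut

print(precision_at_k(good, [pos]), precision_at_k(bad, [pos]))
```

In an actual evaluation, the toy scores would be replaced by the scores a salience method assigns for a model that has demonstrably learned the shortcut, and the metric would be averaged over many examples.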


research
10/18/2019

Many Faces of Feature Importance: Comparing Built-in and Post-hoc Feature Importance in Text Classification

Feature importance is commonly used to explain machine predictions. Whil...
research
10/28/2020

A Chinese Text Classification Method With Low Hardware Requirement Based on Improved Model Concatenation

In order to improve the accuracy performance of Chinese text classificat...
research
04/05/2023

ECG Feature Importance Rankings: Cardiologists vs. Algorithms

Feature importance methods promise to provide a ranking of features acco...
research
04/16/2021

Variable Instance-Level Explainability for Text Classification

Despite the high accuracy of pretrained transformer networks in text cla...
research
04/26/2021

Towards Rigorous Interpretations: a Formalisation of Feature Attribution

Feature attribution is often loosely presented as the process of selecti...
research
03/18/2021

Neural Network Attribution Methods for Problems in Geoscience: A Novel Synthetic Benchmark Dataset

Despite the increasingly successful application of neural networks to ma...
research
02/24/2023

Don't be fooled: label leakage in explanation methods and the importance of their quantitative evaluation

Feature attribution methods identify which features of an input most inf...
