Are Interpretations Fairly Evaluated? A Definition Driven Pipeline for Post-Hoc Interpretability

09/16/2020
by Ninghao Liu, et al.

Recent years have witnessed an increasing number of interpretation methods developed to improve the transparency of NLP models. Meanwhile, researchers have also tried to answer the question of whether the obtained interpretations faithfully explain the mechanisms behind model predictions. Specifically, Jain and Wallace (2019) propose that "attention is not explanation" by comparing attention interpretations with gradient-based alternatives. However, this raises a new question: can we safely pick one interpretation method as the ground truth? If not, on what basis can we compare different interpretation methods? In this work, we propose that it is crucial to have a concrete definition of interpretation before we can evaluate its faithfulness. The definition affects both the algorithm used to obtain the interpretation and, more importantly, the metric used in evaluation. Through both theoretical and experimental analysis, we find that although interpretation methods perform differently under a given evaluation metric, such differences may not result from interpretation quality or faithfulness, but rather from the inherent bias of the evaluation metric.
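To make the abstract's point concrete, the sketch below implements one common evaluation protocol, a deletion-style faithfulness test, which can score any token-importance interpretation (attention weights and gradient saliency alike). This is a minimal illustration, not the paper's actual method: `model`, `input_ids`, and `mask_id` are hypothetical placeholders standing in for a PyTorch text classifier and its tokenizer.

```python
# A minimal sketch of a deletion-style faithfulness metric, assuming a
# PyTorch text classifier that maps token ids to class logits.
# `model`, `input_ids`, and `mask_id` are hypothetical placeholders.
import torch

def deletion_drop(model, input_ids, scores, mask_id, k=5):
    """Mask the k tokens ranked most important by `scores` and return
    the drop in the predicted class probability. Larger drops are
    commonly read as evidence of a more faithful interpretation."""
    model.eval()
    with torch.no_grad():
        probs = model(input_ids).softmax(dim=-1)   # shape (1, num_classes)
        target = probs.argmax(dim=-1)              # predicted class
        top_k = scores.topk(k).indices             # "most important" tokens
        masked = input_ids.clone()
        masked[0, top_k] = mask_id                 # e.g. [MASK] or pad id
        new_probs = model(masked).softmax(dim=-1)
    return (probs[0, target] - new_probs[0, target]).item()

# Comparing two interpretations of the same input under the same metric,
# where each score vector holds per-token importance of shape (seq_len,):
# attn_drop = deletion_drop(model, input_ids, attn_scores, mask_id)
# grad_drop = deletion_drop(model, input_ids, grad_scores, mask_id)
```

Note the built-in bias the abstract warns about: such a perturbation-based metric encodes a particular definition of interpretation (important tokens are those whose removal hurts the prediction), so an interpretation method whose definition is already aligned with it, such as input-sensitivity-based gradients, may score higher without being the more faithful explanation.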

READ FULL TEXT


research
04/07/2020

Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?

With the growing popularity of deep-learning based NLP models, comes a n...
research
04/12/2022

A Comparative Study of Faithfulness Metrics for Model Interpretability Methods

Interpretation methods to reveal the internal reasoning processes behind...
research
09/12/2021

The Logic Traps in Evaluating Post-hoc Interpretations

Post-hoc interpretation aims to explain a trained model and reveal how t...
research
01/30/2023

Evaluating Neuron Interpretation Methods of NLP Models

Neuron Interpretation has gained traction in the field of interpretabili...
research
05/17/2023

FICNN: A Framework for the Interpretation of Deep Convolutional Neural Networks

With the continued development of Convolutional Neural Networks (CNNs), t...
research
06/06/2009

On Defining 'I' "I logy"

Could we define I? Throughout this article we give a negative answer to ...
research
10/29/2021

Towards Comparative Physical Interpretation of Spatial Variability Aware Neural Networks: A Summary of Results

Given Spatial Variability Aware Neural Networks (SVANNs), the goal is to...
