Evaluating Saliency Methods for Neural Language Models

04/12/2021
by Shuoyang Ding et al.

Saliency methods are widely used to interpret neural network predictions, but different variants of saliency methods often disagree even on the interpretations of the same prediction made by the same model. In these cases, how do we identify when these interpretations are trustworthy enough to be used in analyses? To address this question, we conduct a comprehensive and quantitative evaluation of saliency methods on a fundamental category of NLP models: neural language models. We evaluate the quality of prediction interpretations from two perspectives, each representing a desirable property of these interpretations: plausibility and faithfulness. Our evaluation is conducted on four different datasets constructed from existing human annotations of syntactic and semantic agreement, at both the sentence level and the document level. Through this evaluation, we identify various ways in which saliency methods can yield interpretations of low quality. We recommend that future work deploying such methods to neural language models validate their interpretations carefully before drawing insights from them.
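
For context, saliency methods of the kind evaluated here assign each input token a score reflecting its influence on the model's prediction. The sketch below illustrates one common variant, gradient x input, applied to a toy PyTorch language model; the model, vocabulary, token IDs, and target are hypothetical stand-ins for illustration, not the paper's setup or datasets.

    # Minimal, illustrative gradient-x-input saliency for a toy language model.
    # All names and values here are hypothetical; real analyses would use a
    # trained model and a tokenized corpus.
    import torch
    import torch.nn as nn

    class TinyLM(nn.Module):
        def __init__(self, vocab_size=100, dim=32):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, dim)
            self.lstm = nn.LSTM(dim, dim, batch_first=True)
            self.out = nn.Linear(dim, vocab_size)

        def forward(self, emb):
            hidden, _ = self.lstm(emb)
            return self.out(hidden[:, -1])   # next-token logits at the last position

    torch.manual_seed(0)
    model = TinyLM()
    tokens = torch.tensor([[5, 17, 42, 8]])  # a hypothetical 4-token context
    target = 23                              # hypothetical next-token ID

    emb = model.embed(tokens)     # (1, seq_len, dim) embeddings of the context
    emb.retain_grad()             # keep the gradient on this non-leaf tensor
    logits = model(emb)
    logits[0, target].backward()  # gradient of the target token's score

    # Gradient x input: one attribution score per context token.
    saliency = (emb.grad * emb).sum(dim=-1).squeeze(0)
    print(saliency.tolist())

Each entry of the resulting vector is an interpretation of how much the corresponding context token contributed to the predicted token's score; the paper's evaluation asks whether such scores are plausible (agree with human annotation) and faithful (reflect the model's actual computation).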
