Sequential Integrated Gradients: a simple but effective method for explaining language models

05/25/2023
by Joseph Enguehard, et al.

Several explanation methods, such as Integrated Gradients (IG), can be characterised as path-based methods, as they rely on a straight line between the data and an uninformative baseline. However, when applied to language models, these methods produce a path for every word of a sentence simultaneously, which can create sentences of interpolated words that either have no clear meaning or have a significantly different meaning from the original sentence. To keep the meaning of these interpolated sentences as close as possible to the original one, we propose Sequential Integrated Gradients (SIG), which computes the importance of each word in a sentence by keeping every other word fixed and only interpolating between the baseline and the word of interest. Moreover, inspired by the training procedure of several language models, we also propose to replace the baseline token "pad" with the trained token "mask". While being a simple improvement over the original IG method, we show on various models and datasets that SIG is a very effective method for explaining language models.
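To make the idea concrete, below is a minimal PyTorch sketch of the SIG computation as described in the abstract. This is not the authors' implementation: the scalar-output `model` callable over embeddings, the `mask_embedding` baseline, and the Riemann approximation of the path integral with `n_steps` steps are all illustrative assumptions.

import torch

def sequential_integrated_gradients(model, embeddings, mask_embedding, n_steps=50):
    # Hypothetical sketch of SIG. Attributes one token at a time while
    # keeping every other token fixed at its original embedding.
    #
    # model:          assumed callable mapping a (1, seq_len, dim) embedding
    #                 tensor to a scalar score (e.g. a class logit)
    # embeddings:     (seq_len, dim) original input embeddings
    # mask_embedding: (dim,) embedding of the trained "mask" token, used as
    #                 the baseline instead of "pad"
    seq_len, dim = embeddings.shape
    attributions = torch.zeros(seq_len)

    for i in range(seq_len):
        total_grad = torch.zeros(dim)
        for step in range(1, n_steps + 1):
            alpha = step / n_steps
            # Interpolate ONLY token i between the mask baseline and its
            # original embedding; all other tokens keep their original values.
            interpolated = embeddings.detach().clone()
            interpolated[i] = mask_embedding + alpha * (embeddings[i] - mask_embedding)
            interpolated.requires_grad_(True)

            score = model(interpolated.unsqueeze(0))
            # Gradient of the score w.r.t. the interpolated token only.
            grad = torch.autograd.grad(score, interpolated)[0][i]
            total_grad += grad

        # Riemann approximation of the path integral, scaled by the
        # input-minus-baseline difference as in standard IG.
        avg_grad = total_grad / n_steps
        attributions[i] = ((embeddings[i] - mask_embedding) * avg_grad).sum()

    return attributions

The key difference from standard IG is inside the loop: each interpolated input differs from the original sentence in a single token, so every intermediate point stays close in meaning to the original sentence rather than interpolating all words at once.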


