On Attribution of Recurrent Neural Network Predictions via Additive Decomposition

03/27/2019
by Mengnan Du, et al.

RNN models have achieved state-of-the-art performance in a wide range of text mining tasks. However, these models are often regarded as black boxes and criticized for their lack of interpretability. In this paper, we enhance the interpretability of RNNs by providing interpretable rationales for their predictions. Interpreting RNNs, however, is a challenging problem. First, unlike existing methods that rely on local approximation, we aim to provide rationales that are faithful to the decision-making process of the RNN model. Second, a flexible interpretation method should be able to assign contribution scores to text segments of varying lengths, rather than only to individual words. To tackle these challenges, we propose a novel attribution method, called REAT, that provides interpretations for RNN predictions. REAT decomposes the final prediction of an RNN into the additive contributions of each word in the input text. This additive decomposition further enables REAT to obtain phrase-level attribution scores. In addition, REAT is generally applicable to various RNN architectures, including GRU, LSTM, and their bidirectional versions. Experimental results demonstrate the faithfulness and interpretability of the proposed attribution method. Comprehensive analysis shows that our attribution method can unveil the useful linguistic knowledge captured by RNNs. Further analysis demonstrates that our method can also serve as a debugging tool to examine the vulnerabilities and failure causes of RNNs, which may point to promising future directions for improving the generalization ability of RNNs.
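
To make the idea concrete, below is a minimal sketch of one simple additive decomposition for a GRU text classifier; it illustrates the general principle rather than the paper's exact REAT formulation. Because the class logit is linear in the final hidden state, it can be rewritten as a sum of per-step terms W(h_t - h_{t-1}) plus the bias, which serve as word-level contribution scores, and phrase-level scores follow by summing over the words in a span. The model components and toy input below are assumed purely for illustration.

# Minimal sketch of additive decomposition for attribution (illustrative, not
# the paper's exact REAT formulation): because the logit is linear in the final
# hidden state, W h_T = W h_0 + sum_t W (h_t - h_{t-1}), so each word t gets
# the score W (h_t - h_{t-1}) and a phrase score is the sum over its words.
import torch
import torch.nn as nn

vocab_size, emb_size, hidden_size, num_classes = 1000, 32, 64, 2
embedding = nn.Embedding(vocab_size, emb_size)          # hypothetical toy model
rnn = nn.GRU(emb_size, hidden_size, batch_first=True)
classifier = nn.Linear(hidden_size, num_classes)

tokens = torch.tensor([[5, 17, 42, 8, 3]])              # placeholder word ids
with torch.no_grad():
    states, _ = rnn(embedding(tokens))                   # (1, T, hidden): h_1 ... h_T
    h0 = torch.zeros(1, 1, hidden_size)                  # GRU initial state is zero
    deltas = torch.cat([h0, states], dim=1).diff(dim=1)  # h_t - h_{t-1} per step
    word_scores = deltas @ classifier.weight.T           # (1, T, num_classes)
    logits = classifier(states[:, -1])                    # actual model prediction

target = logits.argmax(dim=-1).item()
print("word-level scores  :", word_scores[0, :, target].tolist())
print("phrase score (w2-w4):", word_scores[0, 1:4, target].sum().item())
# Sanity check: the word scores plus the bias reconstruct the logit exactly.
print("reconstructed logit:",
      (word_scores[0, :, target].sum() + classifier.bias[target]).item(),
      "vs", logits[0, target].item())

Any decomposition with this sum-to-logit property supports phrase-level attribution simply by grouping adjacent terms, which is the property the abstract highlights for REAT.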


