Evaluating Explanations for Reading Comprehension with Realistic Counterfactuals

04/09/2021 ∙ by Xi Ye, et al.

Token-level attributions have been extensively studied to explain model predictions for a wide range of classification tasks in NLP (e.g., sentiment analysis), but such explanation techniques are less explored for machine reading comprehension (RC) tasks. Although the transformer-based models used for RC are identical to those used for classification, the reasoning they perform is very different, and different types of explanations are required. We propose a methodology to evaluate explanations: an explanation should allow us to understand the RC model's high-level behavior with respect to a set of realistic counterfactual input scenarios. We define these counterfactuals for several RC settings, and by connecting explanation techniques' outputs to high-level model behavior, we can evaluate how useful different explanations really are. Our analysis suggests that pairwise explanation techniques are better suited to RC than token-level attributions, which are often unfaithful in the scenarios we consider. We additionally propose an improvement to an attention-based attribution technique, resulting in explanations that better reveal the model's behavior.
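To make the evaluation idea concrete, the following is a minimal sketch of one way an explanation could be connected to model behavior on counterfactuals. It is an illustration under stated assumptions, not the procedure from the paper: the function names `attribution_mass` and `agreement`, the use of absolute attribution mass, and the 0.5 threshold are all hypothetical choices. The sketch assumes that each counterfactual perturbs a known set of token positions, and that we have separately recorded whether the model's answer changed.

```python
from typing import Sequence


def attribution_mass(scores: Sequence[float], perturbed: Sequence[int]) -> float:
    """Fraction of total absolute attribution that falls on the perturbed tokens.

    `scores` holds one attribution value per input token (hypothetical format);
    `perturbed` lists the token positions that the counterfactual edits.
    """
    total = sum(abs(s) for s in scores)
    if total == 0:
        return 0.0
    return sum(abs(scores[i]) for i in perturbed) / total


def agreement(
    masses: Sequence[float],
    changed: Sequence[bool],
    threshold: float = 0.5,  # assumed cutoff, not taken from the source
) -> float:
    """Share of counterfactuals where the explanation's guess matches the model.

    The explanation "predicts" a changed answer when the attribution mass on
    the perturbed tokens reaches `threshold`; `changed` records whether the
    model's answer actually changed on each counterfactual.
    """
    if len(masses) != len(changed):
        raise ValueError("masses and changed must have the same length")
    if not masses:
        return 0.0
    hits = sum((m >= threshold) == c for m, c in zip(masses, changed))
    return hits / len(masses)


if __name__ == "__main__":
    # Toy attributions over three tokens; the counterfactual edits token 1.
    mass = attribution_mass([1.0, 6.0, 3.0], [1])
    print(mass, agreement([mass, 0.1], [True, False]))
```

Under this framing, an explanation technique is more useful the more often its scores anticipate whether the model's answer changes. A pairwise technique would need a different aggregation, for example over interactions between groups of tokens, which this sketch does not cover.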





