Calibrating Trust of Multi-Hop Question Answering Systems with Decompositional Probes

by Kaige Xie et al.

Multi-hop Question Answering (QA) is a challenging task since it requires an accurate aggregation of information from multiple context paragraphs and a thorough understanding of the underlying reasoning chains. Recent work in multi-hop QA has shown that performance can be boosted by first decomposing the questions into simpler, single-hop questions. In this paper, we explore an additional utility of multi-hop decomposition from the perspective of explainable NLP: creating explanations by probing a neural QA model with the decomposed questions. We hypothesize that, in doing so, users will be better able to construct a mental model of when the underlying QA system will give the correct answer. Through human participant studies, we verify that exposing the decomposition probes and their answers to users increases their ability to predict system performance on a per-question basis. We show that decomposition is an effective form of probing QA systems as well as a promising approach to explanation generation. In-depth analyses show the need for improvements in decomposition systems.
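The probing setup the abstract describes can be sketched as follows. This is a toy illustration, not the authors' implementation: the decomposition is hard-coded (in the paper it would come from a learned decomposition model), and a simple word-overlap lookup stands in for the neural QA model. The question, facts, and function names are all hypothetical.

```python
def decompose(question):
    """Toy decomposition: returns hard-coded single-hop probes.
    In practice, a trained decomposition model would produce these."""
    return [
        "Who directed Inception?",
        "What year was Christopher Nolan born?",
    ]

def qa_model(question, facts):
    """Stand-in for a neural QA model: returns the fact with the
    largest word overlap with the question (illustration only)."""
    q_words = {w.strip(".?").lower() for w in question.split()}
    def overlap(fact):
        return len(q_words & {w.strip(".?").lower() for w in fact.split()})
    return max(facts, key=overlap)

# A two-hop question and a tiny context of supporting facts.
facts = [
    "Christopher Nolan directed Inception.",
    "Christopher Nolan was born in 1970.",
]
multi_hop = "What year was the director of Inception born?"

# Probe the QA model with each single-hop question; the (probe, answer)
# pairs form the explanation shown to the user.
probes = decompose(multi_hop)
explanation = [(p, qa_model(p, facts)) for p in probes]
for probe, answer in explanation:
    print(f"{probe} -> {answer}")
```

The idea is that a user inspecting the probe answers can judge whether the system has resolved each hop correctly, and thereby calibrate their trust in its final answer.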



