Measuring and Reducing Non-Multifact Reasoning in Multi-hop Question Answering

05/02/2020
by Harsh Trivedi, et al.

The measurement of true progress in multi-hop question answering has been muddled by the strong ability of models to exploit artifacts and other reasoning shortcuts. Models can produce the correct answer, and even independently identify the supporting facts, without connecting the information across those facts. This defeats the purpose of building multi-hop QA datasets. We make three contributions towards addressing this issue. First, we formalize this form of disconnected reasoning and propose contrastive support sufficiency as a better test of multifact reasoning. To this end, we introduce an automated sufficiency-based dataset transformation that considers all possible partitions of supporting facts, capturing disconnected reasoning. Second, we develop a probe to measure how much a model can cheat (via non-multifact reasoning) on existing tests and on our sufficiency test. Third, we conduct experiments using a transformer-based model (XLNet), demonstrating that the sufficiency transform not only reduces the amount of non-multifact reasoning in this model by 6.5…, while the model sees a 20.8…
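The sufficiency-based transformation described above can be illustrated with a small sketch. This is a hypothetical simplification, not the paper's implementation: given a question's gold supporting facts, it emits the original (sufficient) context plus one insufficient variant for each non-empty subset of supporting facts removed, so a model that truly connects the facts should flag the degraded contexts as unanswerable. The function name and the dictionary schema are assumptions for illustration.

```python
from itertools import combinations

def sufficiency_instances(question, supporting_facts, distractors):
    """Sketch of a sufficiency-based transform: pair the full (answerable)
    context with contrastive variants where some supporting facts are
    removed, making the context insufficient to answer the question."""
    instances = [{
        "question": question,
        "context": supporting_facts + distractors,
        "sufficient": True,  # all supporting facts present
    }]
    n = len(supporting_facts)
    # Remove each non-empty subset of supporting facts to build
    # contrastive, insufficient contexts.
    for k in range(1, n + 1):
        for removed in combinations(range(n), k):
            kept = [f for i, f in enumerate(supporting_facts) if i not in removed]
            instances.append({
                "question": question,
                "context": kept + distractors,
                "sufficient": False,  # at least one needed fact is missing
            })
    return instances
```

For a question with two supporting facts this yields four instances: the sufficient original, two variants each missing one fact, and one missing both. A model exploiting shortcuts can answer from a single fact and so fails to distinguish the sufficient context from the degraded ones.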


