Analyzing the Effectiveness of the Underlying Reasoning Tasks in Multi-hop Question Answering

02/12/2023
by   Xanh Ho, et al.

To explain the predicted answers and evaluate the reasoning abilities of models, several studies have utilized underlying reasoning (UR) tasks in multi-hop question answering (QA) datasets. However, it remains an open question how effective UR tasks are for the QA task when models are trained on both tasks in an end-to-end manner. In this study, we address this question by analyzing the effectiveness of UR tasks (including both sentence-level and entity-level tasks) in three aspects: (1) QA performance, (2) reasoning shortcuts, and (3) robustness. While previous models have not been explicitly trained on an entity-level reasoning prediction task, we build a multi-task model that performs three tasks together: sentence-level supporting facts prediction, entity-level reasoning prediction, and answer prediction. Experimental results on the 2WikiMultiHopQA and HotpotQA-small datasets reveal that (1) UR tasks can improve QA performance. Using four newly created debiased datasets, we demonstrate that (2) UR tasks are helpful in preventing reasoning shortcuts in the multi-hop QA task. However, we find that (3) UR tasks do not contribute to improving the robustness of the model on adversarial questions, such as sub-questions and inverted questions. We encourage future studies to investigate the effectiveness of entity-level reasoning in the form of natural language questions (e.g., sub-question forms).
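The end-to-end multi-task setup described above can be sketched as a weighted sum of the three per-task losses. This is a minimal illustration, not the paper's actual implementation: the function name, the equal default weights, and the example loss values are all assumptions.

```python
# Hypothetical sketch: combining the three training objectives of a
# multi-task multi-hop QA model (answer prediction, sentence-level
# supporting facts prediction, entity-level reasoning prediction).
def multi_task_loss(answer_loss, sentence_sf_loss, entity_reasoning_loss,
                    w_answer=1.0, w_sentence=1.0, w_entity=1.0):
    """Weighted sum of per-task losses; the weights are illustrative,
    not values reported in the paper."""
    return (w_answer * answer_loss
            + w_sentence * sentence_sf_loss
            + w_entity * entity_reasoning_loss)

# Example with made-up per-task cross-entropy values:
total = multi_task_loss(0.9, 0.4, 0.7)
print(total)  # 2.0 with equal weights
```

In practice the relative weights control how strongly the UR tasks regularize the answer-prediction head; the paper's finding that UR tasks help QA performance suggests nonzero weights on the sentence-level and entity-level terms during training.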

