How Well Do Multi-hop Reading Comprehension Models Understand Date Information?

10/11/2022
by Xanh Ho, et al.

Several multi-hop reading comprehension datasets have been proposed to resolve the issue of reasoning shortcuts by which questions can be answered without performing multi-hop reasoning. However, the ability of multi-hop models to perform step-by-step reasoning when finding an answer to a comparison question remains unclear. It is also unclear how questions about the internal reasoning process are useful for training and evaluating question-answering (QA) systems. To evaluate the model precisely in a hierarchical manner, we first propose a dataset, HieraDate, with three probing tasks in addition to the main question: extraction, reasoning, and robustness. Our dataset is created by enhancing two previous multi-hop datasets, HotpotQA and 2WikiMultiHopQA, focusing on multi-hop questions on date information that involve both comparison and numerical reasoning. We then evaluate the ability of existing models to understand date information. Our experimental results reveal that the multi-hop models do not have the ability to subtract two dates even when they perform well in date comparison and number subtraction tasks. Other results reveal that our probing questions can help to improve the performance of the models (e.g., by +10.3 F1) on the main QA task and our dataset can be used for data augmentation to improve the robustness of the models.
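To make the hierarchy of sub-tasks concrete, here is a minimal Python sketch of the steps behind a date comparison question: extract the two dates, compare them, and subtract them. It is an illustration only, not the HieraDate format or the authors' method. The function name `probe_date_question`, the returned keys, and the example entities and dates are all hypothetical, and the robustness task is not covered.

```python
from datetime import date


def probe_date_question(name_a, date_a, name_b, date_b):
    """Answer 'which entity has the earlier date?' and expose each step.

    Hypothetical helper: names and output keys are assumptions made for
    illustration, not taken from the dataset.
    """
    # Step 1 (extraction): the date attached to each entity.
    extraction = {name_a: date_a.isoformat(), name_b: date_b.isoformat()}

    # Step 2 (reasoning): compare the two dates, then subtract them.
    if date_a == date_b:
        earlier = None  # no entity is earlier when the dates coincide
    else:
        earlier = name_a if date_a < date_b else name_b
    gap_days = abs((date_a - date_b).days)

    # Step 3 (main question): the final answer follows from the comparison.
    return {"extraction": extraction, "earlier": earlier, "gap_days": gap_days}


# Made-up entities and dates; 2000 is a leap year, so February has 29 days.
result = probe_date_question(
    "Person A", date(2000, 2, 1), "Person B", date(2000, 3, 1)
)
print(result["earlier"], result["gap_days"])  # Person A 29
```

A model can return the right value for `earlier` while getting `gap_days` wrong, which is the kind of gap between date comparison and date subtraction that the abstract reports.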

