End-to-End Multihop Retrieval for Compositional Question Answering over Long Documents

06/01/2021
by Haitian Sun, et al.

Answering complex questions from long documents requires aggregating multiple pieces of evidence and then predicting the answers. In this paper, we propose a multi-hop retrieval method, DocHopper, to answer compositional questions over long documents. At each step, DocHopper retrieves a paragraph or sentence embedding from the document, mixes the retrieved result with the query, and updates the query for the next step. In contrast to many other retrieval-based methods (e.g., RAG or REALM), the query is not augmented with a token sequence: instead, it is augmented by "numerically" combining it with another neural representation. This means that the model is end-to-end differentiable. We demonstrate that utilizing document structure in this way can largely improve question-answering and retrieval performance on long documents. We experimented with DocHopper on three different QA tasks that require reading long documents to answer compositional questions: discourse entailment reasoning, factual QA with table and text, and information-seeking QA from academic papers. DocHopper outperforms all baseline models and achieves state-of-the-art results on all datasets. Additionally, DocHopper is efficient at inference time, being 3 to 10 times faster than the baselines.
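The retrieve, mix, and update loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `hop` and `multihop`, the softmax-weighted "soft" retrieval, and the additive mixing rule are all assumptions chosen to keep every step a plain numeric operation on vectors (and therefore differentiable in principle), with toy hand-written embeddings standing in for a learned encoder.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def hop(query, passages):
    """One retrieval step (hypothetical helper).

    Scores every passage embedding against the query, forms a
    softmax-weighted average of the embeddings, and mixes that
    average into the query. Additive mixing is an assumption here,
    not the rule used in the paper.
    """
    weights = softmax([dot(query, p) for p in passages])
    retrieved = [
        sum(w * p[i] for w, p in zip(weights, passages))
        for i in range(len(query))
    ]
    new_query = [q + r for q, r in zip(query, retrieved)]
    return new_query, weights


def multihop(query, passages, num_hops=2):
    """Run several hops and record the top-scoring passage per hop."""
    trace = []
    for _ in range(num_hops):
        query, weights = hop(query, passages)
        trace.append(max(range(len(weights)), key=weights.__getitem__))
    return query, trace


if __name__ == "__main__":
    # Toy 3-dimensional embeddings, made up for illustration only.
    passages = [[5.0, 5.0, 0.0], [0.0, 5.0, 5.0], [0.0, 0.0, 1.0]]
    final_query, trace = multihop([1.0, 0.0, 0.0], passages, num_hops=2)
    print("top passage per hop:", trace)
    print("final query vector:", final_query)
```

Because the query is updated with a vector rather than with retrieved tokens, no re-encoding of text is needed between hops, which is the property the abstract contrasts with token-augmenting approaches.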


research
10/13/2021

ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers

We describe a Question Answering (QA) dataset that contains complex ques...
research
09/16/2023

PDFTriage: Question Answering over Long, Structured Documents

Large Language Models (LLMs) have issues with document question answerin...
research
06/09/2016

Key-Value Memory Networks for Directly Reading Documents

Directly reading documents and being able to answer questions from them ...
research
07/10/2019

ReQA: An Evaluation for End-to-End Answer Retrieval Models

Popular QA benchmarks like SQuAD have driven progress on the task of ide...
research
08/31/2021

When Retriever-Reader Meets Scenario-Based Multiple-Choice Questions

Scenario-based question answering (SQA) requires retrieving and reading ...
research
06/09/2021

End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering

We present an end-to-end differentiable training method for retrieval-au...
research
05/21/2018

Efficient and Robust Question Answering from Minimal Context over Documents

Neural models for question answering (QA) over documents have achieved s...
