Dense Hierarchical Retrieval for Open-Domain Question Answering

10/28/2021
by   Ye Liu, et al.
Dense neural text retrieval has achieved promising results on open-domain Question Answering (QA), where latent representations of questions and passages are used for maximum inner product search during retrieval. However, current dense retrievers require splitting documents into short passages, which usually contain local, partial, and sometimes biased context and depend heavily on how the documents are split. As a consequence, these retrievers may yield inaccurate and misleading hidden representations, degrading the final retrieval result. In this work, we propose Dense Hierarchical Retrieval (DHR), a hierarchical framework that generates accurate dense representations of passages by using both the macroscopic semantics of the document and the microscopic semantics specific to each passage. Specifically, a document-level retriever first identifies relevant documents, and a passage-level retriever then retrieves relevant passages from among those documents. The ranking of the retrieved passages is further calibrated by the document-level relevance. In addition, we investigate the hierarchical title structure and two negative sampling strategies (i.e., In-Doc and In-Sec negatives). We apply DHR to large-scale open-domain QA datasets. DHR significantly outperforms the original dense passage retriever and helps an end-to-end QA system outperform strong baselines on multiple open-domain QA benchmarks.
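The two-stage procedure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy vectors, and in particular the additive calibration with weight `lam` are assumptions made for illustration, since the abstract does not state how document-level relevance is combined with the passage score. A real system would use learned encoders and an approximate inner product index rather than plain Python lists.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product, the relevance score used in dense retrieval."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query: Vector, items: Dict[str, Vector], k: int) -> List[Tuple[str, float]]:
    """Exhaustive maximum inner product search over a small collection."""
    scored = [(key, dot(query, vec)) for key, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def hierarchical_retrieve(
    q_doc: Vector,
    q_psg: Vector,
    doc_vecs: Dict[str, Vector],
    psg_vecs: Dict[str, Vector],
    psg_to_doc: Dict[str, str],
    k_docs: int,
    k_psgs: int,
    lam: float = 1.0,  # hypothetical calibration weight, not from the paper
) -> List[Tuple[str, float]]:
    # Stage 1: the document-level retriever picks the candidate documents.
    doc_scores = dict(top_k(q_doc, doc_vecs, k_docs))

    # Stage 2: score only the passages that belong to those documents.
    reranked = []
    for pid, vec in psg_vecs.items():
        doc_id = psg_to_doc[pid]
        if doc_id not in doc_scores:
            continue
        # Calibration (assumed form): passage score plus weighted document score.
        reranked.append((pid, dot(q_psg, vec) + lam * doc_scores[doc_id]))

    reranked.sort(key=lambda pair: pair[1], reverse=True)
    return reranked[:k_psgs]


# Toy data: three documents and four passages in a 2-dimensional space.
doc_vecs = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [-1.0, 0.0]}
psg_vecs = {
    "d1_p1": [0.9, 0.1],
    "d1_p2": [0.2, 0.0],
    "d2_p1": [0.5, 0.5],
    "d3_p1": [1.0, 0.0],  # strong passage score, but its document is irrelevant
}
psg_to_doc = {"d1_p1": "d1", "d1_p2": "d1", "d2_p1": "d2", "d3_p1": "d3"}
query = [1.0, 0.2]

results = hierarchical_retrieve(
    query, query, doc_vecs, psg_vecs, psg_to_doc, k_docs=2, k_psgs=2
)
uncalibrated = hierarchical_retrieve(
    query, query, doc_vecs, psg_vecs, psg_to_doc, k_docs=2, k_psgs=2, lam=0.0
)
```

In this toy setup, passage `d3_p1` is never scored because its document is filtered out in the first stage, and setting `lam=0.0` changes which passage takes second place, which shows how document-level relevance can reorder the passage ranking.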


