Log In Sign Up

Using the Hammer Only on Nails: A Hybrid Method for Evidence Retrieval for Question Answering

by   Zhengzhong Liang, et al.

Evidence retrieval is a key component of explainable question answering (QA). We argue that, despite recent progress, transformer network-based approaches such as universal sentence encoder (USE-QA) do not always outperform traditional information retrieval (IR) methods such as BM25 for evidence retrieval for QA. We introduce a lexical probing task that validates this observation: we demonstrate that neural IR methods have the capacity to capture lexical differences between questions and answers, but miss obvious lexical overlap signal. Learning from this probing analysis, we introduce a hybrid approach for evidence retrieval that combines the advantages of both IR directions. Our approach uses a routing classifier that learns when to direct incoming questions to BM25 vs. USE-QA for evidence retrieval using very simple statistics, which can be efficiently extracted from the top candidate evidence sentences produced by a BM25 model. We demonstrate that this hybrid evidence retrieval generally performs better than either individual retrieval strategy on three QA datasets: OpenBookQA, ReQA SQuAD, and ReQA NQ. Furthermore, we show that the proposed routing strategy is considerably faster than neural methods, with a runtime that is up to 5 times faster than USE-QA.


page 1

page 2

page 3

page 4


Multi-step Entity-centric Information Retrieval for Multi-Hop Question Answering

Multi-hop question answering (QA) requires an information retrieval (IR)...

Latent Retrieval for Weakly Supervised Open Domain Question Answering

Recent work on open domain question answering (QA) assumes strong superv...

Unsupervised Alignment-based Iterative Evidence Retrieval for Multi-hop Question Answering

Evidence retrieval is a critical stage of question answering (QA), neces...

Frustratingly Hard Evidence Retrieval for QA Over Books

A lot of progress has been made to improve question answering (QA) in re...

Can Question Generation Debias Question Answering Models? A Case Study on Question-Context Lexical Overlap

Question answering (QA) models for reading comprehension have been demon...

Progressively Pretrained Dense Corpus Index for Open-Domain Question Answering

To extract answers from a large corpus, open-domain question answering (...

Relevance-guided Supervision for OpenQA with ColBERT

Systems for Open-Domain Question Answering (OpenQA) generally depend on ...