
SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval

by Tiancheng Zhao, et al.

We introduce SPARTA, a novel neural retrieval method that shows great promise in performance, generalization, and interpretability for open-domain question answering. Unlike many neural ranking methods that use dense vector nearest neighbor search, SPARTA learns a sparse representation that can be efficiently implemented as an inverted index. The resulting representation enables scalable neural retrieval that does not require expensive approximate vector search and leads to better performance than its dense counterpart. We validated our approach on 4 open-domain question answering (OpenQA) tasks and 11 retrieval question answering (ReQA) tasks. SPARTA achieves new state-of-the-art results across a variety of open-domain question answering tasks on both English and Chinese datasets, including open SQuAD, Natural Questions, and CMRC. Analysis also confirms that the proposed method creates human-interpretable representations and allows flexible control over the trade-off between performance and efficiency.
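To make the inverted-index idea concrete, the following is a minimal sketch of sparse weighted retrieval, not the SPARTA implementation. It assumes each passage already has a sparse map from terms to weights (hand-written toy values here; in a neural method these would be learned), and the names `build_index`, `search`, and `top_k` are hypothetical. Keeping only the `top_k` heaviest terms per passage is one simple way to illustrate a trade-off between index size and retrieval quality.

```python
from collections import defaultdict


def build_index(passage_term_weights, top_k=None):
    """Build an inverted index: term -> list of (passage_id, weight).

    passage_term_weights maps passage_id -> {term: weight}.
    If top_k is given, only the top_k heaviest terms per passage are kept
    (an illustrative sparsity knob, assumed for this sketch).
    """
    index = defaultdict(list)
    for pid, weights in passage_term_weights.items():
        items = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
        if top_k is not None:
            items = items[:top_k]
        for term, weight in items:
            index[term].append((pid, weight))
    return index


def search(index, query_terms, n=3):
    """Score passages by summing stored weights of matching query terms."""
    scores = defaultdict(float)
    for term in query_terms:
        for pid, weight in index.get(term, []):
            scores[pid] += weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Toy, hand-written weights (hypothetical; not taken from the paper).
toy = {
    "p1": {"paris": 2.0, "france": 1.5, "capital": 1.0},
    "p2": {"berlin": 2.0, "germany": 1.5, "capital": 1.0},
}

full_index = build_index(toy)
pruned_index = build_index(toy, top_k=2)

full_hits = search(full_index, ["capital", "france"])
pruned_hits = search(pruned_index, ["capital", "france"])
```

With the full index both passages match the query, while the pruned index drops the lower-weight term "capital" and returns only the passage that matches "france". Query time depends on the lengths of the posting lists touched, with no vector nearest-neighbor search involved.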




Dense Passage Retrieval for Open-Domain Question Answering

Open-domain question answering relies on efficient passage retrieval to ...

Improving Passage Retrieval with Zero-Shot Question Generation

We propose a simple and effective re-ranking method for improving passag...

Generation-Augmented Retrieval for Open-domain Question Answering

Conventional sparse retrieval methods such as TF-IDF and BM25 are simple...

Progressively Pretrained Dense Corpus Index for Open-Domain Question Answering

To extract answers from a large corpus, open-domain question answering (...

Multi-step Retriever-Reader Interaction for Scalable Open-domain Question Answering

This paper introduces a new framework for open-domain question answering...

Towards Universal Dense Retrieval for Open-domain Question Answering

In open-domain question answering, a model receives a text question as i...

Text Embeddings for Retrieval From a Large Knowledge Base

Text embedding representing natural language documents in a semantic vec...