Text Embeddings for Retrieval From a Large Knowledge Base

10/24/2018
by   Tolgahan Cakaloglu, et al.

Text embeddings that represent natural language documents in a semantic vector space can be used for document retrieval via nearest-neighbor lookup. To study the feasibility of neural models specialized for semantically meaningful retrieval, we suggest using the Stanford Question Answering Dataset (SQuAD) in an open-domain question answering context, where the first task is to find paragraphs useful for answering a given question. First, we compare the retrieval performance of various text-embedding methods and give an extensive empirical comparison of various non-augmented base embeddings with and without IDF weighting. Our main result is that training deep residual neural models specifically for retrieval can yield significant gains when they are used to augment existing embeddings. We also establish that deeper models are superior for this task. The best baseline embeddings augmented by our learned neural approach improve the top-1 paragraph recall of the system by 14
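
To make the baseline setup concrete, here is a minimal sketch of embedding-based paragraph retrieval with optional IDF weighting, scored by top-k recall. It is not the authors' pipeline: the function names (`idf_weights`, `embed`, `recall_at_k`), the smoothed IDF formula, the toy paragraphs and questions, and the one-hot vectors standing in for pretrained word embeddings are all assumptions made for illustration.

```python
import math
from collections import Counter

import numpy as np


def idf_weights(docs):
    """Smoothed inverse document frequency per token (one common variant)."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def embed(tokens, vectors, idf=None):
    """L2-normalised mean of token vectors, optionally weighted by IDF."""
    dim = len(next(iter(vectors.values())))
    acc = np.zeros(dim)
    total = 0.0
    for tok in tokens:
        if tok not in vectors:  # skip out-of-vocabulary tokens
            continue
        w = idf.get(tok, 1.0) if idf is not None else 1.0
        acc += w * vectors[tok]
        total += w
    if total == 0.0:
        return acc
    acc /= total
    norm = np.linalg.norm(acc)
    return acc / norm if norm > 0 else acc


def recall_at_k(q_emb, p_emb, gold, k=1):
    """Fraction of questions whose gold paragraph is among the k nearest."""
    sims = q_emb @ p_emb.T  # cosine similarity, since rows are unit length
    topk = np.argsort(-sims, axis=1)[:, :k]
    return float(np.mean([g in row for g, row in zip(gold, topk)]))


# Toy corpus (hypothetical data, not SQuAD).
paragraphs = [
    "the cat sat on the mat".split(),
    "the dog chased the ball".split(),
    "the stock market fell".split(),
]
questions = [
    "where did the cat sit".split(),
    "who chased the ball".split(),
    "what happened to the stock market".split(),
]
gold = [0, 1, 2]  # index of the paragraph that answers each question

# One-hot vectors stand in for pretrained word embeddings (assumption).
vocab = sorted({tok for p in paragraphs for tok in p})
eye = np.eye(len(vocab))
vectors = {tok: eye[i] for i, tok in enumerate(vocab)}

idf = idf_weights(paragraphs)

p_plain = np.stack([embed(p, vectors) for p in paragraphs])
q_plain = np.stack([embed(q, vectors) for q in questions])
p_idf = np.stack([embed(p, vectors, idf) for p in paragraphs])
q_idf = np.stack([embed(q, vectors, idf) for q in questions])

recall_plain = recall_at_k(q_plain, p_plain, gold, k=1)
recall_idf = recall_at_k(q_idf, p_idf, gold, k=1)
print("top-1 recall without IDF:", recall_plain)
print("top-1 recall with IDF:", recall_idf)
```

In a realistic setting the one-hot table would be replaced by a pretrained embedding, and the exhaustive similarity matrix by an approximate nearest-neighbor index once the paragraph collection grows large.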

research
02/02/2019

A Multi-Resolution Word Embedding for Document Retrieval from Large Unstructured Knowledge Bases

Deep language models learning a hierarchical representation proved to be...
research
09/28/2020

SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval

We introduce SPARTA, a novel neural retrieval method that shows great pr...
research
09/04/2018

Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering

Question answering is an important task for autonomous agents and virtua...
research
10/06/2022

Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering

Retrieval Augment Generation (RAG) is a recent advancement in Open-Domai...
research
10/05/2022

Contextualized Generative Retrieval

The text retrieval task is mainly performed in two ways: the bi-encoder ...
research
07/07/2023

TRAC: Trustworthy Retrieval Augmented Chatbot

Although conversational AIs have demonstrated fantastic performance, the...
research
04/14/2021

Static Embeddings as Efficient Knowledge Bases?

Recent research investigates factual knowledge stored in large pretraine...
