End-to-End Retrieval in Continuous Space

11/19/2018
by Daniel Gillick, et al.

Most text-based information retrieval (IR) systems index objects by words or phrases. These discrete systems have been augmented by models that use embeddings to measure similarity in continuous space, but continuous-space models are typically used just to re-rank the top candidates. We consider the problem of end-to-end continuous retrieval, where standard approximate nearest neighbor (ANN) search replaces the usual discrete inverted index and retrieval relies entirely on distances between learned embeddings. By training simple models specifically for retrieval, with an appropriate model architecture, we improve on a discrete baseline by 8% and 26% (MAP) on two similar-question retrieval tasks. We also discuss the problem of evaluation for retrieval systems, and show how to modify existing pairwise similarity datasets for this purpose.
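
To make the idea of retrieving purely by embedding distance concrete, here is a minimal sketch. It is not the authors' system. It assumes tiny hand-written 3-dimensional vectors in place of learned embeddings, and it uses an exact linear scan with cosine similarity as a stand-in for a real ANN index. The names `cosine_similarity`, `retrieve`, and `DOC_EMBEDDINGS` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, index, k=2):
    """Return the ids of the k documents most similar to the query.

    This is an exact linear scan over every document. An ANN index
    would approximate the same ranking without scoring all of them.
    """
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


# Hypothetical toy embeddings (assumption: a trained encoder would
# normally produce these from document text).
DOC_EMBEDDINGS = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.9, 0.0],
    "doc_c": [0.7, 0.3, 0.1],
}

query = [1.0, 0.0, 0.0]
print(retrieve(query, DOC_EMBEDDINGS, k=2))
```

No inverted index or word matching is involved here: the ranking depends only on vector geometry, which is the property the abstract describes. Replacing the linear scan with an approximate index would trade a small amount of recall for sublinear query time.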
