Transformer-Based Language Models for Similar Text Retrieval and Ranking

05/10/2020
by   Javed Qadrud-Din, et al.

Most approaches for similar text retrieval and ranking with long natural language queries rely at some level on queries and responses having words in common with each other. Recent applications of transformer-based neural language models to text retrieval and ranking problems have been very promising, but still involve a two-step process in which result candidates are first obtained through bag-of-words-based approaches, and then reranked by a neural transformer. In this paper, we introduce novel approaches for effectively applying neural transformer models to similar text retrieval and ranking without an initial bag-of-words-based step. By eliminating the bag-of-words-based step, our approach is able to accurately retrieve and rank results even when they have no non-stopwords in common with the query. We accomplish this by using bidirectional encoder representations from transformers (BERT) to create vectorized representations of sentence-length texts, along with a vector nearest neighbor search index. We demonstrate both supervised and unsupervised means of using BERT to accomplish this task.
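
The retrieval step described above (embed each text as a vector, then look up the query's nearest neighbors in a vector index) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the sentence vectors have already been produced by some encoder such as BERT, and it swaps in an exact brute-force cosine search where a production system would likely use an approximate nearest neighbor index. The names `normalize` and `BruteForceIndex`, and the 3-dimensional toy vectors, are hypothetical and exist only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


class BruteForceIndex:
    """Exact nearest-neighbor index (hypothetical helper, not from the paper)."""

    def __init__(self) -> None:
        self._ids: List[str] = []
        self._vectors: List[List[float]] = []

    def add(self, text_id: str, vector: Sequence[float]) -> None:
        # Store unit-length vectors so search only needs dot products.
        self._ids.append(text_id)
        self._vectors.append(normalize(vector))

    def search(self, query: Sequence[float], k: int) -> List[Tuple[str, float]]:
        # Score every stored vector against the query and keep the top k.
        q = normalize(query)
        scored = [
            (text_id, sum(a * b for a, b in zip(q, vec)))
            for text_id, vec in zip(self._ids, self._vectors)
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


if __name__ == "__main__":
    # Toy vectors standing in for encoder output; the values are made up.
    index = BruteForceIndex()
    index.add("doc-1", [0.9, 0.1, 0.0])
    index.add("doc-2", [0.0, 1.0, 0.0])
    index.add("doc-3", [0.1, 0.0, 0.9])
    for text_id, score in index.search([1.0, 0.0, 0.0], k=2):
        print(text_id, round(score, 3))
```

Because ranking depends only on vector similarity, a result can score highly without sharing any words with the query, provided the encoder places the two texts close together. Brute-force search costs time linear in the collection size, which is why large collections typically need an approximate index instead.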


research
10/20/2020

CoRT: Complementary Rankings from Transformers

Recent approaches towards passage retrieval have successfully employed r...
research
10/13/2020

Pretrained Transformers for Text Ranking: BERT and Beyond

The goal of text ranking is to generate an ordered list of texts retriev...
research
02/05/2020

Experiments with Different Indexing Techniques for Text Retrieval tasks on Gujarati Language using Bag of Words Approach

This paper presents results of various experiments carried out to improv...
research
10/23/2019

Context-Aware Sentence/Passage Term Importance Estimation For First Stage Retrieval

Term frequency is a common method for identifying the importance of a te...
research
10/15/2021

Cascaded Fast and Slow Models for Efficient Semantic Code Search

The goal of natural language semantic code search is to retrieve a seman...
research
12/28/2013

Stopping Rules for Bag-of-Words Image Search and Its Application in Appearance-Based Localization

We propose a technique to improve the search efficiency of the bag-of-wo...
