Beyond Lexical: A Semantic Retrieval Framework for Textual Search Engine

08/10/2020
by Kuan Fang, et al.

Search engines have become a fundamental component of many web and mobile applications. Retrieving relevant documents from massive datasets is challenging for a search engine, especially when it is faced with verbose or tail queries. In this paper, we explore a vector space search framework for document retrieval. Specifically, we trained a deep semantic matching model so that each query and document can be encoded as a low-dimensional embedding. Our model is based on the BERT architecture. We deployed a fast k-nearest-neighbor index service for online serving. Both offline and online metrics demonstrate that our method considerably improved retrieval performance and search quality, particularly for tail
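
The retrieval step described in the abstract (encode queries and documents as embeddings, then look up nearest neighbors) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the hard-coded vectors stand in for the output of the BERT-based encoder, cosine similarity is an assumed scoring function (the abstract does not name one), and the brute-force scan stands in for the fast k-nearest-neighbor index service. The names `normalize`, `top_k`, `DOC_EMBEDDINGS`, and `QUERY_EMBEDDING` are hypothetical.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k(query_vec, doc_vecs, k):
    """Return the k documents most similar to the query as (doc_id, score) pairs.

    This is an exhaustive scan over all documents; a production system would
    replace it with an approximate nearest-neighbor index.
    """
    q = normalize(query_vec)
    scored = []
    for doc_id, vec in doc_vecs.items():
        d = normalize(vec)
        score = sum(a * b for a, b in zip(q, d))  # cosine similarity
        scored.append((score, doc_id))
    best = heapq.nlargest(k, scored)  # highest scores first
    return [(doc_id, score) for score, doc_id in best]


# Toy embeddings (assumed values; a real system would compute them with a
# trained encoder).
DOC_EMBEDDINGS = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.7, 0.0],
}
QUERY_EMBEDDING = [1.0, 0.0, 0.0]

results = top_k(QUERY_EMBEDDING, DOC_EMBEDDINGS, k=2)
for doc_id, score in results:
    print(f"{doc_id}\t{score:.4f}")
```

Because documents are matched by vector proximity rather than shared terms, a query can retrieve a document with no lexical overlap, which is the property the abstract points to for verbose and tail queries.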

07/03/2020

MIRA: Leveraging Multi-Intention Co-click Information in Web-scale Document Retrieval using Deep Neural Networks

We study the problem of deep recall model in industrial web search, whic...

06/07/2021

Pre-trained Language Model for Web-scale Retrieval in Baidu Search

Retrieval is a crucial stage in web search that identifies a small set o...

11/30/2018

Learning From Weights: A Cost-Sensitive Approach For Ad Retrieval

Retrieval models such as CLSM is trained on click-through data which tre...

07/01/2021

SearchGCN: Powering Embedding Retrieval by Graph Convolution Networks for E-Commerce Search

Graph convolution networks (GCN), which recently becomes new state-of-th...

11/30/2018

Cost-sensitive Learning of Deep Semantic Models for Sponsored Ad Retrieval

This paper formulates the problem of learning a neural semantic model fo...

10/27/2020

Semantic Search in Millions of Equations

Given the increase of publications, search for relevant papers becomes t...

04/12/2018

A Capsule Network-based Embedding Model for Search Personalization

Search personalization aims to tailor search results to each specific us...