CPS-MEBR: Click Feedback-Aware Web Page Summarization for Multi-Embedding-Based Retrieval

10/18/2022
by Wenbiao Li, et al.

Embedding-based retrieval (EBR) represents queries and documents as embeddings and converts the retrieval problem into a nearest neighbor search in the embedding space. Previous work has mainly represented each web page with a single embedding, but in real web search scenarios it is difficult to capture all the information in a long, complexly structured web page with one embedding. To address this issue, we design a click feedback-aware web page summarization framework for multi-embedding-based retrieval (CPS-MEBR), which generates multiple embeddings for each web page to match different potential queries. Specifically, we use user click data from search logs to train a summary model that extracts the sentences in a web page that users frequently click, since these sentences are more likely to answer potential queries. We also introduce sentence-level semantic interaction to design a multi-embedding-based retrieval (MEBR) model, which uses the frequently clicked sentences to generate multiple embeddings for different potential queries. Offline experiments show that the framework retrieves higher-quality candidates than a single-embedding-based retrieval (SEBR) model.
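To make the multi-embedding idea concrete, here is a minimal sketch of nearest neighbor scoring when each page carries several embeddings. It is not the authors' implementation. The abstract does not say how per-sentence scores are combined, so the max-over-embeddings rule is an assumption, as are the cosine similarity, the hand-written toy vectors, and the names `cosine`, `score_multi`, and `retrieve`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def score_multi(query_vec, page_vecs):
    """Score a page by its best-matching embedding.

    Assumption: max aggregation. The source does not specify
    how the multiple embeddings are combined at retrieval time.
    """
    return max(cosine(query_vec, v) for v in page_vecs)


def retrieve(query_vec, index, top_k=1):
    """Return the ids of the top_k pages, ranked by score_multi."""
    scored = [(score_multi(query_vec, vecs), page_id)
              for page_id, vecs in index.items()]
    # Highest score first; ties broken by page id for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [page_id for _, page_id in scored[:top_k]]


# Hypothetical toy index: page_a has two sentence embeddings
# covering different topics, page_b has a single blended one.
index = {
    "page_a": [[1.0, 0.0], [0.0, 1.0]],
    "page_b": [[0.7, 0.7]],
}

query = [0.0, 1.0]
top_pages = retrieve(query, index, top_k=1)
```

In this toy setup the query aligns exactly with one of `page_a`'s embeddings. A single averaged embedding for `page_a` would point in the same direction as `page_b`'s and lose that distinction.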

