Retrieve Synonymous keywords for Frequent Queries in Sponsored Search in a Data Augmentation Way

08/05/2020
by Yijiang Lian, et al.

In sponsored search, retrieving synonymous keywords is of great importance for accurately targeted advertising. Two major challenges in this task are the semantic gap between queries and keywords and the extremely high precision requirement (>= 95%). To the best of our knowledge, the problem has not been openly discussed. In an industrial sponsored search system, keywords for frequent queries are usually retrieved ahead of time and stored in a lookup table. Treating these results as a seed dataset, we propose a data-augmentation-like framework to improve synonymous retrieval performance for these frequent queries. The framework comprises two steps: translation-based retrieval and discriminant-based filtering. First, we devise a Trie-based translation model to generate additional data. In this phase, we apply a Bag-of-Core-Words trick, which increases the volume of the generated data 4.2 times while keeping the original precision. Then we use a BERT-based discriminant model to filter out nonsynonymous pairs; it outperforms the traditional feature-driven GBDT model by 11% absolute AUC. This method has been successfully applied to Baidu's sponsored search system, where it has yielded a significant improvement in revenue. In addition, a commercial Chinese dataset containing 500K synonymous pairs with a precision of 95% is released to the public for paraphrase study (http://ai.baidu.com/broad/subordinate?dataset=paraphrasing).
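The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the translation model, the Bag-of-Core-Words trick, or the BERT classifier, so those are replaced here by plain callables. The names `KeywordTrie`, `constrained_candidates`, `filter_synonyms`, `step_score`, `synonym_prob`, and the `threshold` value are all hypothetical. The sketch only shows the general idea of restricting generation to a closed keyword set with a trie and then filtering the candidates with a separate scorer.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Tokens = List[str]


class KeywordTrie:
    """Prefix tree over tokenized keywords (hypothetical helper, not from the paper)."""

    END = "<end>"  # marks that a complete keyword ends here; assumed not to be a real token

    def __init__(self, keywords: Iterable[Tokens]) -> None:
        self.root: Dict[str, dict] = {}
        for tokens in keywords:
            node = self.root
            for tok in tokens:
                node = node.setdefault(tok, {})
            node[self.END] = {}

    def allowed_next(self, prefix: Tokens) -> List[str]:
        """Tokens that can follow `prefix` without leaving the keyword set."""
        node = self.root
        for tok in prefix:
            if tok not in node:
                return []
            node = node[tok]
        return sorted(node)


def constrained_candidates(
    trie: KeywordTrie,
    step_score: Callable[[Tokens, str], float],  # stand-in for a translation model
    beam_size: int = 2,
    max_len: int = 8,
) -> List[Tuple[Tokens, float]]:
    """Beam search that only follows trie paths, so every output is a known keyword."""
    beams: List[Tuple[Tokens, float]] = [([], 0.0)]
    finished: List[Tuple[Tokens, float]] = []
    for _ in range(max_len + 1):
        expanded: List[Tuple[Tokens, float]] = []
        for prefix, score in beams:
            for tok in trie.allowed_next(prefix):
                if tok == KeywordTrie.END:
                    finished.append((prefix, score))
                else:
                    expanded.append((prefix + [tok], score + step_score(prefix, tok)))
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = expanded[:beam_size]
        if not beams:
            break
    return sorted(finished, key=lambda item: item[1], reverse=True)


def filter_synonyms(
    query: Tokens,
    candidates: List[Tokens],
    synonym_prob: Callable[[Tokens, Tokens], float],  # stand-in for a discriminant model
    threshold: float = 0.5,  # placeholder; would be tuned against the precision target
) -> List[Tokens]:
    """Keep only candidates the scorer judges synonymous with the query."""
    return [kw for kw in candidates if synonym_prob(query, kw) >= threshold]


if __name__ == "__main__":
    toy_keywords = [["cheap", "flights"], ["cheap", "hotels"], ["flight", "tickets"]]
    toy_query = ["cheap", "flight"]
    toy_trie = KeywordTrie(toy_keywords)
    # Toy scorer: reward tokens that also appear in the query.
    found = constrained_candidates(
        toy_trie, lambda prefix, tok: 1.0 if tok in toy_query else 0.0
    )
    print(found)
```

In this sketch the trie guarantees that every generated candidate is an existing keyword, which is why precision then depends only on the filtering step; the scoring callables are where real models would be plugged in.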


research
02/21/2021

A Concept Knowledge-Driven Keywords Retrieval Framework for Sponsored Search

In sponsored search, retrieving synonymous keywords for exact match type...
research
06/07/2021

Diversity driven Query Rewriting in Search Advertising

Retrieving keywords (bidwords) with the same intent as query, referred t...
research
02/02/2019

An end-to-end Generative Retrieval Method for Sponsored Search Engine --Decoding Efficiently into a Closed Target Domain

In this paper, we present a generative retrieval method for sponsored se...
research
10/28/2019

RPM-Oriented Query Rewriting Framework for E-commerce Keyword-Based Sponsored Search

Sponsored search optimizes revenue and relevance, which is estimated by ...
research
09/13/2022

HEARTS: Multi-task Fusion of Dense Retrieval and Non-autoregressive Generation for Sponsored Search

Matching user search queries with relevant keywords bid by advertisers i...
research
04/10/2023

LADER: Log-Augmented DEnse Retrieval for Biomedical Literature Search

Queries with similar information needs tend to have similar document cli...
research
01/03/2019

Dataset search: a survey

Generating value from data requires the ability to find, access and make...
