Optimizing AD Pruning of Sponsored Search with Reinforcement Learning

by Yijiang Lian, et al.

An industrial sponsored search system (SSS) can be logically divided into three modules: keyword matching, ad retrieval, and ranking. During ad retrieval, the number of ad candidates grows exponentially. A query with high commercial value might retrieve more ad candidates than the ranking module can afford to process. Because latency and computing resources are limited, the candidates have to be pruned earlier. Suppose we set a pruning line that cuts the SSS into two parts: upstream and downstream. The problem we address is how to pick the best K items from the N candidates provided by the upstream so as to maximize the total revenue of the system. Since the industrial downstream is very complicated and updated quickly, a crucial restriction in this problem is that the selection scheme must adapt to the downstream. In this paper, we propose a novel model-free reinforcement learning approach to this problem. Our approach treats the downstream as a black-box environment: the agent sequentially selects items and finally feeds them into the downstream, where the revenue is estimated and used as a reward to improve the selection policy. To the best of our knowledge, this is the first work to consider system optimization from a downstream adaptation view, and the first to use reinforcement learning techniques to tackle this problem. The idea has been successfully realized in Baidu's sponsored search system, and a long-running online A/B test shows remarkable improvements in revenue.
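The selection loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a linear scoring policy, a plain REINFORCE update with a running-average baseline, and a toy `downstream_revenue` function standing in for the real black-box downstream. All function names, feature vectors, and hyperparameters here are hypothetical.

```python
import math
import random


def select_k(weights, features, k, rng):
    """Sequentially sample k distinct items with a softmax over the remaining ones.

    Returns the picked indices and the gradient of the log-probability of the
    whole selection with respect to the (assumed linear) policy weights.
    """
    remaining = list(range(len(features)))
    picks = []
    grad = [0.0] * len(weights)
    for _ in range(k):
        scores = [sum(w * x for w, x in zip(weights, features[i])) for i in remaining]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        probs = [e / total for e in exps]
        pos = rng.choices(range(len(remaining)), weights=probs)[0]
        chosen = remaining[pos]
        # Gradient of log-softmax for a linear score: x_chosen - E_p[x]
        for d in range(len(weights)):
            expected = sum(p * features[i][d] for p, i in zip(probs, remaining))
            grad[d] += features[chosen][d] - expected
        picks.append(chosen)
        remaining.pop(pos)
    return picks, grad


def downstream_revenue(picks, hidden_value):
    """Hypothetical black box: the agent only observes the returned revenue."""
    return sum(hidden_value[i] for i in picks)


def train(features, hidden_value, k, steps=2000, lr=0.05, seed=0):
    """REINFORCE with a running-average baseline (an assumed, simple choice)."""
    rng = random.Random(seed)
    weights = [0.0] * len(features[0])
    baseline = 0.0
    for _ in range(steps):
        picks, grad = select_k(weights, features, k, rng)
        reward = downstream_revenue(picks, hidden_value)
        advantage = reward - baseline
        baseline += 0.05 * advantage
        weights = [w + lr * advantage * g for w, g in zip(weights, grad)]
    return weights
```

Calling `train(features, hidden_value, k=2)` on a small candidate list returns policy weights that can then be used to score and prune new candidates. Because the policy only sees the reward, the toy downstream can be swapped for any other revenue estimator without changing the agent, which is the adaptation property the abstract emphasizes.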




EENMF: An End-to-End Neural Matching Framework for E-Commerce Sponsored Search

E-commerce sponsored search contributes an important part of revenue for...

Optimizing Sponsored Search Ranking Strategy by Deep Reinforcement Learning

Sponsored search is an indispensable business model and a major revenue ...

Search and Score-Based Waterfall Auction Optimization

Online advertising is a major source of income for many online companies...

Generator and Critic: A Deep Reinforcement Learning Approach for Slate Re-ranking in E-commerce

The slate re-ranking problem considers the mutual influences between ite...

Reinforcement Learning with Depreciating Assets

A basic assumption of traditional reinforcement learning is that the val...

Metareasoning in Modular Software Systems: On-the-Fly Configuration using Reinforcement Learning with Rich Contextual Representations

Assemblies of modular subsystems are being pressed into service to perfo...

Learning-To-Ensemble by Contextual Rank Aggregation in E-Commerce

Ensemble models in E-commerce combine predictions from multiple sub-mode...
