Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application

03/02/2018
by Yujing Hu, et al.

In e-commerce platforms such as Amazon and TaoBao, ranking items in a search session is a typical multi-step decision-making problem. Learning to rank (LTR) methods have been widely applied to ranking problems. However, such methods often treat the ranking steps in a session as independent, although these steps may in fact be highly correlated. To better exploit the correlation between ranking steps, we propose using reinforcement learning (RL) to learn an optimal ranking policy that maximizes the expected accumulative rewards in a search session. First, we formally define the search session Markov decision process (SSMDP) to formulate the multi-step ranking problem. Second, we analyze the properties of the SSMDP and theoretically prove the necessity of maximizing accumulative rewards. Finally, we propose a novel policy gradient algorithm for learning an optimal ranking policy, which can cope with the high reward variance and unbalanced reward distribution of an SSMDP. Experiments are conducted in simulation and in the TaoBao search engine. The results demonstrate that our algorithm performs much better than online LTR methods, with more than 40% and 30% growth in total transaction amount in the simulation and the real application, respectively.
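As a rough illustration of the "accumulative rewards in a search session" objective, the sketch below sums the rewards collected over the ranking steps of one session. The function name `session_return`, the `discount` parameter, and the example reward values are assumptions made here for illustration; they are not taken from the paper, and the paper's SSMDP defines states, actions, and rewards in more detail than this toy shows.

```python
from typing import Sequence


def session_return(rewards: Sequence[float], discount: float = 1.0) -> float:
    """Accumulated reward of one search session.

    rewards[t] is the reward observed after ranking step t (hypothetical
    values, e.g. the transaction amount if a purchase happens at that step).
    discount = 1.0 gives the plain, undiscounted sum.
    """
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (discount ** step) * reward
    return total


# Hypothetical session: two page views without a purchase, then a purchase.
rewards = [0.0, 0.0, 120.0]
print(session_return(rewards))       # 120.0
print(session_return(rewards, 0.5))  # 30.0
```

A ranker that scores each step in isolation sees zero reward at the first two steps of this toy session, whereas the session-level sum credits the whole sequence of rankings that led to the purchase.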


