MDDL: A Framework for Reinforcement Learning-based Position Allocation in Multi-Channel Feed

04/17/2023
by Xiaowen Shi, et al.

The mainstream approach in position allocation systems today is to use a reinforcement learning (RL) model to allocate appropriate positions to items from various channels and then mix them into the feed. Two types of data are used to train RL models for position allocation: strategy data and random data. Strategy data is collected from the current online model and suffers from an imbalanced distribution of state-action pairs, which results in severe overestimation problems during training. Random data, on the other hand, offers a more uniform distribution of state-action pairs but is difficult to obtain in industrial scenarios, because random exploration can hurt platform revenue and user experience. Because the two types of data follow different distributions, designing a strategy that leverages both to improve RL model training is highly challenging. In this study, we propose a framework named Multi-Distribution Data Learning (MDDL) for training RL models on mixed multi-distribution data. Specifically, MDDL incorporates a novel imitation learning signal to mitigate overestimation problems in strategy data and maximizes the RL signal for random data to facilitate effective learning. We evaluated MDDL in a real-world position allocation system and demonstrated superior performance compared to the previous baseline. MDDL has been fully deployed on the Meituan food delivery platform and currently serves over 300 million users.
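The abstract does not give MDDL's loss function, so the following is only a minimal sketch of the general idea of treating samples differently depending on which distribution they come from. It assumes a discrete action space, a standard one-step Q-learning target as the RL signal, and a softmax cross-entropy toward the logged action as a stand-in imitation signal. The names `td_loss`, `imitation_loss`, `mixed_loss`, the `source` field, and `imitation_weight` are all hypothetical and are not taken from the paper.

```python
import math


def td_loss(q_values, action, reward, next_q_values, gamma=0.9):
    """Squared one-step TD error for one transition (standard Q-learning target)."""
    target = reward + gamma * max(next_q_values)
    return (q_values[action] - target) ** 2


def imitation_loss(q_values, action):
    """Cross-entropy between softmax(Q) and the logged action.

    Assumption: a stand-in imitation signal, not the one defined in the paper.
    """
    m = max(q_values)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(q - m) for q in q_values))
    return log_z - q_values[action]


def mixed_loss(sample, imitation_weight=1.0):
    """Per-sample loss that depends on the sample's data source.

    Random data uses the RL (TD) signal only; strategy data adds the
    imitation term. This split is an illustrative assumption.
    """
    loss = td_loss(sample["q"], sample["action"], sample["reward"], sample["next_q"])
    if sample["source"] == "strategy":
        loss += imitation_weight * imitation_loss(sample["q"], sample["action"])
    return loss


random_sample = {"source": "random", "q": [1.0, 2.0], "action": 0,
                 "reward": 1.0, "next_q": [0.0, 0.0]}
strategy_sample = dict(random_sample, source="strategy")

print(mixed_loss(random_sample))
print(mixed_loss(strategy_sample))
```

In this toy setup the two samples are identical except for their source, so any difference between the two printed losses comes entirely from the imitation term applied to the strategy sample.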


