Hybrid Transfer in Deep Reinforcement Learning for Ads Allocation

04/02/2022
by   Guogang Liao, et al.

Ads allocation, which assigns ads and organic items to a limited number of slots in a feed in order to maximize platform revenue, has become a popular problem. However, e-commerce platforms usually have multiple entrances for different categories, and some entrances receive few visits. The data accumulated on these entrances is rarely enough to train a good agent. To address this challenge, we present Similarity-based Hybrid Transfer for Ads Allocation (SHTAA), which transfers both samples and knowledge from a data-rich entrance to data-poor entrances. Specifically, we define an uncertainty-aware Markov Decision Process (MDP) similarity that estimates how similar the MDPs of different entrances are. Based on this similarity, we design a hybrid transfer method, consisting of instance transfer and strategy transfer, to efficiently transfer samples and knowledge from one entrance to another. Both offline and online experiments on the Meituan food delivery platform demonstrate that our method helps learn a better agent for a data-poor entrance and increases platform revenue.
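
The abstract does not give the formula for the uncertainty-aware similarity or the transfer rules, so the following is only a rough sketch of what similarity-weighted instance transfer could look like, not the paper's formulation. The lower-confidence-bound score, the `kappa` parameter, the `Transition` layout, and both function names are assumptions made for illustration. Strategy transfer is not covered.

```python
from typing import List, Tuple

# Hypothetical transition layout: (state, action, reward, next_state).
Transition = Tuple[str, int, float, str]


def uncertainty_aware_similarity(mean_similarity: float,
                                 std: float,
                                 kappa: float = 1.0) -> float:
    """Assumed form: penalize the similarity estimate by its uncertainty.

    A lower-confidence-bound score clipped to [0, 1]. The paper's actual
    definition may differ.
    """
    return max(0.0, min(1.0, mean_similarity - kappa * std))


def weight_samples(target: List[Transition],
                   source: List[Transition],
                   similarity: float) -> List[Tuple[Transition, float]]:
    """Attach a training weight to every sample.

    Samples from the data-poor (target) entrance keep weight 1.0. Samples
    from the data-rich (source) entrance are down-weighted by the
    similarity score and dropped entirely when the score is 0.
    """
    weighted = [(t, 1.0) for t in target]
    if similarity > 0.0:
        weighted += [(s, similarity) for s in source]
    return weighted


if __name__ == "__main__":
    target = [("s0", 1, 0.5, "s1")]
    source = [("s0", 0, 0.2, "s1"), ("s1", 1, 0.9, "s2")]
    score = uncertainty_aware_similarity(mean_similarity=0.8, std=0.1)
    for transition, weight in weight_samples(target, source, score):
        print(transition, round(weight, 3))
```

Under this sketch, an entrance whose similarity estimate is high but very uncertain contributes little, which is one plausible reading of "uncertainty-aware".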

research
04/02/2022

Learning List-wise Representation in Reinforcement Learning for Ads Allocation with Multiple Auxiliary Tasks

With the recent prevalence of reinforcement learning (RL), there have be...
research
12/05/2019

Dynamic Pricing on E-commerce Platform with Deep Reinforcement Learning

In this paper we present an end-to-end framework for addressing the prob...
research
04/17/2023

MDDL: A Framework for Reinforcement Learning-based Position Allocation in Multi-Channel Feed

Nowadays, the mainstream approach in position allocation system is to ut...
research
08/25/2017

A deep reinforcement learning framework for allocating buyer impressions in e-commerce websites

We study the problem of allocating impressions to sellers in e-commerce ...
research
08/25/2017

Reinforcement Mechanism Design for e-commerce

We study the problem of allocating impressions to sellers in e-commerce ...
research
03/07/2019

Can Sophisticated Dispatching Strategy Acquired by Reinforcement Learning? - A Case Study in Dynamic Courier Dispatching System

In this paper, we study a courier dispatching problem (CDP) raised from ...
research
04/01/2022

Deep Page-Level Interest Network in Reinforcement Learning for Ads Allocation

A mixed list of ads and organic items is usually displayed in feed and h...
