Learning to Collaborate: Multi-Scenario Ranking via Multi-Agent Reinforcement Learning

09/17/2018
by Jun Feng, et al.

Ranking is a fundamental and widely studied problem in scenarios such as search, advertising, and recommendation. However, joint optimization for multi-scenario ranking, which aims to improve the overall performance of several ranking strategies in different scenarios, remains largely unexplored. Optimizing each strategy separately has two limitations. The first is a lack of collaboration between scenarios: each strategy maximizes its own objective but ignores the goals of the other strategies, which leads to sub-optimal overall performance. The second is the inability to model the correlation between scenarios: independent optimization in one scenario uses only that scenario's user data and ignores the context from other scenarios. In this paper, we formulate multi-scenario ranking as a fully cooperative, partially observable, multi-agent sequential decision problem. We propose a novel model named Multi-Agent Recurrent Deterministic Policy Gradient (MA-RDPG), which has a communication component for passing messages, several private actors (agents) that take ranking actions, and a centralized critic that evaluates the overall performance of the co-working actors. Each scenario is treated as an agent (actor). Agents collaborate by sharing a global action-value function (the critic) and by passing messages that encode historical information across scenarios. The model is evaluated online on a large e-commerce platform. Results show that the proposed model achieves significant improvements over baselines in overall performance.
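
The three components named in the abstract (a communication component, private actors, and a centralized critic) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation. The class names, the dimensions, the scenario names "search" and "recommendation", the random untrained weights, and the use of a plain tanh recurrence in place of whatever recurrent unit the paper uses are all assumptions made for illustration, and no training step is shown.

```python
import math
import random


def init(rows, cols, rng):
    """Small random weight matrix (hypothetical, untrained)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, x):
    """Multiply a weight matrix (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class Communication:
    """Folds each (observation, action) pair into a shared message vector.

    Assumption: a simple tanh recurrence stands in for the recurrent unit.
    """

    def __init__(self, msg_dim, obs_dim, act_dim, rng):
        self.w = init(msg_dim, msg_dim + obs_dim + act_dim, rng)

    def update(self, message, obs, action):
        return [math.tanh(v) for v in linear(self.w, message + obs + action)]


class Actor:
    """Private deterministic policy: (message, local observation) -> action."""

    def __init__(self, msg_dim, obs_dim, act_dim, rng):
        self.w = init(act_dim, msg_dim + obs_dim, rng)

    def act(self, message, obs):
        return [math.tanh(v) for v in linear(self.w, message + obs)]


class CentralCritic:
    """Single action-value estimate shared by all actors."""

    def __init__(self, msg_dim, obs_dim, act_dim, rng):
        self.w = init(1, msg_dim + obs_dim + act_dim, rng)

    def q(self, message, obs, action):
        return linear(self.w, message + obs + action)[0]


def run_episode(scenarios, observations, actors, comm, critic, msg_dim):
    """Walk one user session: the active scenario's actor acts on the
    current message, the shared critic scores it, then the message is updated."""
    message = [0.0] * msg_dim
    trace = []
    for scenario, obs in zip(scenarios, observations):
        action = actors[scenario].act(message, obs)
        trace.append((scenario, action, critic.q(message, obs, action)))
        message = comm.update(message, obs, action)
    return message, trace


# Hypothetical usage: a session that moves between two scenarios.
rng = random.Random(0)
MSG_DIM, OBS_DIM, ACT_DIM = 4, 3, 2
actors = {
    "search": Actor(MSG_DIM, OBS_DIM, ACT_DIM, rng),
    "recommendation": Actor(MSG_DIM, OBS_DIM, ACT_DIM, rng),
}
comm = Communication(MSG_DIM, OBS_DIM, ACT_DIM, rng)
critic = CentralCritic(MSG_DIM, OBS_DIM, ACT_DIM, rng)
session = ["search", "recommendation", "search"]
observations = [[1.0, 0.5, -0.2], [0.3, 0.8, 0.1], [-0.4, 0.2, 0.9]]
final_message, trace = run_episode(
    session, observations, actors, comm, critic, MSG_DIM
)
```

In this sketch each actor sees only its own observation plus the shared message, which is how context from one scenario can reach another, while the single critic scores every action against one common objective.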


research
09/13/2018

CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning

We propose CM3, a new deep reinforcement learning method for cooperative...
research
06/29/2023

Multi-Scenario Ranking with Adaptive Feature Learning

Recently, Multi-Scenario Learning (MSL) is widely used in recommendation...
research
02/18/2019

Message-Dropout: An Efficient Training Method for Multi-Agent Deep Reinforcement Learning

In this paper, we propose a new learning technique named message-dropout...
research
02/11/2019

Model-Based Reinforcement Learning for Whole-Chain Recommendations

With the recent prevalence of Reinforcement Learning (RL), there have be...
research
11/22/2021

Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification

The idea of conservatism has led to significant progress in offline rein...
research
08/21/2020

Learning to Collaborate in Multi-Module Recommendation via Multi-Agent Reinforcement Learning without Communication

With the rise of online e-commerce platforms, more and more customers pr...
research
09/25/2019

α^α-Rank: Scalable Multi-agent Evaluation through Evolution

Although challenging, strategy profile evaluation in large connected lea...
