MoTiAC: Multi-Objective Actor-Critics for Real-Time Bidding

02/18/2020
by Chaoqi Yang, et al.

Online real-time bidding (RTB) is a complex auction game in which ad platforms must account for several key performance indicators (KPIs), such as revenue and return on investment (ROI). The trade-off among these competing goals needs to be balanced on a massive scale. To address this problem, we propose a multi-objective reinforcement learning algorithm, named MoTiAC, for bidding optimization with multiple goals. Specifically, instead of using a fixed linear combination of the objectives, MoTiAC computes adaptive weights over time based on how well the current state agrees with the agent's prior. In addition, we present properties of the model update and prove that Pareto optimality can be guaranteed. We demonstrate the effectiveness of our method on a real-world commercial dataset. Experiments show that the model outperforms all state-of-the-art baselines.
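The contrast between a fixed linear combination and adaptive weights can be illustrated with a minimal sketch. This is not the MoTiAC update rule, which the abstract does not specify. The function names, the shortfall-based agreement measure, the softmax, and the `temperature` parameter are all assumptions made for illustration: here an objective whose observed KPI falls short of its prior target receives more weight.

```python
import math


def fixed_scalarize(rewards, weights):
    """Fixed linear combination of per-objective rewards (the baseline approach)."""
    return sum(w * r for w, r in zip(weights, rewards))


def adaptive_weights(observed, prior, temperature=1.0):
    """Hypothetical adaptive weighting, NOT the rule from the paper.

    observed: current KPI values, one per objective (e.g. revenue, ROI)
    prior:    positive target values the agent expects for each KPI
    """
    # Relative shortfall of each KPI against its prior target (0 if the target is met).
    shortfalls = [max(0.0, (p - o) / p) for o, p in zip(observed, prior)]
    # Softmax turns the shortfalls into positive weights that sum to 1.
    exps = [math.exp(s / temperature) for s in shortfalls]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_scalarize(rewards, observed, prior, temperature=1.0):
    """Combine per-objective rewards using weights recomputed from the current state."""
    weights = adaptive_weights(observed, prior, temperature)
    return fixed_scalarize(rewards, weights)
```

With made-up numbers, calling `adaptive_weights([80.0, 1.2], [100.0, 1.0])` gives the first objective (below its target) a larger weight than the second (above its target), whereas `fixed_scalarize` applies the same weights regardless of the current state.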

research
02/08/2023

A Scale-Independent Multi-Objective Reinforcement Learning with Convergence Analysis

Many sequential decision-making problems need optimization of different ...
research
12/19/2022

Taming Lagrangian Chaos with Multi-Objective Reinforcement Learning

We consider the problem of two active particles in 2D complex flows with...
research
02/08/2023

Sample-efficient Multi-objective Molecular Optimization with GFlowNets

Many crucial scientific problems involve designing novel molecules with ...
research
06/08/2021

Multi-Agent Cooperative Bidding Games for Multi-Objective Optimization in e-Commercial Sponsored Search

Bid optimization for online advertising from single advertiser's perspec...
research
12/30/2021

MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced Active Learning

Inferring reward functions from demonstrations and pairwise preferences ...
research
07/01/2022

Multi-Objective Coordination Graphs for the Expected Scalarised Returns with Generative Flow Models

Many real-world problems contain multiple objectives and agents, where a...
research
03/11/2022

Impression Allocation and Policy Search in Display Advertising

In online display advertising, guaranteed contracts and real-time biddin...