Learning Best Response Strategies for Agents in Ad Exchanges

02/10/2019
by Stavros Gerakaris, et al.

Ad exchanges are widely used in platforms for online display advertising. Autonomous agents operating in these exchanges must learn policies for interacting profitably with a diverse, continually changing, but unknown market. We consider this problem from the perspective of a publisher that interacts strategically with an advertiser through a posted price mechanism. The learning problem for this agent is made difficult by the fact that information is censored, i.e., the publisher observes whether an impression is sold but receives no other quantitative information. We address this problem using the Harsanyi-Bellman Ad Hoc Coordination (HBA) algorithm, which conceptualises the interaction as a Stochastic Bayesian Game and arrives at optimal actions by best responding with respect to probabilistic beliefs maintained over a candidate set of opponent behaviour profiles. We adapt and apply HBA to the censored information setting of ad exchanges. To handle stochastic opponents, we also devise an opponent-modelling strategy based on a Kaplan-Meier estimator. We evaluate the proposed method, HBA-KM, in simulations, which show that it achieves a substantially better competitive ratio and lower variance of return than baselines, including a Q-learning agent and a UCB-based online learning agent, and performance comparable to the offline optimal algorithm.
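
The belief-based best response described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a myopic, one-step simplification (full HBA plans over the stochastic game), and the opponent types, acceptance probabilities, price grid, and the names `threshold_type`, `update_beliefs`, and `best_response_price` are all assumptions made for illustration. The publisher observes only whether the impression sold at the posted price, updates a posterior over candidate advertiser types by Bayes' rule, and posts the price with the highest expected revenue under that posterior.

```python
from typing import Callable, Dict, List

# A hypothetical opponent type maps a posted price to the probability
# that the advertiser buys the impression at that price.
AcceptModel = Callable[[float], float]


def threshold_type(threshold: float, noise: float = 0.1) -> AcceptModel:
    """Illustrative type: accepts with prob. 1 - noise at or below the threshold."""
    def accept_prob(price: float) -> float:
        return 1.0 - noise if price <= threshold else noise
    return accept_prob


def update_beliefs(
    beliefs: Dict[str, float],
    types: Dict[str, AcceptModel],
    price: float,
    sold: bool,
) -> Dict[str, float]:
    """Bayes update from the censored signal (sold / not sold) only."""
    unnormalised = {}
    for name, prior in beliefs.items():
        p_accept = types[name](price)
        likelihood = p_accept if sold else 1.0 - p_accept
        unnormalised[name] = prior * likelihood
    total = sum(unnormalised.values())
    if total == 0.0:
        # No type explains the observation; keep the prior unchanged.
        return dict(beliefs)
    return {name: w / total for name, w in unnormalised.items()}


def best_response_price(
    beliefs: Dict[str, float],
    types: Dict[str, AcceptModel],
    prices: List[float],
) -> float:
    """One-step best response: maximise expected revenue under current beliefs."""
    def expected_revenue(price: float) -> float:
        p_sale = sum(beliefs[name] * types[name](price) for name in beliefs)
        return price * p_sale
    return max(prices, key=expected_revenue)


# Assumed candidate types and price grid, for illustration only.
types = {"low": threshold_type(0.3), "high": threshold_type(0.8)}
prior = {"low": 0.5, "high": 0.5}
prices = [0.1, 0.3, 0.5, 0.8]

first_price = best_response_price(prior, types, prices)
after_no_sale = update_beliefs(prior, types, first_price, sold=False)
next_price = best_response_price(after_no_sale, types, prices)
```

In this toy setup the publisher starts by posting the highest price on the grid; an unsold impression shifts belief toward the low-valuation type and the next posted price drops accordingly. A Kaplan-Meier estimate of the acceptance curve could stand in for the fixed `AcceptModel` functions when the opponent is stochastic, but how the paper combines the two is not specified in the abstract.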


research
06/03/2015

A Game-Theoretic Model and Best-Response Learning Method for Ad Hoc Coordination in Multiagent Systems

The ad hoc coordination problem is to design an autonomous agent which i...
research
03/08/2022

On-the-fly Strategy Adaptation for ad-hoc Agent Coordination

Training agents in cooperative settings offers the promise of AI agents ...
research
04/28/2020

Generating and Adapting to Diverse Ad-Hoc Cooperation Agents in Hanabi

Hanabi is a cooperative game that brings the problem of modeling other p...
research
07/28/2022

Towards Robust Ad Hoc Teamwork Agents By Creating Diverse Training Teammates

Ad hoc teamwork (AHT) is the problem of creating an agent that must coll...
research
07/08/2019

Diverse Agents for Ad-Hoc Cooperation in Hanabi

In complex scenarios where a model of other actors is necessary to predi...
research
03/02/2021

Efficient Optimal Selection for Composited Advertising Creatives with Tree Structure

Ad creatives are one of the prominent mediums for online e-commerce adve...
research
01/10/2022

Assisting Unknown Teammates in Unknown Tasks: Ad Hoc Teamwork under Partial Observability

In this paper, we present a novel Bayesian online prediction algorithm f...
