Continuous Value Function Approximation for Sequential Bidding Policies

01/23/2013
by   Craig Boutilier, et al.
0

Market-based mechanisms such as auctions are being studied as an appropriate means for resource allocation in distributed and mulitagent decision problems. When agents value resources in combination rather than in isolation, they must often deliberate about appropriate bidding strategies for a sequence of auctions offering resources of interest. We briefly describe a discrete dynamic programming model for constructing appropriate bidding policies for resources exhibiting both complementarities and substitutability. We then introduce a continuous approximation of this model, assuming that money (or the numeraire good) is infinitely divisible. Though this has the potential to reduce the computational cost of computing policies, value functions in the transformed problem do not have a convenient closed form representation. We develop em grid-based approximation for such value functions, representing value functions using piecewise linear approximations. We show that these methods can offer significant computational savings with relatively small cost in solution quality.

READ FULL TEXT

page 1

page 2

page 3

page 4

page 5

page 6

page 9

page 10

research
01/16/2013

Policy Iteration for Factored MDPs

Many large MDPs can be represented compactly using a dynamic Bayesian ne...
research
03/28/2018

One-step dispatching policy improvement in multiple-server queueing systems with Poisson arrivals

Policy iteration techniques for multiple-server dispatching rely on the ...
research
10/31/2011

Optimal and Approximate Q-value Functions for Decentralized POMDPs

Decision-theoretic planning is a popular approach to sequential decision...
research
08/28/2015

Learning Efficient Representations for Reinforcement Learning

Markov decision processes (MDPs) are a well studied framework for solvin...
research
02/14/2012

Symbolic Dynamic Programming for Discrete and Continuous State MDPs

Many real-world decision-theoretic planning problems can be naturally mo...
research
07/11/2012

Exploiting First-Order Regression in Inductive Policy Selection

We consider the problem of computing optimal generalised policies for re...
research
05/17/2019

Non-signaling Approximations of Stochastic Team Problems

In this paper, we consider non-signaling approximation of finite stochas...

Please sign up or login with your details

Forgot password? Click here to reset