Exploiting First-Order Regression in Inductive Policy Selection

07/11/2012
by Charles Gretton, et al.

We consider the problem of computing optimal generalised policies for relational Markov decision processes. We describe an approach that combines some of the benefits of purely inductive techniques with those of symbolic dynamic programming methods. The latter reason about the optimal value function using first-order decision-theoretic regression and formula rewriting, while the former, when provided with a suitable hypotheses language, are capable of generalising value functions or policies computed for small instances. Our idea is to use reasoning, and in particular classical first-order regression, to automatically generate a hypotheses language dedicated to the domain at hand, which is then used as input by an inductive solver. This approach avoids the more complex reasoning of symbolic dynamic programming while focusing the inductive solver's attention on concepts that are specifically relevant to the optimal value function for the domain considered.
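
To make the idea of regression-generated hypotheses concrete, the following is a minimal sketch and not the authors' implementation. It assumes a heavily simplified setting: ground (propositional) STRIPS-style deterministic actions instead of first-order stochastic ones, with formulas restricted to conjunctions of atoms. The names `Action`, `regress`, `hypothesis_language`, and the toy blocks-world atoms are all hypothetical. Each formula obtained by regressing the goal through an action describes states from which the goal can be reached, and the collected formulas could serve as candidate concepts for an inductive learner.

```python
from typing import FrozenSet, Iterable, NamedTuple, Optional, Set

Formula = FrozenSet[str]  # a conjunction of ground atoms (simplifying assumption)


class Action(NamedTuple):
    name: str
    pre: Formula     # atoms that must hold before the action
    add: Formula     # atoms the action makes true
    delete: Formula  # atoms the action makes false


def regress(goal: Formula, act: Action) -> Optional[Formula]:
    """Condition that must hold before `act` so that `goal` holds after it.

    Returns None when the action deletes part of the goal or does not
    contribute any goal atom (i.e. it is irrelevant to the goal).
    """
    if goal & act.delete:
        return None
    if not goal & act.add:
        return None
    return (goal - act.add) | act.pre


def hypothesis_language(goal: Formula, actions: Iterable[Action],
                        depth: int) -> Set[Formula]:
    """Collect the goal plus all formulas reached by up to `depth` regressions."""
    actions = list(actions)
    language: Set[Formula] = {goal}
    frontier: Set[Formula] = {goal}
    for _ in range(depth):
        new_formulas: Set[Formula] = set()
        for formula in frontier:
            for act in actions:
                result = regress(formula, act)
                if result is not None and result not in language:
                    new_formulas.add(result)
        language |= new_formulas
        frontier = new_formulas
    return language


# Toy blocks-world fragment (hypothetical atoms, for illustration only).
stack_a_b = Action(
    name="stack_a_b",
    pre=frozenset({"holding_a", "clear_b"}),
    add=frozenset({"on_a_b", "clear_a", "handempty"}),
    delete=frozenset({"holding_a", "clear_b"}),
)
pickup_a = Action(
    name="pickup_a",
    pre=frozenset({"ontable_a", "clear_a", "handempty"}),
    add=frozenset({"holding_a"}),
    delete=frozenset({"ontable_a", "clear_a", "handempty"}),
)

goal = frozenset({"on_a_b"})
language = hypothesis_language(goal, [stack_a_b, pickup_a], depth=2)
for formula in sorted(language, key=len):
    print(sorted(formula))
```

In this sketch, an inductive solver would then test which of the generated formulas hold in each training state and use those truth values as features. The approach described in the abstract works at the first-order level, so its formulas generalise over objects rather than naming specific blocks.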
