Metareasoning for Planning Under Uncertainty

05/03/2015
by Christopher H. Lin, et al.

The conventional model for online planning under uncertainty assumes that an agent can stop and plan without incurring costs for the time spent planning. However, planning time is not free in most real-world settings. For example, an autonomous drone is subject to nature's forces, like gravity, even while it thinks, and must either pay a price for counteracting these forces to stay in place, or grapple with the state change caused by acquiescing to them. Policy optimization in these settings requires metareasoning---a process that trades off the cost of planning and the potential policy improvement that can be achieved. We formalize and analyze the metareasoning problem for Markov Decision Processes (MDPs). Our work subsumes previously studied special cases of metareasoning and shows that in the general case, metareasoning is at most polynomially harder than solving MDPs with any given algorithm that disregards the cost of thinking. For reasons we discuss, optimal general metareasoning turns out to be impractical, motivating approximations. We present approximate metareasoning procedures which rely on special properties of the BRTDP planning algorithm and explore the effectiveness of our methods on a variety of problems.
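To make the planning/acting trade-off concrete, here is a minimal illustrative sketch (not the paper's BRTDP-based method; the MDP, names, and `THINK_COST` value are all invented for illustration): a metareasoning loop around value iteration on a toy MDP that keeps planning only while the estimated value improvement from one more sweep exceeds the cost of the time that sweep consumes.

```python
# Toy 3-state MDP: states 0..2, actions 0..1.
# P[s][a] = list of (next_state, prob); R[s][a] = immediate reward.
P = {
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(1, 1.0)]},
    1: {0: [(1, 0.5), (2, 0.5)], 1: [(0, 1.0)]},
    2: {0: [(2, 1.0)],           1: [(1, 1.0)]},
}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 2.0, 1: 0.0}, 2: {0: 5.0, 1: 1.0}}
GAMMA = 0.9
THINK_COST = 0.01  # assumed utility cost charged per planning sweep

def sweep(V):
    """One Bellman backup over all states; returns the new value table and
    the largest single-state improvement, used here as a crude proxy for
    the value of further computation."""
    newV = {}
    for s in P:
        newV[s] = max(
            R[s][a] + GAMMA * sum(p * V[ns] for ns, p in P[s][a])
            for a in P[s]
        )
    delta = max(abs(newV[s] - V[s]) for s in P)
    return newV, delta

def metareason(V=None, max_sweeps=1000):
    """Plan only while the estimated improvement outweighs the cost of
    thinking; then stop and act with the current value estimates."""
    V = V or {s: 0.0 for s in P}
    for _ in range(max_sweeps):
        newV, delta = sweep(V)
        if delta <= THINK_COST:  # further planning is not worth its cost
            break
        V = newV
    return V
```

The stopping rule is the essence of the trade-off the abstract describes: each sweep buys some expected policy improvement but also consumes time, so planning halts once the marginal improvement estimate drops below the per-sweep cost.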


