Factoring Exogenous State for Model-Free Monte Carlo

03/28/2017
by   Sean McGregor, et al.

Policy analysts wish to visualize a range of policies for large simulator-defined Markov Decision Processes (MDPs). One visualization approach is to invoke the simulator to generate on-policy trajectories and then visualize those trajectories. When the simulator is expensive, this is not practical, and some method is required for generating trajectories for new policies without invoking the simulator. The method of Model-Free Monte Carlo (MFMC) can do this by stitching together state transitions for a new policy based on previously-sampled trajectories from other policies. This "off-policy Monte Carlo simulation" method works well when the state space has low dimension but fails as the dimension grows. This paper describes a method for factoring out some of the state and action variables so that MFMC can work in high-dimensional MDPs. The new method, MFMCi, is evaluated on a very challenging wildfire management MDP.
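
As a rough illustration of the "stitching" idea described above, the following Python sketch builds one synthetic trajectory for a new policy from a database of previously sampled transitions. It is not the authors' implementation. The function name `mfmc_trajectory`, the tuple layout `(state, action, reward, next_state)`, the scalar action, and the Euclidean distance over the joint state-action vector are all assumptions made for this example. The sketch also does not show the factoring of exogenous variables that distinguishes MFMCi, since the abstract does not give those details.

```python
import math


def mfmc_trajectory(database, policy, start_state, horizon):
    """Stitch one synthetic trajectory from stored transitions.

    database: list of (state, action, reward, next_state) tuples, where each
    state is a tuple of floats and each action is a float (assumed layout).
    policy: callable mapping a state tuple to an action.
    """
    used = set()  # each stored transition is consumed at most once
    state = start_state
    trajectory = []
    for _ in range(horizon):
        action = policy(state)
        best_index, best_dist = None, math.inf
        for i, (s, a, _reward, _next_state) in enumerate(database):
            if i in used:
                continue
            # Assumed metric: Euclidean distance over the joint
            # (state, action) vector.
            d = math.dist(state + (action,), s + (a,))
            if d < best_dist:
                best_index, best_dist = i, d
        if best_index is None:
            break  # no unused transitions remain
        used.add(best_index)
        # Append the stored transition and continue from its next state,
        # so the simulator is never invoked.
        stored = database[best_index]
        trajectory.append(stored)
        state = stored[3]
    return trajectory
```

Because every step requires a nearest-neighbor match over the full state-action vector, the quality of the matches degrades as the number of variables grows, which is the high-dimensional failure mode the abstract refers to.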


