Optimistic Policy Iteration for MDPs with Acyclic Transient State Structure

01/29/2021
by Joseph Lubars, et al.

We consider Markov Decision Processes (MDPs) in which every stationary policy induces the same graph structure for the underlying Markov chain and, further, this graph has the following property: if each recurrent class is replaced by a single node, the resulting graph is acyclic. For such MDPs, we prove convergence of the stochastic dynamics associated with a version of optimistic policy iteration (OPI), suggested in Tsitsiklis (2002), in which the values of all states visited during each iteration of OPI are updated.
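
For intuition, here is a minimal Python sketch of the kind of Monte Carlo based OPI scheme the abstract describes: simulate a trajectory under the policy that is greedy with respect to the current value estimates, then update the value of every state visited along the way with a stochastic-approximation step toward its sampled return. The function name, array layout, step size, and the toy MDP at the bottom are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def optimistic_policy_iteration(P, R, gamma, terminal, start,
                                n_iters=5000, alpha=0.05, seed=0):
    """Monte Carlo OPI sketch: greedy rollout, then update every visited state.

    P        -- transition kernel, shape (S, A, S)
    R        -- expected one-step rewards, shape (S, A)
    terminal -- boolean mask of absorbing states, shape (S,)
    start    -- index of the initial state
    """
    rng = np.random.default_rng(seed)
    S = P.shape[0]
    V = np.zeros(S)                          # current value estimates

    for _ in range(n_iters):
        # Policy that is greedy ("optimistic") w.r.t. the current estimates.
        Q = R + gamma * (P @ V)              # shape (S, A)
        pi = Q.argmax(axis=1)

        # Simulate one trajectory; the transient, acyclic structure assumed
        # in the abstract ensures a terminal state is reached w.p. 1.
        traj, rewards = [], []
        s = start
        while not terminal[s]:
            a = pi[s]
            traj.append(s)
            rewards.append(R[s, a])          # expected reward as the sample
            s = rng.choice(S, p=P[s, a])

        # Update the value of *every* state visited during this iteration,
        # moving each estimate toward its sampled return.
        G = 0.0
        for s, r in zip(reversed(traj), reversed(rewards)):
            G = r + gamma * G
            V[s] += alpha * (G - V[s])       # stochastic-approximation step

    return V

if __name__ == "__main__":
    # Toy 4-state chain (illustrative only; it need not satisfy all of the
    # paper's assumptions): states 0-2 are transient, state 3 is absorbing.
    S, A = 4, 2
    P = np.zeros((S, A, S))
    R = np.zeros((S, A))
    for s in range(3):
        P[s, 0, s + 1] = 1.0                 # action 0: one step right, reward 1
        R[s, 0] = 1.0
        P[s, 1, min(s + 2, 3)] = 1.0         # action 1: skip ahead, reward 0
    P[3, :, 3] = 1.0                         # terminal self-loop
    terminal = np.array([False, False, False, True])
    V = optimistic_policy_iteration(P, R, gamma=0.9,
                                    terminal=terminal, start=0)
    print(np.round(V, 2))                    # approx [2.71, 1.9, 1.0, 0.0]
```

The distinguishing feature, per the abstract, is that the sampled returns of all visited states are fed back into the value estimates on every iteration, rather than only the value of the initial state.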
