Of Cores: A Partial-Exploration Framework for Markov Decision Processes

06/17/2019
by Jan Křetínský, et al.

We introduce a framework for approximate analysis of Markov decision processes (MDPs) with bounded-, unbounded-, and infinite-horizon properties. The main idea is to identify a "core" of an MDP, i.e., a subsystem in which the process provably remains with high probability, and to avoid computation on the less relevant rest of the state space. Although we identify the core using simulations and statistical techniques, it allows for rigorous error bounds in the analysis. Consequently, we obtain efficient analysis algorithms based on partial exploration for various settings, including the challenging case of strongly connected systems.
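To make the idea concrete, the following is a minimal sketch of how a core might be estimated by simulation. It is not the paper's algorithm: the toy MDP, the `transitions` table, and the `estimate_core` helper are all hypothetical, and the frequency threshold used here is a naive heuristic rather than the statistically rigorous bound the paper derives.

```python
import random
from collections import Counter

# Hypothetical toy MDP (one action per state): transitions[s] is a list of
# (successor, probability) pairs. The "rare" state models the less relevant
# part of the state space that a core would exclude.
transitions = {
    "s0":   [("s0", 0.60), ("s1", 0.39), ("rare", 0.01)],
    "s1":   [("s0", 0.50), ("s1", 0.49), ("rare", 0.01)],
    "rare": [("rare", 1.00)],
}

def simulate(start, steps, rng):
    """Sample one path of the given length; return the set of states visited."""
    state, visited = start, {start}
    for _ in range(steps):
        r, acc = rng.random(), 0.0
        for successor, prob in transitions[state]:
            acc += prob
            if r < acc:
                state = successor
                break
        visited.add(state)
    return visited

def estimate_core(start, runs, steps, threshold, seed=0):
    """Keep the states visited in at least `threshold` of the simulated runs."""
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(runs):
        visits.update(simulate(start, steps, rng))
    return {s for s, count in visits.items() if count / runs >= threshold}

core = estimate_core("s0", runs=500, steps=20, threshold=0.5)
```

Here `s0` and `s1` end up in the estimated core while `rare` is pruned; the paper's contribution is to replace the ad-hoc threshold with statistical techniques that yield provable bounds on the probability mass escaping the core.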


Related research

Finite Horizon Q-learning: Stability, Convergence and Simulations (10/27/2021)
Q-learning is a popular reinforcement learning algorithm. This algorithm...

Refined Analysis of FPL for Adversarial Markov Decision Processes (08/21/2020)
We consider the adversarial Markov Decision Process (MDP) problem, where...

Bounded Optimal Exploration in MDP (04/05/2016)
Within the framework of probably approximately correct Markov decision p...

On Incorporating Forecasts into Linear State Space Model Markov Decision Processes (03/12/2021)
Weather forecast information will very likely find increasing applicatio...

Continuous-Time Markov Decisions based on Partial Exploration (07/25/2018)
We provide a framework for speeding up algorithms for time-bounded reach...

Improved Analysis of UCRL2 with Empirical Bernstein Inequality (07/10/2020)
We consider the problem of exploration-exploitation in communicating Mar...

Approximation of Lorenz-Optimal Solutions in Multiobjective Markov Decision Processes (09/26/2013)
This paper is devoted to fair optimization in Multiobjective Markov Deci...
