Performance Guarantees for Homomorphisms Beyond Markov Decision Processes

11/09/2018
by Sultan Javed Majeed, et al.

Most real-world problems have huge state and/or action spaces, so a naive application of existing tabular solution methods is not tractable on them. Nonetheless, these methods remain useful if an agent has access to a relatively small state-action space homomorphism of the true environment and the map guarantees near-optimal performance. Much existing research focuses on the case where the homomorphism is a Markovian representation of the underlying process. However, we show that near-optimal performance is sometimes guaranteed even if the homomorphism is non-Markovian. Moreover, by lifting the Markovian requirement we can aggregate significantly more states without compromising performance. In this work, we extend the Extreme State Aggregation (ESA) framework to joint state-action aggregations. We also lift the policy uniformity condition for aggregation in ESA, which allows even coarser modeling of the true environment.
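
To make the idea of solving a small aggregated problem and lifting the result back concrete, here is a minimal Python sketch. It is not the paper's method: it only illustrates plain state aggregation in the simple exact (Markovian) case, on a hypothetical 4-state, 2-action environment invented for this example. The map `phi`, the uniform weighting inside each block, and the helper names `aggregate_mdp` and `value_iteration` are all assumptions, and the sketch does not cover joint state-action aggregation, the non-Markovian case, or the ESA conditions discussed in the abstract.

```python
import numpy as np


def aggregate_mdp(P, R, phi, n_abstract):
    """Build a small MDP over abstract states from a ground MDP.

    P: (S, A, S) transition probabilities, R: (S, A) rewards,
    phi: (S,) index of the abstract state each ground state maps to.
    Assumption: ground states in a block are weighted uniformly, and
    every abstract state has at least one member.
    """
    S = P.shape[0]
    M = np.zeros((S, n_abstract))
    M[np.arange(S), phi] = 1.0               # membership: M[s, x] = 1 iff phi[s] == x
    W = M / M.sum(axis=0, keepdims=True)     # uniform weights, columns sum to 1
    # P_abs[x, a, y] = sum_s W[s, x] * sum_t P[s, a, t] * M[t, y]
    P_abs = np.einsum("sx,sat,ty->xay", W, P, M)
    R_abs = np.einsum("sx,sa->xa", W, R)
    return P_abs, R_abs


def value_iteration(P, R, gamma=0.9, iters=500):
    """Tabular value iteration; returns a greedy policy and its values."""
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        Q = R + gamma * (P @ V)              # (S, A, S) @ (S,) -> (S, A)
        V = Q.max(axis=1)
    return Q.argmax(axis=1), V


# Hypothetical ground environment: states {0, 1} form block 0, {2, 3} form block 1.
# Action 0 stays inside the current block, action 1 jumps to the other block.
P = np.zeros((4, 2, 4))
P[0, 0, 1] = P[1, 0, 0] = P[2, 0, 3] = P[3, 0, 2] = 1.0
P[0, 1, 2] = P[1, 1, 3] = P[2, 1, 0] = P[3, 1, 1] = 1.0
R = np.zeros((4, 2))
R[2, 0] = R[3, 0] = 1.0                      # reward only for staying in block 1

phi = np.array([0, 0, 1, 1])                 # assumed aggregation map
P_abs, R_abs = aggregate_mdp(P, R, phi, n_abstract=2)

pi_abs, V_abs = value_iteration(P_abs, R_abs)   # solve the 2-state problem
pi_lifted = pi_abs[phi]                         # lift the policy to ground states
pi_ground, V_ground = value_iteration(P, R)     # solve the 4-state problem directly
```

In this toy case the lifted policy matches the policy obtained by solving the ground problem directly, because the two states in each block behave identically. The abstract's claim concerns the harder setting where the aggregated process is not Markovian, which this sketch does not attempt to reproduce.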


research
02/27/2019

Learning Factored Markov Decision Processes with Unawareness

Methods for learning and planning in sequential decision problems often ...
research
10/06/2018

Bayes-CPACE: PAC Optimal Exploration in Continuous Space Bayes-Adaptive Markov Decision Processes

We present the first PAC optimal algorithm for Bayes-Adaptive Markov Dec...
research
08/31/2021

Approximation Methods for Partially Observed Markov Decision Processes (POMDPs)

POMDPs are useful models for systems where the true underlying state is ...
research
11/01/2021

Decentralized Cooperative Reinforcement Learning with Hierarchical Information Structure

Multi-agent reinforcement learning (MARL) problems are challenging due t...
research
02/25/2020

Near Optimal Task Graph Scheduling with Priced Timed Automata and Priced Timed Markov Decision Processes

Task graph scheduling is a relevant problem in computer science with app...
research
12/13/2018

Toward incremental FIB aggregation with quick selections (FAQS)

Several approaches to mitigating the Forwarding Information Base (FIB) o...
research
02/10/2016

Iterative Hierarchical Optimization for Misspecified Problems (IHOMP)

For complex, high-dimensional Markov Decision Processes (MDPs), it may b...
