Performance Guarantees for Homomorphisms Beyond Markov Decision Processes

by Sultan Javed Majeed, et al.

Most real-world problems have huge state and/or action spaces, so a naive application of existing tabular solution methods is intractable. Nonetheless, these methods remain useful if an agent has access to a relatively small state-action space homomorphism of the true environment, provided the map guarantees near-optimal performance. A plethora of research focuses on the case where the homomorphism is a Markovian representation of the underlying process. However, we show that near-optimal performance is sometimes guaranteed even if the homomorphism is non-Markovian. Moreover, lifting the Markovian requirement lets us aggregate significantly more states without compromising performance. In this work, we extend the Extreme State Aggregation (ESA) framework to joint state-action aggregations. We also lift ESA's policy uniformity condition for aggregation, which allows even coarser modeling of the true environment.
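To make the idea of solving a small aggregated model concrete, here is a minimal Python sketch. It is not the construction from the paper: it covers only the simpler, well-known case of a state-only aggregation that happens to be Markovian, whereas the abstract concerns joint state-action maps that need not be Markovian. The toy four-state MDP, the map `phi`, the uniform `weights`, and the helper names `aggregate_mdp`, `q_value`, and `value_iteration` are all assumptions made for illustration.

```python
def aggregate_mdp(P, R, phi, weights):
    """Build an abstract MDP by weighted averaging over each block of phi.

    P[s][a] maps next states to probabilities, R[s][a] is the expected
    reward, and weights[s] is an assumed weight of s within its block
    (weights in each block sum to 1).
    """
    abs_P, abs_R = {}, {}
    for s in P:
        for a in P[s]:
            block = phi[s]
            row = abs_P.setdefault(block, {}).setdefault(a, {})
            rewards = abs_R.setdefault(block, {})
            rewards[a] = rewards.get(a, 0.0) + weights[s] * R[s][a]
            for s_next, p in P[s][a].items():
                target = phi[s_next]
                row[target] = row.get(target, 0.0) + weights[s] * p
    return abs_P, abs_R


def q_value(P, R, V, s, a, gamma):
    """One-step lookahead value of taking action a in state s."""
    return R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a].items())


def value_iteration(P, R, gamma=0.9, iters=500):
    """Plain tabular value iteration; returns values and a greedy policy."""
    V = {s: 0.0 for s in P}
    for _ in range(iters):
        V = {s: max(q_value(P, R, V, s, a, gamma) for a in P[s]) for s in P}
    policy = {s: max(P[s], key=lambda a: q_value(P, R, V, s, a, gamma))
              for s in P}
    return V, policy


# Hypothetical ground MDP: states 0 and 1 behave alike, as do states 2 and 3.
P = {
    0: {"stay": {0: 1.0}, "go": {2: 1.0}},
    1: {"stay": {1: 1.0}, "go": {3: 1.0}},
    2: {"stay": {2: 1.0}, "go": {0: 1.0}},
    3: {"stay": {3: 1.0}, "go": {1: 1.0}},
}
R = {
    0: {"stay": 0.0, "go": 0.0},
    1: {"stay": 0.0, "go": 0.0},
    2: {"stay": 1.0, "go": 0.0},
    3: {"stay": 1.0, "go": 0.0},
}
phi = {0: "A", 1: "A", 2: "B", 3: "B"}      # assumed aggregation map
weights = {0: 0.5, 1: 0.5, 2: 0.5, 3: 0.5}  # assumed uniform block weights

abs_P, abs_R = aggregate_mdp(P, R, phi, weights)
abs_V, abs_policy = value_iteration(abs_P, abs_R)

# Lift the abstract policy back to the ground states through phi.
ground_policy = {s: abs_policy[phi[s]] for s in phi}
print(abs_policy, ground_policy)
```

In this toy case the two-state abstract model is solved instead of the four-state ground model, and the lifted policy is optimal in the ground MDP because states within each block are exact copies of one another. The abstract's claim is that near-optimal performance can sometimes survive much coarser maps than this one.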

