Intermittently Observable Markov Decision Processes

02/23/2023
by Gongpu Chen, et al.

This paper investigates Markov decision processes (MDPs) with intermittent state information. We consider a scenario in which the controller perceives the state of the process via an unreliable communication channel, and model the transmissions of state information over the whole time horizon as a Bernoulli lossy process. The problem is then to find an optimal policy for selecting actions in the presence of state-information losses. We first formulate the problem as a belief MDP to establish structural results, and systematically study the effect of state-information losses on the expected total discounted reward. We then reformulate the problem as a tree MDP, whose state space is organized in a tree structure, and develop two finite-state approximations to the tree MDP that find near-optimal policies efficiently. Finally, we put forth a nested value iteration algorithm for the finite-state approximations, which we prove to be faster than standard value iteration. Numerical results demonstrate the effectiveness of our methods.
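
The belief-MDP view in the abstract suggests a simple update rule: when a transmission succeeds, the belief collapses onto the observed state; when it is lost, the belief diffuses through the transition kernel. The sketch below is a minimal illustration of that rule on a randomly generated toy MDP; the instance sizes, the arrival probability p_arrive, and the greedy placeholder policy are all illustrative assumptions, not the paper's method.

```python
import numpy as np

# Minimal toy instance; sizes, rewards, and the channel probability are
# illustrative assumptions, not taken from the paper.
n_states, n_actions = 3, 2
rng = np.random.default_rng(0)

# P[a, s, s']: transition kernel; R[s, a]: per-step reward.
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))
p_arrive = 0.7  # Bernoulli probability that a state transmission succeeds

def update_belief(belief, action, observed_state=None):
    """Belief update under intermittent observations:
    - transmission received: belief collapses onto the observed state;
    - transmission lost: b'(s') = sum_s b(s) * P(s' | s, a)."""
    if observed_state is not None:
        b = np.zeros(n_states)
        b[observed_state] = 1.0
        return b
    return belief @ P[action]

# Simulate a short trajectory under a placeholder policy that is greedy on
# the expected immediate reward, just to show how the belief evolves
# between successful transmissions.
state = 0
belief = np.zeros(n_states)
belief[state] = 1.0
for t in range(10):
    action = int(np.argmax(belief @ R))               # act on the belief
    state = rng.choice(n_states, p=P[action, state])  # true state moves
    obs = state if rng.random() < p_arrive else None  # Bernoulli channel
    belief = update_belief(belief, action, obs)
    print(f"t={t}  a={action}  obs={obs}  belief={np.round(belief, 3)}")
```

Note that under this update rule, every reachable belief is fully determined by the last observed state and the sequence of actions taken since it was received, which is consistent with the tree-structured state space the abstract describes.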


