Interval Markov Decision Processes with Continuous Action-Spaces

11/02/2022
by Giannis Delimpaltadakis, et al.

Interval Markov Decision Processes (IMDPs) are uncertain Markov models, where the transition probabilities belong to intervals. Recently, there has been a surge of research on employing IMDPs as abstractions of stochastic systems for control synthesis. However, due to the absence of algorithms for synthesis over IMDPs with continuous action-spaces, the action-space is assumed discrete a priori, which is a restrictive assumption for many applications. Motivated by this, we introduce continuous-action IMDPs (caIMDPs), where the bounds on transition probabilities are functions of the action variables, and study value iteration for maximizing expected cumulative rewards. Specifically, we show that solving the max-min problem associated with value iteration is equivalent to solving |𝒬| max problems, where |𝒬| is the number of states of the caIMDP. Then, exploiting the simple form of these max problems, we identify cases where value iteration over caIMDPs can be solved efficiently (e.g., with linear or convex programming). We also gain other interesting insights: e.g., in the case where the action set 𝒜 is a polytope and the transition bounds are linear, synthesizing over a discrete-action IMDP, where the actions are the vertices of 𝒜, is sufficient for optimality. We demonstrate our results on a numerical example. Finally, we include a short discussion on employing caIMDPs as abstractions for control synthesis.
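To make the max-min structure concrete, the following is a minimal sketch of robust (pessimistic) value iteration over a *discrete-action* IMDP — the baseline setting that the paper generalizes to continuous actions. The function names, array layout, and the sorting-based inner minimization over feasible interval distributions are illustrative assumptions, not the paper's algorithm; the inner min uses the standard greedy rule of pushing as much probability mass as the upper bounds allow onto low-value successor states.

```python
import numpy as np

def worst_case_distribution(lower, upper, values):
    """Adversarial feasible distribution within the intervals:
    start from the lower bounds, then assign the remaining mass
    to successor states in ascending order of their value."""
    p = lower.astype(float).copy()
    remaining = 1.0 - p.sum()
    for j in np.argsort(values):          # cheapest states first
        add = min(upper[j] - lower[j], remaining)
        p[j] += add
        remaining -= add
    return p

def interval_value_iteration(L, U, R, gamma=0.9, n_iter=100):
    """Pessimistic value iteration for a discrete-action IMDP.
    L, U: (n_actions, n_states, n_states) interval transition bounds;
    R: (n_states,) state rewards. Returns the robust value function."""
    n_a, n_s, _ = L.shape
    V = np.zeros(n_s)
    for _ in range(n_iter):
        Q = np.empty((n_a, n_s))
        for a in range(n_a):
            for s in range(n_s):
                # inner min: nature picks the worst feasible distribution
                p = worst_case_distribution(L[a, s], U[a, s], V)
                Q[a, s] = R[s] + gamma * p @ V
        V = Q.max(axis=0)                 # outer max over actions
    return V
```

In the continuous-action setting studied in the paper, the outer max over `a` can no longer be enumerated; the paper's contribution is showing that this max-min decomposes into |𝒬| per-state max problems whose structure (e.g., linear bounds over a polytopic 𝒜) can make them tractable, in which case checking the vertices of 𝒜, as in the discrete loop above, suffices.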


