POMDPs in Continuous Time and Discrete Spaces

10/02/2020
by   Bastian Alt, et al.

Many processes, such as discrete event systems in engineering or population dynamics in biology, evolve in discrete space and continuous time. We consider the problem of optimal decision making in such systems, with discrete state and action spaces, under partial observability. This places our work at the intersection of optimal filtering and optimal control. At the current state of research, a mathematical description of simultaneous decision making and filtering in continuous time with finite countable state and action spaces is still missing. In this paper, we give a mathematical description of a continuous-time partially observable Markov decision process (POMDP). By leveraging optimal filtering theory, we derive a Hamilton-Jacobi-Bellman (HJB) type equation that characterizes the optimal solution. Using techniques from deep learning, we approximately solve the resulting partial integro-differential equation. We present (i) an approach that solves the decision problem offline by learning an approximation of the value function and (ii) an online algorithm that provides a solution in belief space using deep reinforcement learning. We demonstrate the applicability of both on a set of toy examples, which pave the way for future methods that provide solutions to high-dimensional problems.
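
To make the notion of a belief state over a discrete state space concrete, the following is a minimal sketch and not the paper's method. It assumes a hypothetical two-state continuous-time Markov chain with a fixed generator matrix `Q` (in a POMDP the generator would depend on the chosen action), propagates the belief between observations with simple Euler steps, and applies a Bayes update at an assumed discrete observation time. The function names, the chain, and the likelihood values are illustrative assumptions; the paper's observation model may differ.

```python
import numpy as np


def propagate_belief(belief, generator, dt, n_steps=1000):
    """Integrate d(belief)/dt = belief @ generator over a time span dt.

    Uses plain Euler steps; the generator rows must sum to zero.
    """
    h = dt / n_steps
    b = np.asarray(belief, dtype=float).copy()
    for _ in range(n_steps):
        b = b + h * (b @ generator)
    # Renormalize to guard against accumulated rounding error.
    return b / b.sum()


def bayes_update(belief, likelihood):
    """Condition the belief on an observation with per-state likelihoods."""
    posterior = np.asarray(belief, dtype=float) * np.asarray(likelihood, dtype=float)
    return posterior / posterior.sum()


# Hypothetical two-state chain (assumption, not taken from the paper):
# state 0 -> 1 at rate 1.0, state 1 -> 0 at rate 2.0.
Q = np.array([[-1.0, 1.0],
              [2.0, -2.0]])

b0 = np.array([1.0, 0.0])                 # start certain of state 0
b1 = propagate_belief(b0, Q, dt=0.5)      # prediction between observations
b2 = bayes_update(b1, likelihood=[0.2, 0.9])  # assumed observation likelihoods
print(b1, b2)
```

A controller acting in belief space would take a vector such as `b2` as its input in place of the unobserved state.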


12/23/2019

Hamilton-Jacobi-Bellman Equations for Q-Learning in Continuous Time

In this paper, we introduce Hamilton-Jacobi-Bellman (HJB) equations for ...
06/04/2011

Optimal Reinforcement Learning for Gaussian Systems

The exploration-exploitation trade-off is among the central challenges o...
05/07/2019

Optimal Control of Complex Systems through Variational Inference with a Discrete Event Decision Process

Complex social systems are composed of interconnected individuals whose ...
02/18/2018

Estimating scale-invariant future in continuous time

Natural learners must compute an estimate of future outcomes that follow...
06/20/2021

Optimal Strategies for Decision Theoretic Online Learning

We extend the drifting games analysis to continuous time and show that t...
02/14/2012

Factored Filtering of Continuous-Time Systems

We consider filtering for a continuous-time, or asynchronous, stochastic...
12/17/2018

Double Deep Q-Learning for Optimal Execution

Optimal trade execution is an important problem faced by essentially all...