POMDP inference and robust solution via deep reinforcement learning: An application to railway optimal maintenance

07/16/2023
by Giacomo Arcieri, et al.

Partially Observable Markov Decision Processes (POMDPs) can model complex sequential decision-making problems in stochastic and uncertain environments. A main obstacle to their broad adoption in real-world applications is the lack of a suitable POMDP model or a simulator thereof. Available solution algorithms, such as Reinforcement Learning (RL), require knowledge of the transition dynamics and the observation-generating process, which are often unknown and non-trivial to infer. In this work, we propose a combined framework for inference and robust solution of POMDPs via deep RL. First, all transition and observation model parameters are jointly inferred via Markov Chain Monte Carlo sampling of a hidden Markov model conditioned on actions, in order to recover full posterior distributions from the available data. The POMDP with uncertain parameters is then solved via deep RL techniques, with the parameter distributions incorporated into the solution via domain randomization, in order to develop solutions that are robust to model uncertainty. As a further contribution, we compare the use of transformers and long short-term memory networks, which constitute model-free RL solutions, with a model-based/model-free hybrid approach. We apply these methods to the real-world problem of optimal maintenance planning for railway assets.
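The domain randomization step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a small discrete POMDP, and it uses flat Dirichlet draws as a stand-in for the MCMC posterior samples (the inference itself is not shown). All names (`make_posterior_draws`, `rollout`, `domain_randomized_episodes`), the sizes, and the threshold policy are hypothetical and chosen only for illustration.

```python
import random


def dirichlet(rng, k):
    """Draw one probability vector of length k from a flat Dirichlet."""
    g = [rng.gammavariate(1.0, 1.0) for _ in range(k)]
    total = sum(g)
    return [x / total for x in g]


def make_posterior_draws(rng, n_draws=200, n_states=3, n_actions=2, n_obs=3):
    """Stand-in for MCMC output (assumption): each draw is a joint (T, O) pair.

    T[a][s] is a distribution over next states, O[s] over observations.
    """
    draws = []
    for _ in range(n_draws):
        T = [[dirichlet(rng, n_states) for _ in range(n_states)]
             for _ in range(n_actions)]
        O = [dirichlet(rng, n_obs) for _ in range(n_states)]
        draws.append((T, O))
    return draws


def rollout(rng, T, O, policy, horizon=20):
    """Simulate one episode under a single sampled model (T, O)."""
    state, history, actions = 0, [], []
    for _ in range(horizon):
        # The agent only sees a noisy observation of the hidden state.
        obs = rng.choices(range(len(O[state])), weights=O[state])[0]
        history.append(obs)
        action = policy(history)
        actions.append(action)
        # The hidden state evolves according to the sampled transition model.
        probs = T[action][state]
        state = rng.choices(range(len(probs)), weights=probs)[0]
    return history, actions


def domain_randomized_episodes(rng, draws, policy, n_episodes=5):
    """Resample the model parameters at the start of every episode."""
    for _ in range(n_episodes):
        # Taking T and O from the same draw keeps them jointly consistent.
        T, O = rng.choice(draws)
        yield rollout(rng, T, O, policy)


def threshold_policy(history):
    """Hypothetical policy: act (1) when the latest observation is the worst."""
    return int(history[-1] == 2)


rng = random.Random(0)
draws = make_posterior_draws(rng, n_draws=50)
episodes = list(domain_randomized_episodes(rng, draws, threshold_policy))
print(len(episodes), "episodes, each of length", len(episodes[0][0]))
```

In an actual training loop the placeholder policy would be replaced by the learning agent, and each episode's trajectory would feed its update. The point of the sketch is only that the agent never trains against a single point estimate of the model.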

