Dynamic deep-reinforcement-learning algorithm in Partially Observed Markov Decision Processes

07/29/2023
by Saki Omi, et al.

Reinforcement learning has improved greatly in recent studies, and interest in real-world implementation has grown in recent years. In many real-world cases, non-static disturbances make it challenging for the agent to maintain its performance. Such disturbances turn the environment into a Partially Observable Markov Decision Process (POMDP). In common practice, a POMDP is handled either by introducing an additional estimator or by using a Recurrent Neural Network within the reinforcement learning algorithm. Both approaches require processing sequential information along the trajectory. However, only a few studies have investigated which information should be considered and which network structure should handle it. This study shows the benefit of including the action sequence when solving a POMDP. Several structures and approaches are proposed to extend one of the latest deep reinforcement learning algorithms with LSTM networks. The resulting algorithms showed enhanced robustness of controller performance against different types of external disturbances added to the observation.
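To make the idea of action sequence inclusion concrete, the following is a minimal sketch of how a fixed-length history of observations paired with previous actions could be assembled as the sequential input to a recurrent (for example LSTM) policy. It is not taken from the paper: the class name `ObsActionHistory`, the window length, the zero padding at episode start, and the choice to concatenate each observation with the action that preceded it are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, List, Sequence


class ObsActionHistory:
    """Fixed-length window of (observation, previous action) pairs.

    Hypothetical helper, not the paper's implementation: it only shows one
    way to build the sequence a recurrent policy could consume.
    """

    def __init__(self, obs_dim: int, act_dim: int, window: int) -> None:
        self.obs_dim = obs_dim
        self.act_dim = act_dim
        self.window = window
        # deque with maxlen drops the oldest step automatically on append
        self._steps: Deque[List[float]] = deque(maxlen=window)
        self.reset()

    def reset(self) -> None:
        """Fill the window with zero vectors at the start of an episode (assumed padding)."""
        self._steps.clear()
        zero = [0.0] * (self.obs_dim + self.act_dim)
        for _ in range(self.window):
            self._steps.append(list(zero))

    def push(self, obs: Sequence[float], prev_action: Sequence[float]) -> None:
        """Append one time step: the observation joined with the action that preceded it."""
        if len(obs) != self.obs_dim or len(prev_action) != self.act_dim:
            raise ValueError("observation or action has the wrong dimension")
        self._steps.append(list(obs) + list(prev_action))

    def sequence(self) -> List[List[float]]:
        """Return the window, oldest step first, shaped (window, obs_dim + act_dim)."""
        return [list(step) for step in self._steps]


history = ObsActionHistory(obs_dim=2, act_dim=1, window=3)
history.push([0.1, 0.2], [0.0])  # first step: no previous action, so a zero action
history.push([0.3, 0.4], [1.0])  # observation received after taking action 1.0
seq = history.sequence()         # 3 steps, each of length obs_dim + act_dim
```

In this sketch `seq` would be fed to the recurrent network one row per time step. Dropping the action columns gives the observation-only baseline, so the two input variants can be compared under the same window length.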


