Deep Hierarchical Reinforcement Learning Algorithm in Partially Observable Markov Decision Processes

05/11/2018
by Le Pham Tuyen, et al.

In recent years, reinforcement learning (RL) has achieved many remarkable successes due to the growing adoption of deep learning techniques and the rapid growth in computing power. Nevertheless, it is well known that flat reinforcement learning algorithms often fail to learn well or data-efficiently in tasks with hierarchical structure, e.g., tasks consisting of multiple subtasks. Hierarchical reinforcement learning is a principled approach that can tackle these challenging tasks. In addition, many real-world tasks are only partially observable: state measurements are often imperfect or incomplete. RL problems in such settings can be formulated as a partially observable Markov decision process (POMDP). In this paper, we study hierarchical RL in POMDPs, where the tasks are only partially observable and possess hierarchical properties. We propose a hierarchical deep reinforcement learning approach for learning in hierarchical POMDPs. The proposed deep hierarchical RL algorithm applies to both MDP and POMDP learning. We evaluate the proposed algorithm on various challenging hierarchical POMDPs.


