Memory-based Deep Reinforcement Learning for POMDP

02/24/2021
by   Lingheng Meng, et al.

A promising characteristic of Deep Reinforcement Learning (DRL) is its capability to learn an optimal policy in an end-to-end manner without relying on feature engineering. However, most approaches assume a fully observable state space, i.e., a fully observable Markov Decision Process (MDP). In real-world robotics, this assumption is impractical because of sensor issues, such as limited sensor capacity and sensor noise, and because it is often unknown whether the observation design is complete. These scenarios lead to a Partially Observable MDP (POMDP) and need special treatment. In this paper, we propose Long-Short-Term-Memory-based Twin Delayed Deep Deterministic Policy Gradient (LSTM-TD3) by introducing a memory component to TD3, and compare its performance with other DRL algorithms in both MDPs and POMDPs. Our results demonstrate the significant advantages of the memory component in addressing POMDPs, including the ability to handle missing and noisy observation data.
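To make the setting concrete, the following is a minimal sketch of how a fully observed state might be turned into a POMDP-style observation (missing or noisy data) and how a fixed-length observation history could be assembled as input for a memory component such as an LSTM. It is not the authors' implementation: the names `corrupt_observation` and `HistoryBuffer`, the parameters `drop_prob` and `noise_std`, and the choices of zero-filling a missing observation and zero-padding a short history are all assumptions made for illustration.

```python
from collections import deque

import numpy as np


def corrupt_observation(obs, rng, drop_prob=0.0, noise_std=0.0):
    """Return a degraded copy of a full-state observation (hypothetical helper).

    With probability `drop_prob` the whole observation is reported as zeros
    (missing data); otherwise zero-mean Gaussian noise with standard
    deviation `noise_std` is added (sensor noise).
    """
    obs = np.asarray(obs, dtype=np.float64)
    if rng.random() < drop_prob:
        return np.zeros_like(obs)
    return obs + rng.normal(0.0, noise_std, size=obs.shape)


class HistoryBuffer:
    """Keeps the most recent `length` observations (hypothetical helper)."""

    def __init__(self, obs_dim, length):
        self.obs_dim = obs_dim
        self.length = length
        self.buf = deque(maxlen=length)

    def reset(self):
        # Call at the start of each episode so histories do not mix.
        self.buf.clear()

    def append(self, obs):
        self.buf.append(np.asarray(obs, dtype=np.float64))

    def window(self):
        """Return (history, valid_len): a (length, obs_dim) array that is
        zero-padded at the front, plus the number of real entries."""
        out = np.zeros((self.length, self.obs_dim))
        n = len(self.buf)
        if n:
            out[self.length - n:] = np.stack(list(self.buf))
        return out, n
```

In such a setup, a recurrent actor and critic would consume the array returned by `window()` together with the current observation, while a memoryless baseline would see only the current, possibly corrupted, observation.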


research
06/27/2021

Graph Convolutional Memory for Deep Reinforcement Learning

Solving partially-observable Markov decision processes (POMDPs) is criti...
research
10/31/2020

Pseudo Random Number Generation through Reinforcement Learning and Recurrent Neural Networks

A Pseudo-Random Number Generator (PRNG) is any algorithm generating a se...
research
11/27/2022

Applying Deep Reinforcement Learning to the HP Model for Protein Structure Prediction

A central problem in computational biophysics is protein structure predi...
research
08/30/2022

Effective Multi-User Delay-Constrained Scheduling with Deep Recurrent Reinforcement Learning

Multi-user delay constrained scheduling is important in many real-world ...
research
08/13/2017

Belief Tree Search for Active Object Recognition

Active Object Recognition (AOR) has been approached as an unsupervised l...
research
06/16/2021

How memory architecture affects performance and learning in simple POMDPs

Reinforcement learning is made much more complex when the agent's observ...
research
02/07/2020

Dynamic Energy Dispatch in Isolated Microgrids Based on Deep Reinforcement Learning

This paper focuses on deep reinforcement learning (DRL)-based energy dis...
