Reinforcement Learning using Guided Observability

04/22/2021
by Stephan Weigand, et al.

Due to recent breakthroughs, reinforcement learning (RL) has demonstrated impressive performance in challenging sequential decision-making problems. However, an open question is how to make RL cope with partial observability, which is prevalent in many real-world problems. In contrast to contemporary RL approaches, which focus mostly on improved memory representations or rely on strong assumptions about the type of partial observability, we propose a simple but efficient approach that can be applied together with a wide variety of RL methods. Our main insight is that smoothly transitioning from full observability to partial observability during the training process yields a high-performance policy. The approach, called partially observable guided reinforcement learning (PO-GRL), makes it possible to use full state information during policy optimization without compromising the optimality of the final policy. A comprehensive evaluation on discrete partially observable Markov decision process (POMDP) benchmark problems and on continuous partially observable MuJoCo and OpenAI Gym tasks shows that PO-GRL improves performance. Finally, we demonstrate PO-GRL in the ball-in-the-cup task on a real Barrett WAM robot under partial observability.
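
The transition from full to partial observability described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear schedule, the per-step random choice between the two inputs, and the names `full_state_probability` and `guided_observation` are all assumptions made for illustration. The sketch also assumes the partial observation has the same length as the full state, for example with hidden entries set to zero, so that one policy can consume either input.

```python
import random


def full_state_probability(step, total_steps):
    """Hypothetical linear schedule: 1.0 at the start of training, 0.0 at the end."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    fraction = min(max(step / total_steps, 0.0), 1.0)
    return 1.0 - fraction


def guided_observation(full_state, partial_obs, step, total_steps, rng=random):
    """Pick the policy input for this step (an assumed mixing rule, not from the paper).

    Early in training the full state is returned most of the time; as training
    progresses the partial observation takes over, and at the end the policy
    only ever sees the partial observation.
    """
    if rng.random() < full_state_probability(step, total_steps):
        return list(full_state)
    return list(partial_obs)


if __name__ == "__main__":
    state = [0.3, -1.2]       # full state, e.g. position and velocity
    observation = [0.3, 0.0]  # velocity hidden (zeroed) in the partial view
    for step in (0, 50, 100):
        print(step, guided_observation(state, observation, step, total_steps=100))
```

Because the schedule reaches zero at the end of training, a policy trained this way is finally evaluated on partial observations only, which is the setting the abstract targets.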
