Deep Reinforcement Learning with Feedback-based Exploration

03/14/2019
by Jan Scholten, et al.

Deep Reinforcement Learning has enabled the control of increasingly complex and high-dimensional problems. However, the need for vast amounts of data before reasonable performance is attained prevents its widespread application. We employ binary corrective feedback as a general and intuitive way to incorporate human intuition and domain knowledge into model-free machine learning. The uncertainties in the policy and in the corrective feedback are combined directly in the action space as probabilistic conditional exploration. As a result, the greatest part of the otherwise ignorant learning process can be avoided. We demonstrate the proposed method, Predictive Probabilistic Merging of Policies (PPMP), in combination with DDPG. In experiments on continuous control problems from the OpenAI Gym, we achieve drastic improvements in sample efficiency, final performance, and robustness to erroneous feedback, for both human and synthetic feedback. Additionally, we show solutions beyond the demonstrated knowledge.
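
The idea of applying binary corrective feedback directly in the action space can be illustrated with a small sketch. This is not the PPMP algorithm from the paper; it only assumes that feedback arrives as -1, 0, or +1 per action dimension and that some per-dimension uncertainty estimate of the policy is available. The function name `corrected_action`, the parameter `policy_std`, and the action bounds are hypothetical and chosen for illustration.

```python
import numpy as np


def corrected_action(policy_action, feedback, policy_std, low=-1.0, high=1.0):
    """Shift a continuous action in the direction of binary feedback.

    policy_action: action proposed by the policy, one value per dimension.
    feedback: -1, 0, or +1 per dimension (0 means no feedback was given).
    policy_std: assumed per-dimension uncertainty of the policy; a more
        uncertain policy receives a larger correction (illustrative choice).
    low, high: assumed bounds of the action space.
    """
    policy_action = np.asarray(policy_action, dtype=float)
    feedback = np.asarray(feedback, dtype=float)
    policy_std = np.asarray(policy_std, dtype=float)

    # Move each dimension by one uncertainty unit in the indicated direction.
    shifted = policy_action + feedback * policy_std

    # Keep the result inside the valid action range.
    return np.clip(shifted, low, high)
```

For example, `corrected_action([0.2, -0.5], [1, 0], [0.3, 0.3])` moves only the first action dimension upward and leaves the second unchanged, since no feedback was given for it.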

research
09/12/2017

Explore, Exploit or Listen: Combining Human Feedback and Policy Model to Speed up Deep Reinforcement Learning in 3D Worlds

We describe a method to use discrete human feedback to enhance the perfo...
research
03/12/2019

Learning Gaussian Policies from Corrective Human Feedback

Learning from human feedback is a viable alternative to control design t...
research
03/16/2020

Particle-Based Adaptive Discretization for Continuous Control using Deep Reinforcement Learning

Learning controls in high-dimensional continuous action spaces, such as ...
research
10/02/2019

Stabilizing Off-Policy Reinforcement Learning with Conservative Policy Gradients

In recent years, advances in deep learning have enabled the application ...
research
10/06/2017

Rainbow: Combining Improvements in Deep Reinforcement Learning

The deep reinforcement learning community has made several independent i...
research
12/13/2021

Contextual Exploration Using a Linear Approximation Method Based on Satisficing

Deep reinforcement learning has enabled human-level or even super-human ...
research
01/02/2023

Deep reinforcement learning for irrigation scheduling using high-dimensional sensor feedback

Deep reinforcement learning has considerable potential to improve irriga...
