Reinforcement Learning by Guided Safe Exploration

07/26/2023
by Qisong Yang, et al.

Safety is critical to broadening the application of reinforcement learning (RL). Often, we train RL agents in a controlled environment, such as a laboratory, before deploying them in the real world. However, the real-world target task might be unknown prior to deployment. Reward-free RL trains an agent without a reward signal so that it can adapt quickly once the reward is revealed. We consider the constrained reward-free setting, in which an agent (the guide) learns to explore safely without a reward signal. The guide is trained in a controlled environment that allows unsafe interactions but still provides a safety signal. Once the target task is revealed, safety violations are no longer allowed. The guide is therefore leveraged to compose a safe behaviour policy. Drawing on transfer learning, we also regularize the target policy (the student) towards the guide while the student is unreliable, and gradually remove the guide's influence as training progresses. Our empirical analysis shows that this method achieves safe transfer learning and helps the student solve the target task faster.
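As a rough illustration of the guide-student idea in the abstract, the sketch below regularizes a student policy towards a frozen guide with a KL penalty whose weight decays over training. The network architecture, loss, decay schedule, and training loop are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the paper's code): a student policy pulled towards a frozen,
# safety-trained guide, with the guide's influence faded out over training.
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    """Small diagonal-Gaussian policy."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh(),
                                  nn.Linear(hidden, hidden), nn.Tanh())
        self.mu = nn.Linear(hidden, act_dim)
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def dist(self, obs):
        h = self.body(obs)
        return torch.distributions.Normal(self.mu(h), self.log_std.exp())

def student_loss(student, guide, obs, actions, advantages, beta):
    """Policy-gradient objective plus a KL penalty towards the (frozen) guide;
    beta controls how strongly the guide regularizes the student."""
    pi_s = student.dist(obs)
    with torch.no_grad():                      # no gradients flow into the guide
        pi_g = guide.dist(obs)
    logp = pi_s.log_prob(actions).sum(-1)
    pg_loss = -(logp * advantages).mean()      # task objective
    kl = torch.distributions.kl_divergence(pi_s, pi_g).sum(-1).mean()
    return pg_loss + beta * kl

def guide_weight(step, total_steps, beta0=1.0):
    """Rely on the guide while the student is unreliable, then fade it out."""
    return beta0 * max(0.0, 1.0 - step / total_steps)

if __name__ == "__main__":
    obs_dim, act_dim = 8, 2
    guide = GaussianPolicy(obs_dim, act_dim)   # stand-in for a guide pretrained reward-free under safety constraints
    student = GaussianPolicy(obs_dim, act_dim)
    opt = torch.optim.Adam(student.parameters(), lr=3e-4)

    for step in range(1000):
        # Dummy batch standing in for safe rollouts collected with the behaviour policy.
        obs = torch.randn(32, obs_dim)
        actions = torch.randn(32, act_dim)
        advantages = torch.randn(32)
        beta = guide_weight(step, total_steps=1000)
        loss = student_loss(student, guide, obs, actions, advantages, beta)
        opt.zero_grad()
        loss.backward()
        opt.step()
```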

Related research

12/27/2021 · Safe Reinforcement Learning with Chance-constrained Model Predictive Control
Real-world reinforcement learning (RL) problems often demand that agents...

10/20/2022 · Safe Policy Improvement in Constrained Markov Decision Processes
The automatic synthesis of a policy through reinforcement learning (RL) ...

04/24/2021 · Constraint-Guided Reinforcement Learning: Augmenting the Agent-Environment-Interaction
Reinforcement Learning (RL) agents have great successes in solving tasks...

02/07/2023 · Adaptive Aggregation for Safety-Critical Control
Safety has been recognized as the central obstacle to preventing the use...

05/29/2022 · On the Robustness of Safe Reinforcement Learning under Observational Perturbations
Safe reinforcement learning (RL) trains a policy to maximize the task re...

09/21/2018 · Constrained Exploration and Recovery from Experience Shaping
We consider the problem of reinforcement learning under safety requireme...

01/20/2022 · Safe Deep RL in 3D Environments using Human Feedback
Agents should avoid unsafe behaviour during both training and deployment...
