Safe Driving via Expert Guided Policy Optimization

10/13/2021
by Zhenghao Peng, et al.

When learning common skills like driving, beginners usually have domain experts standing by to ensure the safety of the learning process. We formulate this learning scheme as Expert-in-the-loop Reinforcement Learning (ERL), in which a guardian is introduced to safeguard the exploration of the learning agent. While allowing sufficient exploration in the uncertain environment, the guardian intervenes in dangerous situations and demonstrates the correct actions to avoid potential accidents. ERL thus provides two training sources: the agent's own exploration and the expert's partial demonstrations. Following this setting, we develop a novel Expert Guided Policy Optimization (EGPO) method, which integrates the guardian into the reinforcement learning loop. The guardian consists of an expert policy that generates demonstrations and a switch function that decides when to intervene. In particular, a constrained optimization technique is used to rule out the trivial solution in which the agent deliberately behaves dangerously to trick the expert into taking over. An offline RL technique is further used to learn from the partial demonstrations generated by the expert. Safe driving experiments show that our method achieves superior training and test-time safety, outperforms baselines by a substantial margin in sample efficiency, and preserves generalizability to unseen environments at test time. Demo video and source code are available at: https://decisionforce.github.io/EGPO/
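
The guardian described above, an expert policy paired with a switch function, can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and is not taken from the EGPO source code: the one-dimensional "lateral offset" state, the names `Guardian`, `Transition`, `expert_policy`, `make_switch` and `rollout`, and the rule that the switch fires when the proposed action would push the offset past a fixed limit. The sketch only shows how a rollout could yield both exploration data and partial demonstrations, and how the number of takeovers could be counted as a quantity to constrain.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Transition:
    state: float            # lateral offset from the lane center (toy state)
    proposed_action: float  # what the learning agent wanted to do
    applied_action: float   # what was actually executed
    intervened: bool        # True if the guardian took over at this step


@dataclass
class Guardian:
    # Hypothetical interfaces: state -> action, and (state, action) -> bool.
    expert_policy: Callable[[float], float]
    switch: Callable[[float, float], bool]

    def step(self, state: float, agent_action: float) -> Tuple[float, bool]:
        """Return the action to execute and whether the expert took over."""
        if self.switch(state, agent_action):
            return self.expert_policy(state), True
        return agent_action, False


def expert_policy(offset: float) -> float:
    """Toy expert: steer back toward the lane center, clipped to [-1, 1]."""
    return max(-1.0, min(1.0, -offset))


def make_switch(limit: float) -> Callable[[float, float], bool]:
    """Toy switch: intervene if the proposed action would leave the lane."""
    def switch(offset: float, action: float) -> bool:
        return abs(offset + action) > limit
    return switch


def rollout(guardian: Guardian,
            agent_policy: Callable[[float], float],
            start: float,
            steps: int) -> List[Transition]:
    """Run the agent under the guardian and log every transition."""
    offset = start
    transitions = []
    for _ in range(steps):
        proposed = agent_policy(offset)
        applied, intervened = guardian.step(offset, proposed)
        transitions.append(Transition(offset, proposed, applied, intervened))
        offset = offset + applied  # toy dynamics: action shifts the offset
    return transitions


if __name__ == "__main__":
    guardian = Guardian(expert_policy, make_switch(limit=2.0))
    # A deliberately bad agent that always steers to the right.
    log = rollout(guardian, lambda s: 1.0, start=0.0, steps=5)

    exploration = [t for t in log if not t.intervened]   # agent's own data
    demonstrations = [t for t in log if t.intervened]    # expert's partial demos
    takeover_count = len(demonstrations)                 # quantity to constrain
    print(len(exploration), takeover_count)
```

In this toy setting the logged takeover count stands in for the intervention cost: an agent that drifts toward the lane boundary on purpose accumulates takeovers, which is the kind of behavior the constrained optimization in the abstract is meant to discourage.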

research
02/17/2022

Efficient Learning of Safe Driving Policy via Human-AI Copilot Optimization

Human intervention is an effective way to inject human knowledge into th...
research
09/18/2023

Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration

Safe Reinforcement Learning (RL) aims to find a policy that achieves hig...
research
05/24/2022

Learning to Drive Using Sparse Imitation Reinforcement Learning

In this paper, we propose Sparse Imitation Reinforcement Learning (SIRL)...
research
02/20/2023

Demonstration-Guided Reinforcement Learning with Efficient Exploration for Task Automation of Surgical Robot

Task automation of surgical robot has the potentials to improve surgical...
research
10/28/2019

Learning Transferable Graph Exploration

This paper considers the problem of efficient exploration of unseen envi...
research
10/16/2017

Gradient-free Policy Architecture Search and Adaptation

We develop a method for policy architecture search and adaptation via gr...
research
01/07/2021

CoachNet: An Adversarial Sampling Approach for Reinforcement Learning

Despite the recent successes of reinforcement learning in games and robo...