Human-in-the-Loop Deep Reinforcement Learning with Application to Autonomous Driving

by Jingda Wu, et al.

Because machine intelligence remains limited in capability, autonomous vehicles still cannot handle every situation or fully replace human drivers. Humans, by contrast, are robust and adaptable in complex driving scenarios, so it is important to bring them into the training loop of artificial intelligence and use human intelligence to advance machine learning algorithms. In this study, a real-time human-guidance-based deep reinforcement learning (Hug-DRL) method is developed for policy training in autonomous driving. Using a newly designed control transfer mechanism between human and automation, a human can intervene and correct the agent's unreasonable actions in real time whenever necessary during model training. Based on this human-in-the-loop guidance mechanism, an improved actor-critic architecture with modified policy and value networks is developed. The fast convergence of the proposed Hug-DRL allows real-time human guidance actions to be fused into the agent's training loop, further improving the efficiency and performance of deep reinforcement learning. The method is validated in human-in-the-loop experiments with 40 subjects and compared with other state-of-the-art learning approaches. The results suggest that the proposed method effectively improves the training efficiency and performance of the deep reinforcement learning algorithm under human guidance, without imposing specific requirements on participant expertise or experience.
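
The control transfer idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Transition`, `select_action`, and `collect`, the scalar steering action, and the `human_guided` flag are all assumptions made for illustration. The sketch only shows a human command overriding the agent's action when one is present, with the override recorded alongside the stored experience.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

State = Tuple[float, ...]
Action = float  # assumption: a single steering command, e.g. in [-1, 1]


@dataclass
class Transition:
    """One stored step; `human_guided` is a hypothetical flag, not from the paper."""
    state: State
    action: Action
    human_guided: bool


def select_action(
    state: State,
    agent_policy: Callable[[State], Action],
    human_input: Optional[Action],
) -> Tuple[Action, bool]:
    """Return the action to execute and whether the human supplied it."""
    if human_input is not None:
        # The human has taken over: their command replaces the agent's action.
        return human_input, True
    return agent_policy(state), False


def collect(
    states: Sequence[State],
    agent_policy: Callable[[State], Action],
    human_inputs: Sequence[Optional[Action]],
) -> List[Transition]:
    """Build a buffer of transitions, marking the human-guided ones."""
    buffer: List[Transition] = []
    for state, human_input in zip(states, human_inputs):
        action, guided = select_action(state, agent_policy, human_input)
        buffer.append(Transition(state, action, guided))
    return buffer
```

In a sketch like this, a learner could treat the flagged transitions differently from the agent's own, for example by weighting them in the policy update. How Hug-DRL actually fuses guidance into its policy and value networks is not specified in the abstract.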


Efficient Deep Reinforcement Learning with Imitative Expert Priors for Autonomous Driving

Deep reinforcement learning (DRL) is a promising way to achieve human-li...

Incorporating Voice Instructions in Model-Based Reinforcement Learning for Self-Driving Cars

This paper presents a novel approach that supports natural language voic...

Adviser Networks: Learning What Question to Ask for Human-In-The-Loop Viewpoint Estimation

Humans have an unparalleled visual intelligence and can overcome visual ...

Biomechanic Posture Stabilisation via Iterative Training of Multi-policy Deep Reinforcement Learning Agents

It is not until we become senior citizens do we recognise how much we to...

Orthogonal Policy Gradient and Autonomous Driving Application

One less addressed issue of deep reinforcement learning is the lack of g...

Autonomous Curiosity for Real-Time Training Onboard Robotic Agents

Learning requires both study and curiosity. A good learner is not only g...
