Safe Reinforcement Learning for Autonomous Vehicles through Parallel Constrained Policy Optimization

03/03/2020
by   Lu Wen, et al.

Reinforcement learning (RL) is attracting increasing interest in autonomous driving due to its potential to solve complex classification and control problems. However, existing RL algorithms are rarely applied to real vehicles because of two predominant problems: their behaviours are unexplainable, and they cannot guarantee safety in new scenarios. This paper presents a safe RL algorithm, called Parallel Constrained Policy Optimization (PCPO), for two autonomous driving tasks. PCPO extends the common actor-critic architecture to a three-component learning framework, in which three neural networks approximate the policy function, the value function, and a newly added risk function, respectively. A trust region constraint is added to allow large update steps without breaking the monotonic improvement condition. To ensure the feasibility of safety-constrained problems, synchronized parallel learners are employed to explore different state spaces, which accelerates learning and policy updates. Simulations of two autonomous-vehicle scenarios confirm that the method can ensure safety while achieving fast learning.
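The structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear models stand in for the three neural networks, the Euclidean step-size clip is a simplified substitute for the trust region constraint, and every name here (`LinearModel`, `average_gradients`, `trust_region_step`, `risk_within_limit`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LinearModel:
    """Hypothetical stand-in for one of the three networks (policy, value, risk)."""
    weights: list

    def predict(self, features):
        # Linear function of the state features.
        return sum(w * x for w, x in zip(self.weights, features))


def average_gradients(per_learner_grads):
    """Synchronized step: combine the gradients of all parallel learners by averaging."""
    n = len(per_learner_grads)
    dim = len(per_learner_grads[0])
    return [sum(g[i] for g in per_learner_grads) / n for i in range(dim)]


def trust_region_step(weights, grad, max_step_norm):
    """Apply an update whose Euclidean norm is capped at max_step_norm.

    Assumption: a norm clip replaces the trust region constraint for brevity.
    """
    norm = sum(g * g for g in grad) ** 0.5
    scale = 1.0 if norm <= max_step_norm or norm == 0.0 else max_step_norm / norm
    return [w + scale * g for w, g in zip(weights, grad)]


def risk_within_limit(risk_model, features, risk_limit):
    """Check the predicted risk of a state against an assumed safety threshold."""
    return risk_model.predict(features) <= risk_limit


# Three components: policy, value, and the added risk function.
policy = LinearModel([0.0, 0.0])
value = LinearModel([0.0, 0.0])
risk = LinearModel([1.0, 1.0])

# Two parallel learners report gradients; the update uses their average.
avg_grad = average_gradients([[1.0, 2.0], [3.0, 4.0]])
policy.weights = trust_region_step(policy.weights, avg_grad, max_step_norm=1.0)
```

In this sketch the averaged gradient is rescaled so that a single update never moves the policy weights by more than `max_step_norm`, and `risk_within_limit` shows where a learned risk estimate could gate a state or an update.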


research
03/16/2022

How to Learn from Risk: Explicit Risk-Utility Reinforcement Learning for Efficient and Safe Driving Strategies

Autonomous driving has the potential to revolutionize mobility and is he...
research
06/17/2022

SafeRL-Kit: Evaluating Efficient Reinforcement Learning Methods for Safe Autonomous Driving

Safe reinforcement learning (RL) has achieved significant success on ris...
research
03/02/2021

Model-based Constrained Reinforcement Learning using Generalized Control Barrier Function

Model information can be used to predict future trajectories, so it has ...
research
11/26/2019

Deep adaptive dynamic programming for nonaffine nonlinear optimal control problem with state constraints

This paper presents a constrained deep adaptive dynamic programming (CDA...
research
04/18/2023

Feasible Policy Iteration

Safe reinforcement learning (RL) aims to solve an optimal control proble...
research
01/20/2023

On Multi-Agent Deep Deterministic Policy Gradients and their Explainability for SMARTS Environment

Multi-Agent RL or MARL is one of the complex problems in Autonomous Driv...
research
10/01/2021

Motion Planning for Autonomous Vehicles in the Presence of Uncertainty Using Reinforcement Learning

Motion planning under uncertainty is one of the main challenges in devel...
