Policy Gradients for Probabilistic Constrained Reinforcement Learning

10/02/2022
by Weiqin Chen, et al.

This paper considers the problem of learning safe policies in the context of reinforcement learning (RL). In particular, a safe policy or controller is one that, with high probability, keeps the trajectory of the agent within a given safe set. We relate this probabilistic notion of safety to the notion of average safety often considered in the literature by providing theoretical bounds on the safety and performance of the two formulations. The challenge of working with the probabilistic notion of safety considered in this work is the lack of expressions for its gradients, since policy optimization algorithms rely on gradients of both the objective function and the constraints. To the best of our knowledge, this work is the first to provide such explicit gradient expressions for probabilistic constraints. These gradient expressions are algorithm independent, so they can be applied to various policy-based algorithms. In addition, we consider a continuous navigation problem to empirically illustrate the advantages (in terms of safety and performance) of working with probabilistic constraints as compared to average constraints.
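To make the distinction between the two safety notions concrete, the following is a minimal sketch, not the paper's method or experiment. It assumes a hypothetical one-dimensional agent with noisy linear dynamics and a safe set `|s| <= 1`; the names `simulate`, `is_safe`, `estimate_safety`, and `policy_gain` are illustrative only. It estimates by Monte Carlo (a) the probability that an entire trajectory stays in the safe set and (b) the average fraction of time steps spent in the safe set.

```python
import random


def simulate(policy_gain, horizon, rng, noise=0.3):
    """Roll out a hypothetical 1-D agent: s_{t+1} = s_t - gain * s_t + noise."""
    s = 0.0
    states = []
    for _ in range(horizon):
        s = s - policy_gain * s + rng.gauss(0.0, noise)
        states.append(s)
    return states


def is_safe(s, bound=1.0):
    """Assumed safe set: the interval [-bound, bound]."""
    return abs(s) <= bound


def estimate_safety(policy_gain, horizon=20, episodes=2000, seed=0):
    """Return (probabilistic safety, average safety) Monte Carlo estimates."""
    rng = random.Random(seed)
    all_safe_count = 0      # trajectories that never leave the safe set
    safe_fraction_sum = 0.0  # per-trajectory fraction of safe steps
    for _ in range(episodes):
        flags = [is_safe(s) for s in simulate(policy_gain, horizon, rng)]
        all_safe_count += all(flags)
        safe_fraction_sum += sum(flags) / horizon
    return all_safe_count / episodes, safe_fraction_sum / episodes


prob_safe, avg_safe = estimate_safety(policy_gain=0.5)
print(f"P(whole trajectory safe) ~ {prob_safe:.3f}")
print(f"average fraction of safe steps ~ {avg_safe:.3f}")
```

In this sketch the trajectory-level estimate can never exceed the per-step average, because a trajectory that is safe at every step is also safe on average. A policy can therefore look safe under an average constraint while still leaving the safe set in a noticeable share of episodes.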


