Feasible Actor-Critic: Constrained Reinforcement Learning for Ensuring Statewise Safety

05/22/2021
by Haitong Ma, et al.

The safety constraints commonly used by existing safe reinforcement learning (RL) methods are defined only in expectation over initial states, so individual states may still be unsafe, which is unsatisfactory for real-world safety-critical tasks. In this paper, we introduce the feasible actor-critic (FAC) algorithm, which is the first model-free constrained RL method that considers statewise safety, i.e., safety for each initial state. We claim that some states are inherently unsafe no matter what policy we choose, while for other states there exist policies that ensure safety; we call such states and policies feasible. By constructing a statewise Lagrange function that can be evaluated from RL samples and adopting an additional neural network to approximate the statewise Lagrange multiplier, we obtain the optimal feasible policy, which ensures safety for each feasible state and is the safest possible policy for infeasible states. Furthermore, the trained multiplier network can indicate whether a given state is feasible through the statewise complementary slackness condition. We provide theoretical guarantees that FAC outperforms previous expectation-based constrained RL methods in terms of both constraint satisfaction and reward optimization. Experimental results on both robot locomotion tasks and safe exploration tasks verify the safety enhancement and feasibility interpretation of the proposed method.
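The difference between an expectation-based constraint and a statewise one can be illustrated with a small tabular sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the toy reward and cost values, the cost limit, the sign convention of the Lagrangian, and the use of a plain list of per-state multipliers in place of a multiplier network.

```python
# Illustrative sketch only. Names, numbers, and sign conventions are
# assumptions for this example, not the paper's implementation.

def expected_constraint_ok(cost_values, limit):
    """Expectation-based check: mean cost value over initial states <= limit."""
    return sum(cost_values) / len(cost_values) <= limit

def statewise_violations(cost_values, limit):
    """Statewise check: indices of states whose own cost value exceeds the limit."""
    return [i for i, c in enumerate(cost_values) if c > limit]

def statewise_lagrangian(reward_values, cost_values, multipliers, limit):
    """Sample average of -v_r(s) + lambda(s) * (v_c(s) - limit) over states s."""
    terms = [
        -r + lam * (c - limit)
        for r, c, lam in zip(reward_values, cost_values, multipliers)
    ]
    return sum(terms) / len(terms)

def dual_ascent_step(cost_values, multipliers, limit, step_size):
    """Projected gradient ascent on each per-state multiplier (kept >= 0)."""
    return [
        max(0.0, lam + step_size * (c - limit))
        for c, lam in zip(cost_values, multipliers)
    ]

# Hypothetical value estimates for four sampled initial states.
reward_values = [10.0, 8.0, 12.0, 9.0]
cost_values = [0.0, 0.5, 3.0, 0.1]
limit = 1.0

# The mean cost is below the limit, yet one state violates it on its own.
expectation_ok = expected_constraint_ok(cost_values, limit)
violating = statewise_violations(cost_values, limit)

# Cost values are held fixed here; in training the policy would change too.
multipliers = [0.0] * len(cost_values)
for _ in range(5):
    multipliers = dual_ascent_step(cost_values, multipliers, limit, step_size=0.5)

lagrangian = statewise_lagrangian(reward_values, cost_values, multipliers, limit)
```

In this toy setting the multiplier stays at zero for every state whose constraint is slack and grows only for the violating state, which is the pattern a statewise complementary slackness condition would suggest. Because the cost values are frozen, the sketch shows only the multiplier update, not the interaction with policy improvement.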

