Safety-Constrained Policy Transfer with Successor Features

11/10/2022
by Zeyu Feng, et al.

In this work, we focus on the problem of safe policy transfer in reinforcement learning: we seek to leverage existing policies when learning a new task with specified constraints. This problem is important for safety-critical applications where interactions are costly and unconstrained policies can lead to undesirable or dangerous outcomes, e.g., with physical robots that interact with humans. We propose a Constrained Markov Decision Process (CMDP) formulation that simultaneously enables the transfer of policies and adherence to safety constraints. Our formulation cleanly separates task goals from safety considerations and permits the specification of a wide variety of constraints. Our approach relies on a novel extension of generalized policy improvement to constrained settings via a Lagrangian formulation. We devise a dual optimization algorithm that estimates the optimal dual variable of a target task, thus enabling safe transfer of policies derived from successor features learned on source tasks. Our experiments in simulated domains show that our approach is effective; it visits unsafe states less frequently and outperforms alternative state-of-the-art methods when taking safety constraints into account.
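The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that reward and cost are both linear in a shared feature vector (weights `w_reward` and `w_cost`), that the constraint bounds an expected cost by a `threshold`, and that the Lagrangian reward is `reward - lam * cost`. All function names, the toy successor features, and the step size are hypothetical and chosen for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def lagrangian_weights(w_reward: Vector, w_cost: Vector, lam: float) -> List[float]:
    """Weights of the assumed Lagrangian reward r - lam * c in feature space."""
    return [r - lam * c for r, c in zip(w_reward, w_cost)]


def constrained_gpi_action(
    psi: Sequence[Sequence[Vector]],
    w_reward: Vector,
    w_cost: Vector,
    lam: float,
) -> int:
    """Generalized policy improvement on the Lagrangian value.

    psi[i][a] is the successor-feature vector of source policy i for
    action a in the current state (hypothetical layout).
    """
    w = lagrangian_weights(w_reward, w_cost, lam)
    n_actions = len(psi[0])

    def best_value(action: int) -> float:
        # Best Lagrangian value any source policy promises for this action.
        return max(dot(policy[action], w) for policy in psi)

    return max(range(n_actions), key=best_value)


def dual_ascent_step(lam: float, estimated_cost: float, threshold: float,
                     step_size: float) -> float:
    """Projected gradient ascent on the dual variable (kept non-negative)."""
    return max(0.0, lam + step_size * (estimated_cost - threshold))


# Toy example: feature 0 tracks task progress, feature 1 tracks unsafe visits.
w_reward = [1.0, 0.0]
w_cost = [0.0, 1.0]
psi = [
    [[1.0, 0.8], [0.6, 0.1]],  # source policy 0: action 0, action 1
    [[0.9, 0.9], [0.5, 0.0]],  # source policy 1: action 0, action 1
]

unconstrained_action = constrained_gpi_action(psi, w_reward, w_cost, lam=0.0)
constrained_action = constrained_gpi_action(psi, w_reward, w_cost, lam=1.0)
```

In this toy setup, a dual variable of zero reduces the rule to ordinary generalized policy improvement and favors the high-reward but high-cost action, while a larger dual variable shifts the choice toward the safer action. Repeating `dual_ascent_step` with cost estimates from the target task is one plausible way to search for a suitable dual variable under these assumptions.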


