Safe Reinforcement Learning with Dual Robustness

09/13/2023
by Zeyang Li, et al.

Reinforcement learning (RL) agents are vulnerable to adversarial disturbances, which can degrade task performance or violate safety specifications. Existing methods either address safety requirements under the assumption of no adversary (e.g., safe RL) or focus only on robustness against performance adversaries (e.g., robust RL). Learning a single policy that is both safe and robust remains a challenging open problem. The difficulty lies in handling two intertwined aspects in the worst case: feasibility and optimality. Optimality is only valid inside a feasible region, while identifying the maximal feasible region must in turn rely on learning the optimal policy. To address this issue, we propose a systematic framework that unifies safe RL and robust RL, covering problem formulation, iteration scheme, convergence analysis, and practical algorithm design. This unification is built upon constrained two-player zero-sum Markov games. We propose a dual policy iteration scheme that simultaneously optimizes a task policy and a safety policy, and we prove its convergence. Furthermore, we design a deep RL algorithm for practical implementation, called dually robust actor-critic (DRAC). Evaluations on safety-critical benchmarks demonstrate that DRAC achieves high performance and persistent safety under all scenarios (no adversary, safety adversary, performance adversary), significantly outperforming all baselines.


