Action Robust Reinforcement Learning and Applications in Continuous Control

01/26/2019
by Chen Tessler, et al.

A policy is said to be robust if it maximizes the reward while considering a bad, or even adversarial, model. In this work we formalize two new criteria of robustness to action uncertainty. Specifically, we consider two scenarios in which the agent attempts to perform an action a, and (i) with probability α, an alternative adversarial action a̅ is taken instead, or (ii) an adversary adds a perturbation to the selected action, in the case of a continuous action space. We show that our criteria are related to common forms of uncertainty in robotics domains, such as the occurrence of abrupt forces, and suggest algorithms for the tabular case. Building on these algorithms, we generalize our approach to deep reinforcement learning (DRL) and provide extensive experiments in various MuJoCo domains. Our experiments show that our approach not only produces robust policies but also improves performance in the absence of perturbations. This generalization indicates that action robustness can be thought of as implicit regularization in RL problems.
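The two scenarios in the abstract can be illustrated with a minimal Python sketch of how the executed action might differ from the one the agent selected. This is an illustration under stated assumptions, not the authors' implementation: the function names `probabilistic_action` and `perturbed_action`, the action bounds `low` and `high`, and the clipping step are all hypothetical, and the abstract does not specify the exact form of the perturbation.

```python
import numpy as np


def probabilistic_action(agent_action, adversary_action, alpha, rng):
    """Scenario (i): with probability alpha, the adversary's action
    replaces the action the agent attempted."""
    if rng.random() < alpha:
        return adversary_action
    return agent_action


def perturbed_action(agent_action, perturbation, low, high):
    """Scenario (ii): the adversary adds a perturbation to a continuous
    action. Clipping to [low, high] is an assumption made here so the
    result stays inside a bounded action space."""
    action = np.asarray(agent_action, dtype=float)
    delta = np.asarray(perturbation, dtype=float)
    return np.clip(action + delta, low, high)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Discrete example: the agent picks action 0, the adversary prefers 1.
    executed = [probabilistic_action(0, 1, alpha=0.2, rng=rng) for _ in range(10)]
    print("executed actions:", executed)
    # Continuous example: a one-dimensional torque in [-1, 1].
    print("perturbed action:", perturbed_action([0.5], [0.7], -1.0, 1.0))
```

In a training loop, a function like one of these would sit between the policy's output and the environment's step call, so that the agent is evaluated on the action actually executed and not on the one it intended.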


research
03/19/2020

Robust Deep Reinforcement Learning against Adversarial Perturbations on Observations

Deep Reinforcement Learning (DRL) is vulnerable to small adversarial per...
research
12/06/2022

What is the Solution for State-Adversarial Multi-Agent Reinforcement Learning?

Various methods for Multi-Agent Reinforcement Learning (MARL) have been ...
research
07/01/2020

Falsification-Based Robust Adversarial Reinforcement Learning

Reinforcement learning (RL) has achieved tremendous progress in solving ...
research
06/17/2021

CROP: Certifying Robust Policies for Reinforcement Learning through Functional Smoothing

We present the first framework of Certifying Robust Policies for reinfor...
research
03/08/2017

Robust Adversarial Reinforcement Learning

Deep neural networks coupled with fast simulation and improved computati...
research
10/05/2021

Continuous-Time Fitted Value Iteration for Robust Policies

Solving the Hamilton-Jacobi-Bellman equation is important in many domain...
research
10/28/2019

Certified Adversarial Robustness for Deep Reinforcement Learning

Deep Neural Network-based systems are now the state-of-the-art in many r...
