Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning

03/18/2021
by Sebastian Curi, et al.

In real-world tasks, reinforcement learning (RL) agents frequently encounter situations that were not present during training. To perform reliably, RL agents need to be robust against worst-case situations. The robust RL framework addresses this challenge via a worst-case optimization between an agent and an adversary. Previous robust RL algorithms are either sample inefficient, lack robustness guarantees, or do not scale to large problems. We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem while attaining near-optimal sample complexity guarantees. RH-UCRL is a model-based reinforcement learning (MBRL) algorithm that effectively distinguishes between epistemic and aleatoric uncertainty and efficiently explores both the agent and adversary decision spaces during policy learning. We scale RH-UCRL to complex tasks via neural network ensemble models as well as neural network policies. Experimentally, we demonstrate that RH-UCRL outperforms other robust deep RL algorithms in a variety of adversarial environments.
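
The abstract says the model distinguishes epistemic from aleatoric uncertainty and uses ensembles of neural networks, but it does not say how the two are separated. One common way to do this with an ensemble, shown here as a minimal sketch and not necessarily the method used in the paper, is to treat disagreement between members as epistemic uncertainty and the average predicted noise as aleatoric uncertainty. The function name `decompose_uncertainty` and the numbers below are hypothetical and chosen only for illustration.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(member_means, member_variances):
    """Split an ensemble's predictive variance into two parts.

    member_means[i]     : mean predicted by ensemble member i
    member_variances[i] : noise variance predicted by ensemble member i

    Assumption: each member outputs a scalar Gaussian prediction.
    """
    if not member_means or len(member_means) != len(member_variances):
        raise ValueError("need one mean and one variance per ensemble member")
    # Epistemic: how much the members disagree about the mean.
    # This part can shrink as more data is collected.
    epistemic = pvariance(member_means)
    # Aleatoric: the noise level the members predict on average.
    # This part stays even with unlimited data.
    aleatoric = fmean(member_variances)
    return epistemic, aleatoric


# Hypothetical next-state predictions from a 4-member ensemble.
means = [1.0, 1.2, 0.8, 1.0]
variances = [0.05, 0.04, 0.06, 0.05]
epistemic, aleatoric = decompose_uncertainty(means, variances)
print(f"epistemic={epistemic:.3f} aleatoric={aleatoric:.3f}")
```

Under this decomposition, only the epistemic term reflects what the model has not yet learned, so an exploration strategy would plausibly act on that term alone rather than on the total variance.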

research
10/12/2022

Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning

Recent studies reveal that a well-trained deep reinforcement learning (R...
research
02/15/2022

User-Oriented Robust Reinforcement Learning

Recently, improving the robustness of policies across different environm...
research
08/07/2020

Towards Sample Efficient Agents through Algorithmic Alignment

Deep reinforcement-learning agents have demonstrated great success on va...
research
02/06/2023

Robust Subtask Learning for Compositional Generalization

Compositional reinforcement learning is a promising approach for trainin...
research
02/05/2021

Provably Efficient Algorithms for Multi-Objective Competitive RL

We study multi-objective reinforcement learning (RL) where an agent's re...
research
06/09/2021

Who Is the Strongest Enemy? Towards Optimal and Efficient Evasion Attacks in Deep RL

Evaluating the worst-case performance of a reinforcement learning (RL) a...
research
08/05/2020

Robust Deep Reinforcement Learning through Adversarial Loss

Deep neural networks, including reinforcement learning agents, have been...
