Distributionally Robust Reinforcement Learning

02/23/2019
by Elena Smirnova, et al.

Generalization of reinforcement learning algorithms to unknown or uncertain environments is crucial for real-world applications. In this work, we explicitly consider the uncertainty associated with the test environment through an uncertainty set. We formulate the Distributionally Robust Reinforcement Learning (DR-RL) objective, which consists of maximizing performance against a worst-case policy in an uncertainty set centered at the reference policy. Based on this objective, we derive a computationally efficient policy improvement algorithm that benefits from Distributionally Robust Optimization (DRO) guarantees. Further, we propose an iterative procedure that increases the stability of learning, called Distributionally Robust Policy Iteration. Combining this procedure with the maximum entropy framework, we derive a distributionally robust variant of Soft Q-learning that admits an efficient practical implementation and produces policies with robust behaviour at test time. Our formulation provides a unified view of a number of safe RL algorithms and recent empirical successes.
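To make the idea of a worst-case policy in an uncertainty set more concrete, the sketch below evaluates a lower bound on the expected action value over all policies within a KL-divergence ball around a reference policy, for a single state with discrete actions. The abstract does not say which divergence defines the uncertainty set, so the KL ball, the radius `epsilon`, the function name `kl_worst_case_value`, and the grid search over the dual variable are all assumptions made for illustration, not the authors' algorithm. The code relies on the standard DRO dual form, inf over {pi : KL(pi || p) <= epsilon} of E_pi[Q] = sup over lambda > 0 of -lambda * log E_p[exp(-Q / lambda)] - lambda * epsilon.

```python
import numpy as np


def kl_worst_case_value(q_values, ref_policy, epsilon, lambdas=None):
    """Lower bound on E_pi[Q] over all pi with KL(pi || ref_policy) <= epsilon.

    Hypothetical helper: the KL uncertainty set and the grid search over the
    dual variable are illustrative assumptions, not taken from the paper.
    """
    q = np.asarray(q_values, dtype=float)
    p = np.asarray(ref_policy, dtype=float)
    if lambdas is None:
        # Coarse grid over the dual variable; each grid point yields a valid
        # lower bound by weak duality, so the maximum is also a lower bound.
        lambdas = np.logspace(-3, 3, 2000)
    support = p > 0  # actions outside the support cannot gain mass under KL
    best = -np.inf
    for lam in lambdas:
        # log E_p[exp(-Q / lam)], computed with a stable log-sum-exp
        z = -q[support] / lam + np.log(p[support])
        log_mgf = z.max() + np.log(np.exp(z - z.max()).sum())
        best = max(best, -lam * log_mgf - lam * epsilon)
    return best


# Example with made-up numbers: three actions in one state.
q_example = [1.0, 0.0, 2.0]
p_example = [0.5, 0.25, 0.25]
nominal_value = float(np.dot(q_example, p_example))
robust_value = kl_worst_case_value(q_example, p_example, epsilon=0.1)
```

Under these assumptions, `robust_value` never exceeds `nominal_value`, and it decreases as `epsilon` grows, which reflects the trade-off between nominal performance and protection against deviation from the reference policy.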


research
02/14/2022

Robust Policy Learning over Multiple Uncertainty Sets

Reinforcement learning (RL) agents need to be robust to variations in sa...
research
02/19/2022

Doubly Robust Distributionally Robust Off-Policy Evaluation and Learning

Off-policy evaluation and learning (OPE/L) use offline observational dat...
research
06/18/2019

Robust Reinforcement Learning for Continuous Control with Model Misspecification

We provide a framework for incorporating robustness -- to perturbations ...
research
08/04/2020

Robust Uncertainty-Aware Multiview Triangulation

We propose a robust and efficient method for multiview triangulation and...
research
08/10/2022

Robust Reinforcement Learning using Offline Data

The goal of robust reinforcement learning (RL) is to learn a policy that...
research
10/29/2019

Robust Model-free Reinforcement Learning with Multi-objective Bayesian Optimization

In reinforcement learning (RL), an autonomous agent learns to perform co...
research
03/13/2023

Path Planning using Reinforcement Learning: A Policy Iteration Approach

With the impact of real-time processing being realized in the recent pas...
