Wasserstein Robust Reinforcement Learning

07/30/2019
by Mohammed Amin Abdullah, et al.
Reinforcement learning algorithms, though successful, tend to over-fit to their training environments, which hampers their application to the real world. This paper proposes WR^2L, a robust reinforcement learning algorithm that shows significant robustness on both low- and high-dimensional control tasks. Our method formalises robust reinforcement learning as a novel min-max game with a Wasserstein constraint, which allows for a correct and convergent solver. Beyond the formulation, we also propose an efficient and scalable solver based on a novel zero-order optimisation method that we believe can be useful to numerical optimisation in general. We contribute both theoretically and empirically. On the theory side, we prove that WR^2L converges to a stationary point in the general setting of continuous state and action spaces. Empirically, we demonstrate significant gains compared to standard and robust state-of-the-art algorithms on high-dimensional MuJoCo environments.
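
For readers unfamiliar with the term, zero-order (derivative-free) optimisation estimates gradients from function evaluations alone. The sketch below is a generic two-point random-direction estimator and is not the solver proposed in the paper; the function name `zero_order_gradient` and the parameters `num_samples`, `sigma`, and `seed` are assumptions made purely for illustration.

```python
import numpy as np

def zero_order_gradient(f, x, num_samples=2000, sigma=1e-2, seed=0):
    """Estimate the gradient of f at x using only function evaluations.

    Generic two-point Gaussian-smoothing estimator (illustrative only,
    not the method from the paper): average the central-difference slope
    along random directions u, weighted by u.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for _ in range(num_samples):
        u = rng.standard_normal(x.shape)  # random search direction
        # Central difference of f along u; no analytic derivative needed.
        slope = (f(x + sigma * u) - f(x - sigma * u)) / (2.0 * sigma)
        grad += slope * u
    return grad / num_samples

# Usage: for f(v) = sum(v^2) the true gradient is 2 * v.
estimate = zero_order_gradient(lambda v: float(np.sum(v ** 2)), [1.0, -2.0, 0.5])
```

The estimate is noisy, and its error shrinks roughly with the square root of `num_samples`; this trade-off between evaluation budget and accuracy is typical of zero-order methods.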

research
06/12/2020

A Brief Look at Generalization in Visual Meta-Reinforcement Learning

Due to the realization that deep reinforcement learning algorithms train...
research
06/14/2022

Robust Reinforcement Learning with Distributional Risk-averse formulation

Robust Reinforcement Learning tries to make predictions more robust to c...
research
11/03/2022

Reinforcement Learning in Non-Markovian Environments

Following the novel paradigm developed by Van Roy and coauthors for rein...
research
12/29/2022

On the Geometry of Reinforcement Learning in Continuous State and Action Spaces

Advances in reinforcement learning have led to its successful applicatio...
research
11/08/2022

A deep solver for BSDEs with jumps

The aim of this work is to propose an extension of the Deep BSDE solver ...
research
06/01/2020

Robust Reinforcement Learning with Wasserstein Constraint

Robust Reinforcement Learning aims to find the optimal policy with some ...
research
03/14/2016

Exploratory Gradient Boosting for Reinforcement Learning in Complex Domains

High-dimensional observations and complex real-world dynamics present ma...
