Robust Policy Learning over Multiple Uncertainty Sets

02/14/2022
by Annie Xie, et al.

Reinforcement learning (RL) agents need to be robust to variations in safety-critical environments. While system identification methods provide a way to infer the variation from online experience, they can fail in settings where fast identification is not possible. Another dominant approach is robust RL, which produces a policy that can handle worst-case scenarios, but these methods are generally designed to achieve robustness to a single uncertainty set that must be specified at training time. Towards a more general solution, we formulate the multi-set robustness problem, in which the goal is to learn a policy that is robust to different perturbation sets. We then design an algorithm that enjoys the benefits of both system identification and robust RL: it reduces uncertainty where possible given a few interactions, but can still act robustly with respect to the remaining uncertainty. On a diverse set of control tasks, our approach demonstrates improved worst-case performance on new environments compared to prior methods based on system identification alone and on robust RL alone.

research
02/23/2019

Distributionally Robust Reinforcement Learning

Generalization to unknown/uncertain environments of reinforcement learni...
research
06/04/2021

Robustifying Reinforcement Learning Policies with ℒ_1 Adaptive Control

A reinforcement learning (RL) policy trained in a nominal environment co...
research
10/12/2022

Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning

Recent studies reveal that a well-trained deep reinforcement learning (R...
research
05/20/2019

A Bayesian Approach to Robust Reinforcement Learning

Robust Markov Decision Processes (RMDPs) intend to ensure robustness wit...
research
07/04/2014

Robust Optimization using Machine Learning for Uncertainty Sets

Our goal is to build robust optimization problems for making decisions b...
research
10/27/2020

One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL

While reinforcement learning algorithms can learn effective policies for...
research
10/14/2022

Distributed Distributionally Robust Optimization with Non-Convex Objectives

Distributionally Robust Optimization (DRO), which aims to find an optima...
