Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification

10/20/2020
by Daniel J. Mankowitz, et al.

Many real-world physical control systems must satisfy constraints upon deployment. Such systems are also often subject to effects such as non-stationarity, wear and tear, and uncalibrated sensors. These effects perturb the system dynamics and can cause a policy trained successfully in one domain to perform poorly when deployed to a perturbed version of the same domain. This can degrade both the policy's ability to maximize future rewards and the extent to which it satisfies constraints. We refer to this as constrained model misspecification. We present an algorithm with theoretical guarantees that mitigates this form of misspecification, and showcase its performance in multiple MuJoCo tasks from the Real World Reinforcement Learning (RWRL) suite.


