Augmented World Models Facilitate Zero-Shot Dynamics Generalization From a Single Offline Environment

04/12/2021
by Philip J. Ball, et al.

Reinforcement learning from large-scale offline datasets lets us learn policies without potentially unsafe or impractical exploration. Significant progress has been made in the past few years on correcting for the difference in behavior between the data-collection policy and the learned policy. However, little attention has been paid to potentially changing dynamics when transferring a policy to the online setting, where performance can be reduced by up to 90%. In this paper we address this problem with Augmented World Models (AugWM). We augment a learned dynamics model with simple transformations that seek to capture potential changes in the physical properties of the robot, leading to more robust policies. We not only train our policy in this new setting, but also provide it with the sampled augmentation as a context, allowing it to adapt to changes in the environment. At test time we learn the context in a self-supervised fashion by approximating the augmentation that corresponds to the new environment. We rigorously evaluate our approach on over 100 different changed-dynamics settings, and show that this simple approach can significantly improve the zero-shot generalization of a recent state-of-the-art baseline, often achieving successful policies where the baseline fails.
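
The three steps named in the abstract (augment the model's transitions, pass the sampled augmentation to the policy as context, estimate the context at test time) can be illustrated with a minimal sketch. The abstract does not specify the transformation, so the code assumes one for illustration only: a per-dimension rescaling of the predicted state change by a factor z. The function names, the sampling range [0.5, 1.5], and the least-squares estimator are hypothetical choices, not the authors' implementation.

```python
import numpy as np

def sample_context(rng, dim, low=0.5, high=1.5):
    """Draw one scale factor per state dimension (range is an assumed choice)."""
    return rng.uniform(low, high, size=dim)

def augment_transition(state, next_state_pred, z):
    """Assumed augmentation: rescale the model's predicted state change by z."""
    return state + z * (next_state_pred - state)

def policy_input(state, z):
    """Give the policy the sampled augmentation as context by concatenation."""
    return np.concatenate([state, z])

def estimate_context(states, next_states_pred, next_states_obs, eps=1e-8):
    """Per-dimension least-squares fit of z from observed test-time transitions.

    Solves min_z sum_t (delta_obs_t - z * delta_pred_t)^2 for each dimension.
    """
    delta_pred = next_states_pred - states
    delta_obs = next_states_obs - states
    numerator = np.sum(delta_pred * delta_obs, axis=0)
    denominator = np.sum(delta_pred ** 2, axis=0) + eps  # eps avoids division by zero
    return numerator / denominator
```

During training, one would sample z once per imagined rollout, apply `augment_transition` to each model prediction, and feed `policy_input(state, z)` to the policy. At test time no labels are needed: the transitions observed in the new environment serve as their own supervision for `estimate_context`, and the resulting estimate replaces the sampled z.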

10/04/2019

Zero Shot Learning on Simulated Robots

In this work we present a method for leveraging data from one source to ...
06/01/2022

Model Generation with Provable Coverability for Offline Reinforcement Learning

Model-based offline optimization with dynamics-aware policy provides a n...
10/27/2022

Towards Reliable Zero Shot Classification in Self-Supervised Models with Conformal Prediction

Self-supervised models trained with a contrastive loss such as CLIP have...
03/10/2021

S4RL: Surprisingly Simple Self-Supervision for Offline Reinforcement Learning

Offline reinforcement learning proposes to learn policies from large col...
11/02/2021

Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics

Offline reinforcement learning leverages large datasets to train policie...
06/01/2021

What Can I Do Here? Learning New Skills by Imagining Visual Affordances

A generalist robot equipped with learned skills must be able to perform ...
04/12/2022

Offline Distillation for Robot Lifelong Learning with Imbalanced Experience

Robots will experience non-stationary environment dynamics throughout th...