Efficient transfer learning and online adaptation with latent variable models for continuous control

12/08/2018
by Christian F. Perez, et al.

Traditional model-based RL relies on hand-specified or learned models of the environment's transition dynamics. These methods are sample efficient and facilitate learning in the real world, but they fail to generalize to subtle variations in the underlying dynamics, e.g., due to differences in mass, friction, or actuators across robotic agents or across time. We propose using variational inference to learn an explicit latent representation of unknown environment properties that accelerates learning and facilitates generalization to novel environments at test time. We use online Bayesian inference over these learned latents to adapt rapidly to changes in the environment without retaining large replay buffers of recent data. Combined with a neural network ensemble that models the dynamics and captures uncertainty over them, our approach demonstrates positive transfer during training and online adaptation on the continuous control task HalfCheetah.
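
To make the idea of adapting online without a replay buffer concrete, the following is a minimal toy sketch, not the method of the paper. It assumes a hypothetical one-dimensional environment whose dynamics are s' = s + z * a + noise, where the scalar z stands in for an unknown property such as mass or friction. Under that linear-Gaussian assumption the belief over z stays Gaussian, so each transition can be folded into a running mean and variance and then discarded. The paper itself uses learned neural network dynamics and variational inference; all names below (`update_posterior`, `run_demo`, `true_z`) are illustrative.

```python
import random


def update_posterior(mean, var, s, a, s_next, noise_var):
    """One recursive Bayesian update of a Gaussian belief over the latent z.

    Assumed toy dynamics: s_next = s + z * a + eps, eps ~ N(0, noise_var).
    Only the current (mean, var) is needed; past transitions are not stored.
    """
    residual = s_next - s  # equals z * a + eps under the toy model
    precision = 1.0 / var + (a * a) / noise_var
    new_var = 1.0 / precision
    new_mean = new_var * (mean / var + a * residual / noise_var)
    return new_mean, new_var


def run_demo(true_z=1.7, steps=200, noise_std=0.1, seed=0):
    """Simulate the toy environment and infer z online from a N(0, 1) prior."""
    rng = random.Random(seed)
    mean, var = 0.0, 1.0  # prior belief over z
    s = 0.0
    for _ in range(steps):
        a = rng.uniform(-1.0, 1.0)  # random exploratory action
        s_next = s + true_z * a + rng.gauss(0.0, noise_std)
        mean, var = update_posterior(mean, var, s, a, s_next, noise_std ** 2)
        s = s_next  # the transition is discarded after the update
    return mean, var


if __name__ == "__main__":
    posterior_mean, posterior_var = run_demo()
    print(f"posterior mean={posterior_mean:.3f}, variance={posterior_var:.2e}")
```

In this sketch the posterior variance shrinks with every informative transition, which is the property that lets a controller track a changed environment from a few steps of data while keeping only two numbers of state.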

