Provably Efficient Model-based Policy Adaptation

by Yuda Song, et al.

The high sample complexity of reinforcement learning challenges its use in practice. A promising approach is to quickly adapt pre-trained policies to new environments. Existing methods for this policy adaptation problem typically rely on domain randomization and meta-learning, sampling from some distribution of target environments during pre-training, and thus face difficulty on out-of-distribution target environments. We propose new model-based mechanisms that adapt online in unseen target environments by combining ideas from no-regret online learning and adaptive control. We prove that the approach learns policies in the target environment that can quickly recover trajectories from the source environment, and we establish the rate of convergence in general settings. We demonstrate the benefits of our approach for policy adaptation in a diverse set of continuous control tasks, achieving the performance of state-of-the-art methods with much lower sample complexity.



Meta-Model-Based Meta-Policy Optimization

Model-based reinforcement learning (MBRL) has been applied to meta-learn...

Learning to Adapt: Meta-Learning for Model-Based Control

Although reinforcement learning methods can achieve impressive results i...

A Model-based Approach for Sample-efficient Multi-task Reinforcement Learning

The aim of multi-task reinforcement learning is two-fold: (1) efficientl...

Adaptive Prior Selection for Repertoire-based Online Learning in Robotics

Among the data-efficient approaches for online adaptation in robotics (m...

Online Adaptation through Meta-Learning for Stereo Depth Estimation

In this work, we tackle the problem of online adaptation for stereo dept...

A Meta Reinforcement Learning-based Approach for Self-Adaptive System

A self-learning adaptive system (SLAS) uses machine learning to enable a...

Efficient Deep Learning of Robust, Adaptive Policies using Tube MPC-Guided Data Augmentation

The deployment of agile autonomous systems in challenging, unstructured ...
