Model-Based Reinforcement Learning via Meta-Policy Optimization

09/14/2018
by Ignasi Clavera, et al.
Model-based reinforcement learning approaches carry the promise of being data efficient. However, because it is difficult to learn dynamics models that match the real-world dynamics closely enough, they struggle to achieve the same asymptotic performance as model-free methods. We propose Model-Based Meta-Policy-Optimization (MB-MPO), an approach that forgoes the strong reliance on accurate learned dynamics models. Using an ensemble of learned dynamics models, MB-MPO meta-learns a policy that can quickly adapt to any model in the ensemble with one policy gradient step. This steers the meta-policy towards internalizing the dynamics predictions on which the ensemble agrees, while shifting the burden of behaving optimally with respect to the model discrepancies onto the adaptation step. Our experiments show that MB-MPO is more robust to model imperfections than previous model-based approaches. Finally, we demonstrate that our approach matches the asymptotic performance of model-free methods while requiring significantly less experience.
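The meta-learning idea described above (adapt to each ensemble member with one gradient step, then update the shared policy so that the adapted policies perform well) can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the ensemble is three hypothetical 1-D linear models s' = a*s + b*u, the policy is a single gain theta with u = -theta*s, the return is the one-step reward -(s')^2 from s = 1, and exact closed-form gradients stand in for the policy gradient estimates that would be computed from trajectories imagined under learned models.

```python
# Toy sketch of one-step-adaptation meta-learning over a model ensemble.
# All values and names below are hypothetical and chosen for illustration only.

# Each "learned model" is a pair (a, b) for the 1-D dynamics s' = a*s + b*u.
ENSEMBLE = [(0.9, 0.5), (1.0, 0.6), (1.1, 0.4)]
ALPHA = 0.5  # inner (adaptation) step size
BETA = 0.5   # outer (meta) step size


def model_return(theta, model):
    """One-step return -(s')^2 from s = 1 under the policy u = -theta * s."""
    a, b = model
    return -(a - b * theta) ** 2


def return_grad(theta, model):
    """Exact gradient of model_return with respect to theta."""
    a, b = model
    return 2.0 * b * (a - b * theta)


def adapt(theta, model):
    """Inner loop: one gradient ascent step on a single model's return."""
    return theta + ALPHA * return_grad(theta, model)


def meta_objective(theta):
    """Average post-adaptation return across the ensemble."""
    returns = [model_return(adapt(theta, m), m) for m in ENSEMBLE]
    return sum(returns) / len(returns)


def meta_grad(theta):
    """Gradient of meta_objective, differentiating through the inner step."""
    total = 0.0
    for a, b in ENSEMBLE:
        adapted = adapt(theta, (a, b))
        # Chain rule: d(adapted)/d(theta) = 1 - 2 * ALPHA * b^2 for this return.
        d_adapted = 1.0 - 2.0 * ALPHA * b * b
        total += return_grad(adapted, (a, b)) * d_adapted
    return total / len(ENSEMBLE)


def train(theta=0.0, steps=200):
    """Outer loop: gradient ascent on the post-adaptation objective."""
    for _ in range(steps):
        theta += BETA * meta_grad(theta)
    return theta


theta_meta = train()
```

In this toy setting the meta-trained gain `theta_meta` is not optimal for any single model; it is a starting point from which `adapt(theta_meta, model)` reaches a good gain for each ensemble member in one step, which mirrors the division of labor between meta-policy and adaptation step described in the abstract.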
