Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models

02/16/2021
by Qi Wang, et al.

Reinforcement learning is a promising paradigm for solving sequential decision-making problems, but low data efficiency and weak generalization across tasks are bottlenecks in real-world applications. Model-based meta reinforcement learning addresses these issues by learning dynamics and leveraging knowledge from prior experience. In this paper, we take a closer look at this framework and propose a new Thompson-sampling-based approach that combines a new model for identifying task dynamics with an amortized policy optimization step. We show that our model, called a graph structured surrogate model (GSSM), outperforms state-of-the-art methods in predicting environment dynamics. Additionally, our approach obtains high returns while allowing fast execution during deployment by avoiding test-time policy gradient optimization.
