MBVI: Model-Based Value Initialization for Reinforcement Learning

11/04/2020
by   Xubo Lyu, et al.

Model-free reinforcement learning (RL) can learn control policies for high-dimensional, complex robotic tasks, but tends to be data-inefficient. Model-based RL and optimal control can be much more data-efficient when an accurate model of the system and environment is known, but they are difficult to scale to expressive models for high-dimensional problems. In this paper, we propose a novel approach that alleviates the data inefficiency of model-free RL by warm-starting the learning process with model-based solutions. We initialize a high-dimensional value function via supervision from a low-dimensional value function, which is obtained by applying model-based techniques to a low-dimensional problem featuring an approximate system model. Our approach therefore exploits the model priors from a simplified problem space implicitly and avoids the direct use of high-dimensional, expressive models. We demonstrate our approach on two representative robotic learning tasks, where we observe significant improvements in performance and efficiency, and analyze our method empirically on a third task.
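The warm-start idea in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: the 1-D chain problem, the projection from full state to position, the polynomial features, and all function names are assumptions chosen for illustration. It solves a low-dimensional problem with value iteration (a model-based technique), then fits a value function over a higher-dimensional state by least-squares regression onto the low-dimensional values.

```python
import numpy as np

def low_dim_value_iteration(n_cells=21, gamma=0.95, tol=1e-8):
    """Value iteration on a hypothetical 1-D chain with a known model.

    Reward is -1 per step; the center cell is an absorbing goal.
    """
    goal = n_cells // 2
    V = np.zeros(n_cells)
    while True:
        V_new = np.empty(n_cells)
        for s in range(n_cells):
            if s == goal:
                V_new[s] = 0.0
                continue
            backups = []
            for a in (-1, 0, 1):  # move left, stay, move right
                s_next = min(max(s + a, 0), n_cells - 1)
                backups.append(-1.0 + gamma * V[s_next])
            V_new[s] = max(backups)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

def project(full_states, n_cells):
    """Assumed projection: keep the position in [-1, 1] and bin it to a cell."""
    x = full_states[:, 0]
    idx = np.round((x + 1.0) / 2.0 * (n_cells - 1)).astype(int)
    return np.clip(idx, 0, n_cells - 1)

def features(full_states):
    """Illustrative hand-picked features of the full state (position, velocity)."""
    x, v = full_states[:, 0], full_states[:, 1]
    return np.stack([np.ones_like(x), np.abs(x), x**2, v, v**2], axis=1)

def fit_value_init(full_states, targets):
    """Supervised initialization: least-squares fit to low-dimensional values."""
    w, *_ = np.linalg.lstsq(features(full_states), targets, rcond=None)
    return w

def predict_value(full_states, w):
    return features(full_states) @ w

# Usage: sample full states (position, velocity, one extra coordinate),
# label them with the low-dimensional value, and fit the initial value function.
n_cells = 21
V_low = low_dim_value_iteration(n_cells)
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(500, 3))
w_init = fit_value_init(states, V_low[project(states, n_cells)])
```

In a full pipeline, `w_init` would only serve as the starting point for the critic of a model-free RL algorithm, which then continues learning on the true high-dimensional task. A neural network would typically replace the linear features used here.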


03/23/2019

TTR-Based Rewards for Reinforcement Learning with Implicit Model Priors

Model-free reinforcement learning (RL) provides an attractive approach f...
12/10/2020

Blending MPC Value Function Approximation for Efficient Reinforcement Learning

Model-Predictive Control (MPC) is a powerful tool for controlling comple...
01/03/2019

Self-supervised Learning of Image Embedding for Continuous Control

Operating directly from raw high dimensional sensory inputs like images ...
12/02/2021

Residual Pathway Priors for Soft Equivariance Constraints

There is often a trade-off between building deep learning systems that a...
02/14/2020

Frequency-based Search-control in Dyna

Model-based reinforcement learning has been empirically demonstrated as ...
01/21/2022

Tensor and Matrix Low-Rank Value-Function Approximation in Reinforcement Learning

Value-function (VF) approximation is a central problem in Reinforcement ...
02/21/2017

Towards a Common Implementation of Reinforcement Learning for Multiple Robotic Tasks

Mobile robots are increasingly being employed for performing complex tas...