Revisiting Model-based Value Expansion

03/28/2022
by   Daniel Palenicek, et al.
0

Model-based value expansion methods promise to improve the quality of value function targets and, thereby, the effectiveness of value function learning. However, to date, these methods are being outperformed by Dyna-style algorithms with conceptually simpler 1-step value function targets. This shows that in practice, the theoretical justification of value expansion does not seem to hold. We provide a thorough empirical study to shed light on the causes of failure of value expansion methods in practice which is believed to be the compounding model error. By leveraging GPU based physics simulators, we are able to efficiently use the true dynamics for analysis inside the model-based reinforcement learning loop. Performing extensive comparisons between true and learned dynamics sheds light into this black box. This paper provides a better understanding of the actual problems in value expansion. We provide future directions of research by empirically testing the maximum theoretical performance of current approaches.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/07/2023

Diminishing Return of Value Expansion Methods in Model-Based Reinforcement Learning

Model-based reinforcement learning is one approach to increase sample ef...
research
02/02/2023

Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function

Probabilistic dynamics model ensemble is widely used in existing model-b...
research
05/16/2020

Model-Augmented Actor-Critic: Backpropagating through Paths

Current model-based reinforcement learning approaches use the model simp...
research
07/30/2021

Maximum Entropy Dueling Network Architecture

In recent years, there have been many deep structures for Reinforcement ...
research
04/19/2018

Lipschitz Continuity in Model-based Reinforcement Learning

Model-based reinforcement-learning methods learn transition and reward m...
research
09/21/2020

Dynamic Horizon Value Estimation for Model-based Reinforcement Learning

Existing model-based value expansion methods typically leverage a world ...
research
02/14/2020

Frequency-based Search-control in Dyna

Model-based reinforcement learning has been empirically demonstrated as ...

Please sign up or login with your details

Forgot password? Click here to reset