On the role of planning in model-based deep reinforcement learning

by Jessica B. Hamrick, et al.

Model-based planning is often thought to be necessary for deep, careful reasoning and generalization in artificial agents. While recent successes of model-based reinforcement learning (MBRL) with deep function approximation have strengthened this hypothesis, the resulting diversity of model-based methods has also made it difficult to track which components drive success and why. In this paper, we seek to disentangle the contributions of recent methods by focusing on three questions: (1) How does planning benefit MBRL agents? (2) Within planning, what choices drive performance? (3) To what extent does planning improve generalization? To answer these questions, we study the performance of MuZero (Schrittwieser et al., 2019), a state-of-the-art MBRL algorithm, under a number of interventions and ablations and across a wide range of environments including control tasks, Atari, and 9x9 Go. Our results suggest the following: (1) The primary benefit of planning is in driving policy learning. (2) Using shallow trees with simple Monte-Carlo rollouts is as performant as more complex methods, except in the most difficult reasoning tasks. (3) Planning alone is insufficient to drive strong generalization. These results indicate where and how to utilize planning in reinforcement learning settings, and highlight a number of open questions for future MBRL research.
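To make the phrase "shallow trees with simple Monte-Carlo rollouts" concrete, here is a minimal sketch of a depth-limited planner that scores each root action by averaging the returns of random rollouts through a model. This is not MuZero and not the authors' implementation: MuZero plans with a learned model and Monte-Carlo tree search, whereas this toy uses a hand-written deterministic model. The names `rollout_return`, `plan`, and `chain_model`, and the parameters `depth`, `n_rollouts`, and `gamma`, are all hypothetical choices made for illustration.

```python
import random


def rollout_return(model, state, first_action, actions, depth, rng, gamma=0.99):
    """Discounted return of one rollout: take first_action, then act uniformly at random."""
    total, discount, action = 0.0, 1.0, first_action
    for _ in range(depth):
        state, reward = model(state, action)
        total += discount * reward
        discount *= gamma
        action = rng.choice(actions)  # random rollout policy (an assumption of this sketch)
    return total


def plan(model, state, actions, depth=5, n_rollouts=32, seed=0):
    """Score each root action by its mean rollout return; return the best action and all scores."""
    rng = random.Random(seed)
    values = {}
    for a in actions:
        returns = [
            rollout_return(model, state, a, actions, depth, rng)
            for _ in range(n_rollouts)
        ]
        values[a] = sum(returns) / n_rollouts
    best = max(values, key=values.get)
    return best, values


def chain_model(state, action):
    """Hypothetical toy model: a walk on the integers, reward 1.0 on arriving at a position >= 3."""
    next_state = state + action
    return next_state, (1.0 if next_state >= 3 else 0.0)


if __name__ == "__main__":
    best, values = plan(chain_model, state=2, actions=[-1, 1], depth=1, n_rollouts=4)
    print(best, values)
```

With `depth=1` the planner only looks one step ahead, so from state 2 it prefers the action `1`, which reaches the rewarding position immediately. Increasing `depth` or `n_rollouts` corresponds, loosely, to the kind of planning-budget choices the paper's second question examines.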




A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning

We present an end-to-end, model-based deep reinforcement learning agent ...

Procedural Generalization by Planning with Self-Supervised World Models

One of the key promises of model-based reinforcement learning is the abi...

Assessing Policy, Loss and Planning Combinations in Reinforcement Learning using a New Modular Architecture

The model-based reinforcement learning paradigm, which uses planning alg...

Model-Based Deep Reinforcement Learning for High-Dimensional Problems, a Survey

Deep reinforcement learning has shown remarkable success in the past few...

Understanding Decision-Time vs. Background Planning in Model-Based Reinforcement Learning

In model-based reinforcement learning, an agent can leverage a learned m...

Affordance-based Reinforcement Learning for Urban Driving

Traditional autonomous vehicle pipelines that follow a modular approach ...

Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents

Psychlab is a simulated psychology laboratory inside the first-person 3D...