Dynamic Planning Networks

12/28/2018
by Norman Tasfi, et al.

We introduce Dynamic Planning Networks (DPN), a novel architecture for deep reinforcement learning that combines model-based and model-free aspects for online planning. Our architecture learns to dynamically construct plans using a learned state-transition model by selecting and traversing between simulated states and actions to maximize valuable information before acting. In contrast to model-free methods, model-based planning lets the agent efficiently test action hypotheses without performing costly trial-and-error in the environment. DPN learns to efficiently form plans by expanding a single action-conditional state transition at a time instead of exhaustively evaluating each action, reducing the required number of state transitions during planning by up to 96%. We observe various emergent planning patterns used to solve environments, including classical search methods such as breadth-first and depth-first search. Learning To Plan shows improved data efficiency, performance, and generalization to new and unseen domains in comparison to several baselines.
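To give a rough sense of why expanding one action-conditional transition per planning step can need far fewer model calls than exhaustive expansion, the following is a minimal counting sketch. It is not the authors' implementation or their measurement procedure: the function names, the uniform branching assumption, and the example numbers are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: compares how many simulated state transitions
# two planning styles would request from a learned transition model.
# All names and parameters here are hypothetical, not taken from the paper.


def exhaustive_transitions(num_actions: int, depth: int) -> int:
    """Transitions used when every action is expanded at every node.

    Assumes a uniform tree: level k of the tree holds num_actions**k
    newly simulated states, each costing one model call.
    """
    return sum(num_actions ** level for level in range(1, depth + 1))


def single_expansion_transitions(planning_steps: int) -> int:
    """Transitions used when each planning step expands exactly one
    action-conditional transition, so cost grows linearly with steps."""
    return planning_steps


def relative_reduction(num_actions: int, depth: int, planning_steps: int) -> float:
    """Fraction of transitions saved relative to exhaustive expansion."""
    full = exhaustive_transitions(num_actions, depth)
    return 1.0 - single_expansion_transitions(planning_steps) / full


if __name__ == "__main__":
    # Hypothetical setting: 3 actions, exhaustive depth 2, 4 planning steps.
    print(exhaustive_transitions(3, 2))
    print(single_expansion_transitions(4))
    print(relative_reduction(3, 2, 4))
```

The sketch only captures the counting argument: exhaustive expansion grows geometrically with depth, while a one-transition-per-step planner grows linearly. How the agent chooses which state and action to expand is the learned part of DPN and is not modeled here.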


research
07/19/2017

Imagination-Augmented Agents for Deep Reinforcement Learning

We introduce Imagination-Augmented Agents (I2As), a novel architecture f...
research
07/11/2017

Value Prediction Network

This paper proposes a novel deep reinforcement learning (RL) architectur...
research
02/22/2021

Action Redundancy in Reinforcement Learning

Maximum Entropy (MaxEnt) reinforcement learning is a powerful learning p...
research
01/24/2022

Generative Planning for Temporally Coordinated Exploration in Reinforcement Learning

Standard model-free reinforcement learning algorithms optimize a policy ...
research
12/05/2019

Combining Q-Learning and Search with Amortized Value Estimates

We introduce "Search with Amortized Value Estimates" (SAVE), an approach...
research
07/09/2021

Safe Learning of Lifted Action Models

Creating a domain model, even for classical, domain-independent planning...
research
10/03/2020

Episodic Memory for Learning Subjective-Timescale Models

In model-based learning, an agent's model is commonly defined over trans...
