Model-Based Planning in Discrete Action Spaces

05/19/2017
by Mikael Henaff, et al.

Planning actions using learned and differentiable forward models of the world is a general approach with a number of desirable properties, including improved sample complexity over model-free RL methods, reuse of learned models across different tasks, and the ability to perform efficient gradient-based optimization in continuous action spaces. However, this approach does not apply straightforwardly when the action space is discrete, which may have limited its adoption. In this work, we introduce two discrete planning tasks inspired by existing question-answering datasets and show that it is in fact possible to plan effectively via backprop in discrete action spaces using two simple yet principled modifications. Our experiments show that this approach can significantly outperform model-free RL methods and supervised imitation learners.
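To make "planning via backprop in discrete action spaces" concrete, here is a minimal sketch of the general idea. The abstract does not spell out the two modifications, so this example makes its own assumptions: each discrete action is relaxed to a point on the probability simplex via a softmax over logits, the forward model is a hand-written linear grid world rather than a learned network, and the gradient is derived by hand. All names (`ACTIONS`, `MOVES`, `rollout`, `plan`) are hypothetical and not taken from the paper.

```python
import math

# Hypothetical toy world: a 2-D grid with four discrete actions.
ACTIONS = ["up", "down", "left", "right"]
MOVES = [(0.0, 1.0), (0.0, -1.0), (-1.0, 0.0), (1.0, 0.0)]


def softmax(logits):
    """Map unconstrained logits to a point on the probability simplex."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rollout(start, probs_per_step):
    """Stand-in for a differentiable forward model: apply the expected move per step."""
    x, y = start
    for probs in probs_per_step:
        x += sum(p * dx for p, (dx, _) in zip(probs, MOVES))
        y += sum(p * dy for p, (_, dy) in zip(probs, MOVES))
    return x, y


def plan(start, goal, horizon, steps=200, lr=0.1):
    """Gradient descent on per-step action logits, then discretize with argmax."""
    logits = [[0.0] * len(ACTIONS) for _ in range(horizon)]
    for _ in range(steps):
        probs_per_step = [softmax(z) for z in logits]
        x, y = rollout(start, probs_per_step)
        # Loss is the squared distance to the goal; gradient w.r.t. the final state.
        gx, gy = 2.0 * (x - goal[0]), 2.0 * (y - goal[1])
        # Gradient w.r.t. each relaxed action component (the model is linear in the action).
        g_a = [gx * dx + gy * dy for dx, dy in MOVES]
        for z, probs in zip(logits, probs_per_step):
            # Backpropagate through the softmax: dL/dz_k = p_k * (g_k - sum_j p_j * g_j).
            avg = sum(p * g for p, g in zip(probs, g_a))
            for k in range(len(z)):
                z[k] -= lr * probs[k] * (g_a[k] - avg)
    # Read out a discrete plan by taking the most likely action at each step.
    return [ACTIONS[max(range(len(z)), key=z.__getitem__)] for z in logits]
```

Calling `plan((0.0, 0.0), (3.0, 0.0), horizon=3)` optimizes three sets of logits and reads out one action per step. In this toy setting the goal can only be reached by repeating a single action, so the relaxed optimum coincides with a one-hot plan. For goals that need a mix of actions, the relaxed solution may sit far from any one-hot vector and the argmax readout can fail, which is one way to see why the continuous-action recipe does not carry over directly.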


research
05/01/2019

Efficient Model-free Reinforcement Learning in Metric Spaces

Model-free Reinforcement Learning (RL) algorithms such as Q-learning [Wa...
research
02/08/2022

GrASP: Gradient-Based Affordance Selection for Planning

Planning with a learned model is arguably a key component of intelligenc...
research
12/30/2019

World Programs for Model-Based Learning and Planning in Compositional State and Action Spaces

Some of the most important tasks take place in environments which lack c...
research
08/22/2022

Efficient Planning in a Compact Latent Action Space

While planning-based sequence modelling methods have shown great potenti...
research
11/21/2017

Deterministic Policy Optimization by Combining Pathwise and Score Function Estimators for Discrete Action Spaces

Policy optimization methods have shown great promise in solving complex ...
research
04/02/2018

Universal Planning Networks

A key challenge in complex visuomotor control is learning abstract repre...
research
02/03/2023

DiSProD: Differentiable Symbolic Propagation of Distributions for Planning

The paper introduces DiSProD, an online planner developed for environmen...
