Value Iteration Networks

02/09/2016
by Aviv Tamar, et al.

We introduce the value iteration network (VIN): a fully differentiable neural network with a 'planning module' embedded within. VINs can learn to plan and are suitable for predicting outcomes that involve planning-based reasoning, such as policies for reinforcement learning. Key to our approach is a novel differentiable approximation of the value-iteration algorithm, which can be represented as a convolutional neural network and trained end-to-end using standard backpropagation. We evaluate VIN-based policies on discrete and continuous path-planning domains and on a natural-language-based search task. We show that by learning an explicit planning computation, VIN policies generalize better to new, unseen domains.
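The central idea above, value iteration expressed as convolution plus a channel-wise max, can be sketched in plain NumPy. This is a hedged illustration, not the paper's implementation: the function name `vi_module`, the 3x3 kernels, and the gridworld setup are assumptions made for the example; in a VIN the kernels are learned parameters trained by backpropagation rather than fixed by hand.

```python
import numpy as np

def vi_module(reward, kernels, gamma=0.9, iters=50):
    """Value iteration as repeated convolution + channel-wise max.

    reward  : (H, W) array, immediate reward at each grid cell.
    kernels : (A, 3, 3) array, one local transition kernel per action.
              In a VIN these weights would be learned; here they are fixed.
    """
    H, W = reward.shape
    V = np.zeros((H, W))
    for _ in range(iters):
        Vp = np.pad(V, 1)  # zero-pad so off-grid moves contribute 0
        Q = np.empty((len(kernels), H, W))
        for a, k in enumerate(kernels):
            # 'same'-size 2-D correlation of V with action a's kernel:
            # the local expectation over next-state values.
            conv = sum(k[i, j] * Vp[i:i + H, j:j + W]
                       for i in range(3) for j in range(3))
            Q[a] = reward + gamma * conv  # Bellman backup for action a
        V = Q.max(axis=0)  # max over the action "channels"
    return V
```

For example, with a single +1 reward at the top-left cell and four deterministic move kernels, the returned values decay geometrically with distance to the goal, which is exactly the planning signal a VIN's policy layer reads off.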

Related research

06/08/2017 · Generalized Value Iteration Networks: Life Beyond Lattices
In this paper, we introduce a generalized value iteration network (GVIN)...

05/28/2018 · Value Propagation Networks
We present Value Propagation (VProp), a parameter-efficient differentiab...

12/28/2016 · The Predictron: End-To-End Learning and Planning
One of the key challenges of artificial intelligence is to learn models ...

09/26/2020 · Graph neural induction of value iteration
Many reinforcement learning tasks can benefit from explicit planning bas...

09/17/2017 · Memory Augmented Control Networks
Planning problems in partially observable environments cannot be solved ...

06/08/2022 · Integrating Symmetry into Differentiable Planning
We study how group symmetry helps improve data efficiency and generaliza...

10/11/2021 · Neural Algorithmic Reasoners are Implicit Planners
Implicit planning has emerged as an elegant technique for combining lear...
