QMDP-Net: Deep Learning for Planning under Partial Observability

03/20/2017
by Peter Karkus, et al.

This paper introduces the QMDP-net, a neural network architecture for planning under partial observability. The QMDP-net combines the strengths of model-free learning and model-based planning. It is a recurrent policy network, but it represents a policy for a parameterized set of tasks by connecting a model with a planning algorithm that solves the model, thus embedding the solution structure of planning in a network learning architecture. The QMDP-net is fully differentiable and allows for end-to-end training. We train a QMDP-net on different tasks so that it can generalize to new ones in the parameterized task set and "transfer" to other similar tasks beyond the set. In preliminary experiments, QMDP-net showed strong performance on several robotic tasks in simulation. Interestingly, while QMDP-net encodes the QMDP algorithm, it sometimes outperforms the QMDP algorithm in the experiments, as a result of end-to-end learning.
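For readers unfamiliar with the QMDP algorithm that the network encodes, the following is a minimal sketch of the classical, non-learned version: value iteration on the underlying fully observable MDP, a Bayesian belief update, and action selection by weighting the Q-values with the current belief. It is not the QMDP-net architecture itself, and the two-state toy problem, the array layouts (`T[a, s, s']`, `R[s, a]`, `Z[a, s', o]`), and all function names are assumptions made for illustration only.

```python
import numpy as np

def qmdp_q_values(T, R, gamma=0.95, iters=200):
    """Value iteration on the underlying MDP.

    T[a, s, s'] holds transition probabilities, R[s, a] holds rewards.
    Returns Q[s, a].
    """
    n_actions, n_states, _ = T.shape
    Q = np.zeros((n_states, n_actions))
    for _ in range(iters):
        V = Q.max(axis=1)                              # V[s] = max_a Q[s, a]
        Q = R + gamma * np.einsum("ast,t->sa", T, V)   # Bellman backup
    return Q

def belief_update(b, a, o, T, Z):
    """Bayes filter step. Z[a, s', o] holds observation probabilities."""
    predicted = b @ T[a]                # predict: sum_s b[s] * T[a, s, s']
    unnorm = predicted * Z[a][:, o]     # correct: weight by observation likelihood
    return unnorm / unnorm.sum()

def qmdp_action(b, Q):
    """QMDP rule: pick argmax_a sum_s b[s] * Q[s, a]."""
    return int(np.argmax(b @ Q))

# Hypothetical toy problem: 2 states, 2 actions (0 = stay, 1 = switch).
T = np.array([[[1.0, 0.0], [0.0, 1.0]],    # stay keeps the state
              [[0.0, 1.0], [1.0, 0.0]]])   # switch swaps the state
R = np.array([[1.0, 0.0],                  # reward only for staying in state 0
              [0.0, 0.0]])
Z = np.array([[[0.8, 0.2], [0.2, 0.8]]] * 2)  # observation matches the state 80% of the time

Q = qmdp_q_values(T, R)
b = np.array([0.5, 0.5])
b_next = belief_update(b, a=0, o=0, T=T, Z=Z)
action = qmdp_action(b_next, Q)
```

In the QMDP-net described above, the model quantities (here the fixed arrays `T`, `R`, and `Z`) are instead represented by learned network components and trained end-to-end, which is one reason the learned network can sometimes outperform the hand-specified algorithm.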

research
06/29/2017

Path Integral Networks: End-to-End Differentiable Optimal Control

In this paper, we introduce Path Integral Networks (PI-Net), a recurrent...
research
09/23/2019

Inducing Hypernym Relationships Based On Order Theory

This paper introduces Strict Partial Order Networks (SPON), a novel neur...
research
10/16/2017

Intention-Net: Integrating Planning and Deep Learning for Goal-Directed Autonomous Navigation

How can a delivery robot navigate reliably to a destination in a new off...
research
05/28/2019

Differentiable Algorithm Networks for Composable Robot Learning

This paper introduces the Differentiable Algorithm Network (DAN), a comp...
research
09/09/2015

Continuous control with deep reinforcement learning

We adapt the ideas underlying the success of Deep Q-Learning to the cont...
research
05/23/2018

Particle Filter Networks with Application to Visual Localization

Particle filtering is a powerful method for sequential state estimation ...
research
12/28/2016

The Predictron: End-To-End Learning and Planning

One of the key challenges of artificial intelligence is to learn models ...
