MPC-Inspired Neural Network Policies for Sequential Decision Making

02/15/2018
by Marcus Pereira et al.

In this paper we investigate the use of MPC-inspired neural network policies for sequential decision making. We introduce an extension to the DAgger algorithm for training such policies and show that it yields improved training performance and generalization. We take advantage of this extension to demonstrate scalable and efficient training of complex planning policy architectures in continuous state and action spaces. We provide an extensive comparison of neural network policies, considering feedforward policies, recurrent policies, and recurrent policies with planning structure inspired by the Path Integral control framework. Our results suggest that MPC-type recurrent policies are more robust to disturbances and modeling error.
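To make the training scheme concrete, the following is a minimal sketch of the DAgger idea the paper builds on: roll out the current learner, have an expert label the states the learner actually visits, aggregate the labels, and refit. The environment, the linear policy class, and the feedback-law "expert" below are stand-in assumptions for illustration; the paper's actual extension and its MPC/Path Integral expert are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def expert_action(state):
    # Stand-in expert: a simple linear feedback law (an assumption here,
    # not the paper's MPC expert).
    return -0.5 * state

def rollout(policy_w, steps=20):
    # Roll out the current linear policy a = w * s on toy 1-D integrator
    # dynamics with small process noise.
    states = []
    s = rng.normal()
    for _ in range(steps):
        states.append(s)
        a = policy_w * s
        s = s + 0.1 * a + 0.01 * rng.normal()
    return np.array(states)

# DAgger core loop: aggregate expert labels on the learner's own state
# visitation distribution, then refit the policy on the growing dataset.
X, Y = [], []
w = 0.0  # initial (untrained) policy weight
for _ in range(10):
    states = rollout(w)
    X.extend(states)
    Y.extend(expert_action(states))   # expert relabels learner states
    X_arr, Y_arr = np.array(X), np.array(Y)
    # Least-squares fit of the linear policy on the aggregated data.
    w = float(X_arr @ Y_arr / (X_arr @ X_arr))

print(w)  # converges to the expert's gain of -0.5
```

Because the learner is trained on its own induced state distribution rather than only on expert trajectories, this loop avoids the compounding-error problem of plain behavioral cloning, which is the property the paper's extension inherits.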


Related research

07/20/2020  Learning High-Level Policies for Model Predictive Control
The combination of policy search and deep neural networks holds the prom...

03/03/2020  MPC-guided Imitation Learning of Neural Network Policies for the Artificial Pancreas
Even though model predictive control (MPC) is currently the main algorit...

11/15/2020  Stein Variational Model Predictive Control
Decision making under uncertainty is critical to real-world, autonomous ...

05/23/2017  Thinking Fast and Slow with Deep Learning and Tree Search
Sequential decision making problems, such as structured prediction, robo...

02/15/2021  Neuro-algorithmic Policies enable Fast Combinatorial Generalization
Although model-based and model-free approaches to learning the control o...

07/03/2019  Co-training for Policy Learning
We study the problem of learning sequential decision-making policies in ...

06/10/2015  Data Generation as Sequential Decision Making
We connect a broad class of generative models through their shared relia...
