Neuro-algorithmic Policies enable Fast Combinatorial Generalization

02/15/2021
by Marin Vlastelica, et al.

Although model-based and model-free approaches to learning control have achieved impressive results on standard benchmarks, generalization to task variations is still lacking. Recent results suggest that standard architectures generalize better only after training on very large amounts of data. We give evidence that generalization is in many cases bottlenecked by the inability to generalize on the combinatorial aspects of the problem. Furthermore, we show that for a certain subclass of the MDP framework, this bottleneck can be alleviated by neuro-algorithmic architectures. Many control problems require long-term planning that is hard to solve generically with neural networks alone. We introduce a neuro-algorithmic policy architecture consisting of a neural network and an embedded time-dependent shortest path solver. These policies can be trained end-to-end by blackbox differentiation. We show that this type of architecture generalizes well to unseen variations in the environment after seeing only a few examples.
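To make the two ingredients named in the abstract concrete, here is a minimal sketch in plain Python, not the authors' implementation. It assumes a small grid world where the agent may wait or step to a 4-neighbour, a dynamic-programming solver for the time-dependent shortest path, and a Hamming loss against an expert path; none of these details are given in the abstract. The names `solve_tdsp` and `blackbox_gradient` and the parameter `lam` are hypothetical. The backward pass follows the general scheme from "Differentiation of Blackbox Combinatorial Solvers": perturb the costs along the loss gradient, call the solver a second time, and return the scaled difference of the two solutions. In a full policy, the cost tensor would be predicted by a neural network rather than filled in by hand.

```python
INF = float("inf")
MOVES = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # wait, or step to a 4-neighbour


def solve_tdsp(costs, start, goal):
    """Time-dependent shortest path by dynamic programming over time steps.

    costs[t][r][c] is the cost of occupying cell (r, c) at time t.
    Returns a 0/1 indicator y with the same shape as `costs` that marks
    the cheapest path from `start` at t=0 to `goal` at any arrival time.
    """
    T, H, W = len(costs), len(costs[0]), len(costs[0][0])
    dist = [[[INF] * W for _ in range(H)] for _ in range(T)]
    prev = [[[None] * W for _ in range(H)] for _ in range(T)]
    dist[0][start[0]][start[1]] = costs[0][start[0]][start[1]]
    for t in range(1, T):
        for r in range(H):
            for c in range(W):
                for dr, dc in MOVES:
                    pr, pc = r - dr, c - dc  # predecessor cell at time t - 1
                    if not (0 <= pr < H and 0 <= pc < W):
                        continue
                    cand = dist[t - 1][pr][pc] + costs[t][r][c]
                    if cand < dist[t][r][c]:
                        dist[t][r][c] = cand
                        prev[t][r][c] = (pr, pc)
    # Pick the arrival time with the lowest total cost.
    t_best = min(range(T), key=lambda t: dist[t][goal[0]][goal[1]])
    if dist[t_best][goal[0]][goal[1]] == INF:
        raise ValueError("goal not reachable within the time horizon")
    # Walk back through the predecessors and mark the visited (time, cell) pairs.
    y = [[[0] * W for _ in range(H)] for _ in range(T)]
    cell = goal
    for t in range(t_best, -1, -1):
        y[t][cell[0]][cell[1]] = 1
        if t > 0:
            cell = prev[t][cell[0]][cell[1]]
    return y


def blackbox_gradient(costs, start, goal, grad_y, lam=5.0):
    """Gradient of the loss w.r.t. the costs, using a second solver call.

    grad_y is dL/dy at the solver output; lam is the interpolation
    hyperparameter (its value here is an arbitrary choice).
    """
    T, H, W = len(costs), len(costs[0]), len(costs[0][0])
    y = solve_tdsp(costs, start, goal)
    perturbed = [[[costs[t][r][c] + lam * grad_y[t][r][c] for c in range(W)]
                  for r in range(H)] for t in range(T)]
    y_lam = solve_tdsp(perturbed, start, goal)
    return [[[-(y[t][r][c] - y_lam[t][r][c]) / lam for c in range(W)]
             for r in range(H)] for t in range(T)]


# Usage: uniform predicted costs make the solver walk straight to the goal,
# while the (made-up) expert waits one step before moving.
T, H, W = 5, 2, 3
predicted = [[[1.0] * W for _ in range(H)] for _ in range(T)]
expert = [[[0] * W for _ in range(H)] for _ in range(T)]
for t, (r, c) in enumerate([(0, 0), (0, 0), (0, 1), (0, 2)]):
    expert[t][r][c] = 1
# Gradient of the Hamming loss sum(y * (1 - y*) + (1 - y) * y*) w.r.t. y.
grad_y = [[[1 - 2 * expert[t][r][c] for c in range(W)]
           for r in range(H)] for t in range(T)]
grad_costs = blackbox_gradient(predicted, (0, 0), (0, 2), grad_y)
```

In this sketch, a gradient-descent step on the costs raises the cost of cells the solver visited but the expert did not, and lowers the cost of cells the expert visited but the solver missed. Because the costs are indexed by time, negative perturbed costs cause no trouble for the dynamic program: the time-expanded graph has no cycles.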


