Generalized Neural Policies for Relational MDPs

02/18/2020
by Sankalp Garg, et al.

A Relational Markov Decision Process (RMDP) is a first-order representation that expresses all instances of a single probabilistic planning domain with a possibly unbounded number of objects. Early work on RMDPs outputs generalized (instance-independent) first-order policies or value functions as a means of solving all instances of a domain at once. Unfortunately, this line of work met with limited success because of inherent limitations of the representation space used in such policies or value functions. Can neural models provide the missing link by easily representing more complex generalized policies, thus making them effective on all instances of a given domain? We present the first neural approach for solving RMDPs expressed in the probabilistic planning language RDDL. Our solution first converts an RDDL instance into a ground DBN and then extracts a graph structure from the DBN. We train a relational neural model that computes an embedding for each node in the graph and scores each ground action as a function of the first-order action variable and the embeddings of the objects to which the action is applied. In essence, this represents a neural generalized policy for the whole domain. Given a new test problem from the same domain, we can compute all node embeddings using the trained parameters and score each ground action to choose the best one in a single forward pass, without any retraining. Our experiments on nine RDDL domains from IPPC demonstrate that neural generalized policies are significantly better than random and sometimes even more effective than training a state-of-the-art deep reactive policy from scratch.
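The size-independence idea in the abstract (shared parameters produce node embeddings, and each ground action is scored from its lifted action type and the embedding of the object it acts on) can be illustrated with a minimal sketch. This is not the authors' architecture: the mean-aggregation message passing, the embedding size `DIM`, the scalar node features, the single-object actions, and every function name below are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import math
import random

DIM = 4  # embedding size (assumed for this sketch)


def make_params(num_action_types, seed=0):
    """Create random, untrained weights whose shapes do not depend on instance size."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "w_in": mat(DIM, 1),      # lifts a scalar state feature to DIM dimensions
        "w_self": mat(DIM, DIM),  # transforms a node's own embedding
        "w_nbr": mat(DIM, DIM),   # transforms the mean of neighbor embeddings
        "w_score": mat(num_action_types, DIM),  # one scoring vector per lifted action
    }


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def embed_nodes(params, features, edges, rounds=2):
    """features: {node: float}; edges: list of undirected (u, v) pairs."""
    nbrs = {n: [] for n in features}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    h = {n: [math.tanh(x) for x in matvec(params["w_in"], [f])]
         for n, f in features.items()}
    for _ in range(rounds):
        new_h = {}
        for n in h:
            if nbrs[n]:
                mean = [sum(h[m][i] for m in nbrs[n]) / len(nbrs[n]) for i in range(DIM)]
            else:
                mean = [0.0] * DIM
            own = matvec(params["w_self"], h[n])
            agg = matvec(params["w_nbr"], mean)
            new_h[n] = [math.tanh(a + b) for a, b in zip(own, agg)]
        h = new_h
    return h


def score_actions(params, h, ground_actions):
    """ground_actions: list of (action_type_index, node) pairs."""
    return {(a, n): sum(w * x for w, x in zip(params["w_score"][a], h[n]))
            for a, n in ground_actions}


def best_action(params, features, edges, ground_actions):
    """One forward pass: embed every node, score every ground action, take the argmax."""
    h = embed_nodes(params, features, edges)
    scores = score_actions(params, h, ground_actions)
    return max(scores, key=scores.get)


# The same parameters are applied to two toy instances of different sizes.
params = make_params(num_action_types=2)

small_features = {"o1": 1.0, "o2": 0.0, "o3": 1.0}
small_edges = [("o1", "o2"), ("o2", "o3")]
small_actions = [(a, n) for a in range(2) for n in small_features]

large_features = {"o1": 0.0, "o2": 1.0, "o3": 1.0, "o4": 0.0, "o5": 1.0}
large_edges = [("o1", "o2"), ("o2", "o3"), ("o3", "o4"), ("o4", "o5"), ("o5", "o1")]
large_actions = [(a, n) for a in range(2) for n in large_features]

small_choice = best_action(params, small_features, small_edges, small_actions)
large_choice = best_action(params, large_features, large_edges, large_actions)
print(small_choice, large_choice)
```

Because none of the weight shapes depend on the number of nodes, the parameters built for the three-object instance apply unchanged to the five-object one, which is the property that lets a generalized policy be evaluated on a new instance without retraining.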


research
10/26/2018

Transfer of Deep Reactive Policies for MDP Planning

Domain-independent probabilistic planners input an MDP description in a ...
research
09/09/2011

Approximate Policy Iteration with a Policy Language Bias: Solving Relational Markov Decision Processes

We study an approach to policy selection for large relational Markov Dec...
research
02/08/2019

Size Independent Neural Transfer for RDDL Planning

Neural planners for RDDL MDPs produce deep reactive policies in an offli...
research
09/13/2017

Action Schema Networks: Generalised Policies with Deep Learning

In this paper, we introduce the Action Schema Network (ASNet): a neural ...
research
04/21/2022

PG3: Policy-Guided Planning for Generalized Policy Generation

A longstanding objective in classical planning is to synthesize policies...
research
01/30/2018

Features, Projections, and Representation Change for Generalized Planning

Generalized planning is concerned with the characterization and computat...
research
08/04/2019

ASNets: Deep Learning for Generalised Planning

In this paper, we discuss the learning of generalised policies for proba...
