Guiding the search in continuous state-action spaces by learning an action sampling distribution from off-target samples

by Beomjoon Kim, et al.

In robotics, it is essential to be able to plan efficiently in high-dimensional continuous state-action spaces over long horizons. For such complex planning problems, unguided uniform sampling of actions until a path to a goal is found is hopelessly inefficient, and gradient-based approaches often fall short when the optimization manifold of a given problem is not smooth. In this paper we present an approach that guides the search of a state-space planner, such as A*, by learning an action-sampling distribution that can generalize across different instances of a planning problem. The motivation is that, unlike typical learning approaches to planning in continuous action spaces, which estimate a policy, an estimated action sampler is more robust to error because it has a planner to fall back on. We use a Generative Adversarial Network (GAN) and address an important issue: search experience consists of a relatively large number of actions that are not on a solution path and a relatively small number of actions that are. We introduce a new technique, based on an importance-ratio estimation method, for using samples from a non-target distribution to make GAN learning more data-efficient. We provide theoretical guarantees and an empirical evaluation on three challenging continuous robot planning problems to illustrate the effectiveness of our algorithm.
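To make the idea of reusing off-target samples concrete, the following is a minimal sketch of importance-ratio estimation, not the estimator used in the paper. It assumes a generic classification-based density-ratio estimate on a hypothetical one-dimensional toy problem: a few "on-path" actions drawn from one Gaussian and many "off-path" actions drawn from another. All names, sample sizes, and distributions here are illustrative assumptions.

```python
import numpy as np

def features(x):
    # Quadratic features: enough to represent the log-ratio of two 1-D Gaussians.
    return np.stack([np.ones_like(x), x, x ** 2], axis=1)

def fit_classifier(x, y, iters=25, ridge=1e-3):
    """Logistic regression (on-path = 1, off-path = 0) fitted with Newton steps."""
    phi = features(x)
    w = np.zeros(phi.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-np.clip(phi @ w, -30.0, 30.0)))
        grad = phi.T @ (p - y) + ridge * w
        hess = phi.T @ (phi * (p * (1.0 - p))[:, None]) + ridge * np.eye(len(w))
        w -= np.linalg.solve(hess, grad)
    return w

def importance_ratio(x, w, n_on, n_off):
    """Estimate p_on(x) / p_off(x) from the classifier's odds.

    By Bayes' rule, p_on(x) / p_off(x) = [P(1 | x) / P(0 | x)] * (n_off / n_on),
    and for a logistic model the odds are exp(logit).
    """
    return np.exp(np.clip(features(x) @ w, -30.0, 30.0)) * (n_off / n_on)

# Hypothetical toy data: few on-path actions, many off-path actions.
rng = np.random.default_rng(0)
on_path = rng.normal(1.0, 0.5, size=500)
off_path = rng.normal(0.0, 1.0, size=2000)

x = np.concatenate([on_path, off_path])
y = np.concatenate([np.ones(len(on_path)), np.zeros(len(off_path))])
w = fit_classifier(x, y)

# Reweight the plentiful off-path samples so they mimic the on-path distribution.
ratios = importance_ratio(off_path, w, len(on_path), len(off_path))
weighted_mean = np.sum(ratios * off_path) / np.sum(ratios)
print("on-path mean estimated from off-path samples:", weighted_mean)
```

In this toy setting the reweighted off-path samples should recover a mean close to the on-path mean of 1.0. In a GAN-style learner, weights of this kind could in principle scale the contribution of each off-target sample to the training loss, though how the paper does so is not specified in the abstract.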


