
Robotic Motion Planning using Learned Critical Sources and Local Sampling

Sampling based methods are widely used for robotic motion planning. Traditionally, these samples are drawn from probabilistic (or deterministic) distributions to cover the state space uniformly. Despite being probabilistically complete, these methods fail to find a feasible path in a reasonable amount of time in constrained environments where it is essential to go through narrow passages (bottleneck regions). Current state of the art techniques train a learning model (learner) to predict samples selectively in these bottleneck regions. However, these algorithms depend completely on the samples generated by this learner to navigate through the bottleneck regions. As the complexity of the planning problem increases, the amount of data and time required to make this learner robust to fine variations in the structure of the workspace becomes computationally intractable. In this work, we present (1) an efficient and robust method to use a learner to locate the bottleneck regions and (2) two algorithms that use local sampling methods to leverage the location of these bottleneck regions for efficient motion planning while maintaining probabilistic completeness. We test our algorithms on 2-dimensional planning problems and 7-dimensional robotic arm planning, and report significant gains over heuristic as well as learned baselines.



I Introduction

We examine the problem of using prior experience for sampling based motion planning in robots. Sampling based motion planning (SBMP) algorithms construct a graph or roadmap as a discrete representation of the state space of a robot. The vertices of this roadmap represent robot configurations and edges represent potential movements of the robot. A graph search algorithm is then used to find the path between any two vertices in the roadmap. Rapidly Exploring Random Tree (RRT)[17] is a tree growing variant of SBMP that creates this roadmap (tree) implicitly while planning. SBMP is state of the art in high dimensional spaces.

A defining feature of SBMP is its reliance on the sampler. Traditionally, these samples are generated either probabilistically or deterministically [8] to uniformly cover the state space. Such a sampling approach allows arbitrarily accurate representations (in the limit of the number of samples approaching infinity), and thus allows theoretical guarantees on completeness. However, in environments where paths pass through narrow passages, these algorithms become computationally intractable, because a huge number of samples must be generated to cover these narrow passages.

Fig. 1: A robotic arm planning problem containing a narrow passage.

Thus, the main challenge in sampling based algorithms is to place a small set of samples (critical samples) on certain key locations (bottleneck regions) to enable the algorithm to find a high quality path with low computation effort.

Current state of the art approaches[1, 2] use learning algorithms (called learners hereon) to predict these critical samples. These samples are then connected to the nodes of a uniform sparse graph to construct a roadmap. Depending on the structure of the bottleneck regions (say in an extended narrow passage or semicircular tube like structure), the learner is required to place a small set of critical samples in accordance with the local structure at each bottleneck. The graph created by connecting these critical samples among themselves acts as a bridge for connecting the disjoint sub-graphs (connected components) present in the sparse graph. Thus, the learner needs to not only identify the bottleneck regions but also propose samples in accordance with their local structure.

This brings us to two major challenges faced by these approaches. Firstly, the learners used in these approaches are commonly fed an occupancy grid, among other parameters, as a representation of the workspace of a robot. There is a trade-off between the size of the occupancy grid and the amount of data and time the model needs to converge. In complex planning problems, the occupancy grid must be large, but this makes the convergence of the model computationally intractable. On the other hand, decreasing the size of the occupancy grid lowers its resolution, so the learner is not able to learn the internal representation of these bottleneck regions and a large number of the generated samples are rendered useless. Secondly, most learners are conditioned only on the planning problem and not on the samples they have already generated. Thus, the learner tends to repeatedly sample similar points inside a bottleneck region, leading to redundant samples.

Our key insight is to rely on the learner to identify the location of the bottleneck regions and then exploit the property of planners such as RRT to cover the local structure of these bottleneck regions. Therefore, getting a set of samples from the learner that includes a single sample in each bottleneck region is sufficient. As we expect the learner to generate a single sample per bottleneck region, we propose to convolve the occupancy grid using a kernel to create a smaller grid that contains information relevant for this generation. We use this preprocessed grid during training and for testing the learner.

We thus use the learner to get this initial set of critical samples (one per bottleneck region) and call them critical sources. We then generate our required set of critical samples from these critical sources using local sampling-based methods.

This solves the first challenge faced by current algorithms, as using a smaller grid as input to the learner makes it converge faster to generalizing well over planning problems that have similar global structure for the location of bottleneck regions but different local structures within these regions. This also solves the second challenge, as we condition the subsequent samples on our current set of critical samples. The job is therefore divided between the learner (generate the critical sources) and the local sampler (procure the required set of critical samples by using these critical sources to generate subsequent samples). We argue that this is similar to how we, as humans, given a planning problem, first globally identify the doorways relevant to our planning problem and then locally explore how to navigate through these doorways.

Due to the efficient space filling nature of RRTs, we propose the algorithm: Local Critical Source RRT (LCSRRT). This algorithm, in conjunction with a sparse graph, uses RRTs rooted at critical sources as the local sampling algorithm.

We propose another algorithm, Critical Source-RRT (CSRRT), which forfeits the sparse graphs altogether and incrementally builds RRTs rooted at start, goal and critical sources. In addition to RRTs generating the required set of critical samples, CSRRT depends on the inherent bias of RRTs to grow towards large unsearched areas of the problem to emulate the sparse graph once the tree is out of the bottleneck region.

We thus make the following contributions:

  1. We present GenerateCriticalSources, a fast methodology that uses learning models efficiently to generate a set of critical samples, ideally one per bottleneck region (called critical sources).

  2. We present the algorithm Local Critical Source - RRT (LCSRRT) which, in conjunction with a sparse graph, uses RRTs rooted at critical sources to generate the required set of critical samples, which act as a bridge between disjoint sub-graphs of the sparse graph.

  3. We present the algorithm Critical Source - RRT (CSRRT) that works by incrementally building RRTs rooted at multiple key sources (start, goal and critical sources).

  4. We show that LCSRRT and CSRRT outperform the sampling based baselines on a set of point object and robotic arm motion planning problems, and prove their probabilistic completeness.

II Related Work

An analysis of the shortcomings of uniform sampling in the presence of narrow passages is given by Hsu et al. [7]. This has stimulated numerous works on using selective densification for non-uniform sampling [9]-[13].

A multitude of these works use adaptive sampling for roadmap densification by exploiting the structure of the environment. While some propose to sample around the obstacles [14], [15], others use heuristic based strategies to trace or locate key samples [9]. Even though these techniques generalize to a large set of problems, they suffer from placing samples in regions where a path is unlikely to traverse. Also, owing to the huge number of collision checks performed, their computation time increases rapidly with increase in dimensionality of the state space.

Recent approaches use learning models for non-uniform sampling [1]-[6]. Some of them propose to find low dimensional structure in planning [1], [2]. In particular, generative models like the conditional variational autoencoder (CVAE) [19] have been used to great success. We use the CVAE used in LEGO [1] (called LEGOCVAE hereon) as our underlying learning model in GenerateCriticalSources and provide a gist of the same below. Note that, although our work uses the model used in LEGO, it can be extended to work with any other learning framework that predicts samples in bottleneck regions [2]-[5].

II-A LEGO: Leveraging Experience using Graph Oracles

Leveraging Experience using Graph Oracles is a framework for predicting efficient roadmaps for sampling based motion planning. During training time, LEGO processes a dense graph to identify a sparse subset of key vertices. These vertices are a diverse set of nodes lying on bottleneck regions through which a near optimal path may pass. A CVAE [16] conditioned on the occupancy grid of the workspace, start and goal positions is then trained on these key samples. During test time, given the occupancy grid, start and goal positions, the CVAE generates these key samples which are used in conjunction with a uniform sparse graph to generate a roadmap.

II-B CVAE: Conditional Variational Autoencoder

The core component of the LEGO framework is a conditional variational autoencoder (CVAE). It is an extension of traditional variational autoencoder and has been increasingly used to learn low dimensional structure in planning. The addition of conditional parameters helps embed the features of a planning problem as conditions and learn the corresponding representations.

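Let $\mathcal{X}$ be the state space, $z \in \mathcal{Z}$ be the latent random variable and $y \in \mathcal{Y}$ be the conditioning variable. The framework comprises two deterministic mappings: an encoder and a decoder. The encoder maps $(x, y)$ to the mean and variance of a Gaussian $q_\phi(z \mid x, y)$ in latent space, such that it is “close” to a standard Gaussian $\mathcal{N}(0, I)$. The decoder maps this Gaussian and $y$ to a distribution $p_\theta(x \mid z, y)$ in the output space $\mathcal{X}$. This is achieved by maximizing the following function:

$$ -D_{KL}\big(q_\phi(z \mid x, y) \,\|\, \mathcal{N}(0, I)\big) + \mathbb{E}_{z \sim q_\phi(z \mid x, y)}\big[\log p_\theta(x \mid z, y)\big] $$

At test time, we use only the decoder to map samples from an isotropic Gaussian in the latent space to samples in the output space. The CVAE is trained by passing in a dataset $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N}$. Here $y_i$ is the conditional parameter vector extracted from the planning problem; in our case it is (start, goal, occupancy grid). $x_i$ is the desired set of nodes extracted from the dense graph that we want our learner to predict. Hence we train the model by maximizing the following objective:

$$ \sum_{i=1}^{N} \Big( -D_{KL}\big(q_\phi(z \mid x_i, y_i) \,\|\, \mathcal{N}(0, I)\big) + \mathbb{E}_{z \sim q_\phi(z \mid x_i, y_i)}\big[\log p_\theta(x_i \mid z, y_i)\big] \Big) $$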
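As a minimal numerical sketch of the two terms in a CVAE objective of this kind, the snippet below evaluates the closed-form KL divergence between a diagonal Gaussian and a standard Gaussian, together with a squared-error reconstruction term. The squared error corresponds to a decoder with unit-variance Gaussian output, which is an assumption made here for illustration; the function names are hypothetical and are not taken from LEGO.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), in closed form."""
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

def cvae_objective(x, x_recon, mu, log_var):
    """Quantity to maximize: reconstruction log-likelihood minus the KL term.

    Assumption: unit-variance Gaussian decoder, so the log-likelihood is
    -0.5 * squared error up to an additive constant that is dropped here.
    """
    x = np.asarray(x, dtype=float)
    x_recon = np.asarray(x_recon, dtype=float)
    log_lik = -0.5 * np.sum((x - x_recon) ** 2)
    return log_lik - kl_to_standard_normal(mu, log_var)
```

The KL term vanishes exactly when the encoder outputs zero mean and unit variance, which is what pulls the latent distribution towards the standard Gaussian that the decoder is sampled from at test time.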


III Problem Formulation

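Let $\mathcal{X}$ denote a $d$-dimensional configuration space. Let $\mathcal{X}_{obs}$ be the portion in collision and $\mathcal{X}_{free} = \mathcal{X} \setminus \mathcal{X}_{obs}$ denote the free space. Let a path $\xi : [0, 1] \to \mathcal{X}$ be a continuous mapping from index to configurations. A path $\xi$ is said to be collision free if $\xi(\tau) \in \mathcal{X}_{free}$ for all $\tau \in [0, 1]$.

Problem: Given a motion planning problem $\Lambda = \{x_s, x_g, \mathcal{X}_{free}\}$, where $x_s$ is the start configuration and $x_g$ is the goal configuration, find a feasible path $\xi$, i.e. one that is collision free with $\xi(0) = x_s$ and $\xi(1) = x_g$.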
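To make the feasibility condition concrete, the following is a rough sketch of how a piecewise-linear path could be checked against a 2D boolean occupancy grid. Everything here is an illustrative assumption rather than the paper's implementation: the grid convention (`True` means obstacle, one cell per unit length), the fixed sampling `resolution`, and the function names `is_edge_valid` and `is_path_feasible`.

```python
import numpy as np

def is_edge_valid(grid, a, b, resolution=0.5):
    """Check the straight segment a -> b by sampling points along it.

    Assumption: grid[i, j] is True for an obstacle cell and a point (x, y)
    falls in cell (floor(x), floor(y)). Leaving the grid counts as a collision.
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n_steps = max(int(np.ceil(np.linalg.norm(b - a) / resolution)), 1)
    for t in np.linspace(0.0, 1.0, n_steps + 1):
        p = a + t * (b - a)
        i, j = int(np.floor(p[0])), int(np.floor(p[1]))
        if not (0 <= i < grid.shape[0] and 0 <= j < grid.shape[1]):
            return False
        if grid[i, j]:
            return False
    return True

def is_path_feasible(grid, path, start, goal):
    """A path is feasible if it starts at start, ends at goal and is collision free."""
    path = [np.asarray(p, dtype=float) for p in path]
    ends_ok = np.allclose(path[0], start) and np.allclose(path[-1], goal)
    return ends_ok and all(is_edge_valid(grid, p, q) for p, q in zip(path, path[1:]))
```

Sampling at a fixed resolution can miss obstacles thinner than the sampling step, so the resolution has to be chosen relative to the cell size.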

Given a database of prior worlds, the overall goal is to use a conjunction of a learned policy and local sampling based planners to generate a roadmap $G$ which is used by a graph search algorithm Alg to efficiently compute a feasible path. Alg, given a graph $G$, finds and returns a feasible path $\xi$. If no feasible path exists in the graph, Alg returns $\emptyset$. An ideal roadmap should be sparse enough for Alg to be efficient. In addition, for problems with narrow passages (bottleneck regions), this roadmap must have a set of critical samples in the bottleneck regions to ensure existence of a feasible path.

We assume the graphs to be undirected for simplicity; the approach can easily be extended to directed graphs. The following additional notation is used in this paper:

  • $G_s$: A sparse graph embedded in the state space of the robot. While constructing $G$, LCSRRT additionally composes it with the set of samples generated by the learner and the local planners to ensure a minimal coverage.

  • $C$: Set of critical sources.

IV Approach

We propose to identify the location of bottleneck regions using GenerateCriticalSources and then use LCSRRT or CSRRT to navigate through these bottlenecks using local sampling based planners instead of learning their local structures.

Input :  Planning problem Λ,
 Sparse graph G_s
Output :  Set of critical sources C
1 C ← ∅ ;
2 S ← LEGO-Global(Λ) ;
3 for sample ∈ S do
4       isNear ← false ;
5       for source ∈ C do
6             if dist(source, sample) < d_min then
7                   isNear ← true ;
10      if isNear = false then
11             free_count ← 0 ;
12             total_count ← 0 ;
13             for v ∈ V(G_s) do
14                   if dist(sample, v) < r_near then
15                         if isValid(edge(sample, v)) then
16                               free_count ← free_count + 1 ;
18                         total_count ← total_count + 1 ;
21            if free_count/total_count < p_max then
22                   C ← C ∪ {sample} ;
26 return C ;
Algorithm 1 GenerateCriticalSources

IV-A Critical Source Generation

GenerateCriticalSources identifies the locations of bottleneck regions by generating $C$, a set of critical samples, one corresponding to each bottleneck region. It uses LEGOCVAE as the underlying learner. If an occupancy grid of a certain size is required to learn the locations of the bottleneck regions along with their local structures, we argue that we can preprocess this occupancy grid with a kernel to create a smaller grid that encodes only the information necessary to learn the locations of the bottleneck regions. The choice of kernel depends on the features of the environment. For example, a dilation kernel can be used in environments which contain extended narrow passages. As we do not expect LEGOCVAE to learn the local structures, there is no loss of relevant information.

This preprocessed occupancy grid is then used for training and testing LEGOCVAE. We call the LEGOCVAE trained with this preprocessed occupancy grid LEGO-Global as it only learns global information of the planning problem.
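One simple way to realize such a preprocessing step is block-wise max-pooling, sketched below under stated assumptions. The paper does not specify the exact kernel, so the max operation is only one dilation-like choice: a coarse cell is marked occupied if any fine cell inside it is occupied. The default sizes (a 5×5 kernel with stride 5 applied to a 50×50 grid) follow the values reported in the experiments; the function name `downsample_occupancy` is hypothetical.

```python
import numpy as np

def downsample_occupancy(grid, k=5):
    """Max-pool a 2D occupancy grid with a k x k kernel and stride k.

    Assumption: non-zero entries mark obstacles, so taking the maximum over
    each block keeps a coarse cell occupied if any fine cell in it is occupied.
    """
    h, w = grid.shape
    if h % k or w % k:
        raise ValueError("grid dimensions must be divisible by k")
    # Split each axis into (number of blocks, block size) and reduce the blocks.
    blocks = grid.reshape(h // k, k, w // k, k)
    return blocks.max(axis=(1, 3))
```

With `k=5`, a 50×50 grid becomes a 10×10 grid, which reduces the conditioning input of the learner by a factor of 25 while preserving where the obstacles are at a coarse level.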

We thus use LEGO-Global to generate a set of candidate critical source samples. A sample from this set is accepted as a critical source if (a) it is at least a distance $d_{min}$ away from all the critical sources generated so far (Lines 4-8, Algorithm 1) and (b) when the sample is connected by edges to the vertices of $G_s$ that are within distance $r_{near}$, the fraction of edges not in collision with the obstacles is smaller than a certain threshold $p_{max}$ (Lines 9-17, Algorithm 1).
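The two acceptance conditions can be sketched in a few lines of Python. This is an illustrative reading of Algorithm 1, not the authors' code: the candidate list stands in for the output of LEGO-Global, `is_valid_edge` is a user-supplied collision checker, and the parameter names `min_separation`, `neighbor_radius` and `max_free_ratio` are hypothetical. Skipping a candidate that has no sparse-graph vertex nearby is an added assumption to avoid a division by zero.

```python
import math

def generate_critical_sources(candidates, sparse_vertices, is_valid_edge,
                              min_separation, neighbor_radius, max_free_ratio):
    """Filter learner samples down to roughly one source per bottleneck."""
    sources = []
    for sample in candidates:
        # (a) Reject candidates too close to an already accepted source.
        if any(math.dist(sample, s) < min_separation for s in sources):
            continue
        # (b) Look at edges to nearby sparse-graph vertices.
        nearby = [v for v in sparse_vertices
                  if math.dist(sample, v) < neighbor_radius]
        if not nearby:
            continue  # assumption: no neighbours means no evidence, so skip
        free = sum(1 for v in nearby if is_valid_edge(sample, v))
        # Accept only if few of these edges are collision free.
        if free / len(nearby) < max_free_ratio:
            sources.append(sample)
    return sources
```

A low fraction of collision-free edges suggests that the sample is poorly connected to the sparse graph, which is what one would expect of a point inside a narrow passage.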

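Input :  Planning problem Λ,
 Sparse graph G_s
Output :  Path ξ
1 C ← GenerateCriticalSources(Λ, G_s) ;
2 C ← C ∪ {start,goal} ;
3 G ← G_s ∪ C ;
4 T ← {T_c = {c} : c ∈ C} ;
5 L ← {L_c = {c} : c ∈ C} ;
6 r ← r_0 ;
7 while True do
8       L ← ExpandLocalGraphs(Λ, G, L, T, r) ;
9       (L, T) ← DensifyLocalGraphs(L, T, r) ;
10       for L_c ∈ L do
11             G ← G ∪ L_c ;
13       ξ = Alg(G, Λ) ;
14       if ξ ≠ ∅ then
15             return ξ ;
17       r ← r · γ ;
Algorithm 2 LCSRRT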
Input :  Planning problem Λ, Graph G,
 Local Graphs Set L, Trees Set T, Radius r
Output :  Local Graphs Set L
1 for L_c ∈ L do
2       sp ← ∅ ;  // Nodes of G within r distance of c
3       for node ∈ V(G) do
4             if dist(c, node) < r then
5                   sp ← sp ∪ {node} ;
8       L_c ← Subgraph(G, sp) ;
9       U_c ← V(L_c) − ConnComp(L_c, c) ;
10       for node ∈ U_c do
11             for v ∈ T_c do
12                   if isValid(edge(node, v)) then
13                         U_c ← U_c − ConnComp(L_c, node) ;
14                         L_c ← L_c ∪ {edge(node, v)} ;
19 return L ;
Algorithm 3 ExpandLocalGraphs
Input :  Local Graphs Set L, Trees Set T,
 Radius r
Output :  Local Graphs Set L, Trees Set T
1 for L_c ∈ L do
2       if L_c is not connected then
3             U_c ← V(L_c) − ConnComp(L_c, c) ;
4             repeat until U_c = ∅ or M iterations
5                   repeat
6                         rn ← RandomNode(c, r) ;
7                         nn ← NearestVertex(T_c, rn) ;
8                         rn’ ← Interpolate(nn, rn, step_size) ;
10                  until Edge(nn, rn’) is not in collision ;
11                   T_c ← T_c ∪ {rn’, edge(rn’, nn)} ;
12                   L_c ← L_c ∪ {rn’, edge(rn’, nn)} ;
13                   for n ∈ U_c do
14                         if isValid(edge(rn’, n)) then
15                               U_c ← U_c − ConnComp(L_c, n) ;
16                               L_c ← L_c ∪ {edge(rn’, n)} ;
22 return (L, T) ;
Algorithm 4 DensifyLocalGraphs

IV-B Local Critical Source - RRT

Local Critical Source - RRT (LCS-RRT) uses GenerateCriticalSources to get $C$ and adds the start and goal vertices to it. It builds $G$, which contains $G_s$ and $C$. For each $c \in C$, the algorithm maintains two graphs: a tree $T_c$ and a local graph $L_c$. $T_c$ is an RRT rooted at $c$. $L_c$ consists of the edges and vertices of $G$ within distance $r$ from $c$. The goal of the algorithm is to make $L_c$ completely connected by adding edges between the vertices of $T_c$ and the vertices of $L_c$ not belonging to the connected component of $c$.

Initially, for each $c \in C$, both $T_c$ and $L_c$ contain only $c$. In an iteration of the outer while loop of LCS-RRT (Line 7, Algorithm 2), ExpandLocalGraphs expands each $L_c$ to radius $r$. After expansion, each $L_c$ consists of the subgraph of $G$ within a radius $r$ of $c$ (Line 6, Algorithm 3). ExpandLocalGraphs and DensifyLocalGraphs both maintain $U_c$, the set of vertices of $L_c$ that do not belong to the connected component of $c$. Wherever possible, the nodes of $T_c$ are connected to $U_c$ by collision free edges and $U_c$ is updated (Lines 8-12, Algorithm 3). DensifyLocalGraphs densifies the local graph $L_c$ by growing its tree $T_c$ and adding edges between $T_c$ and the vertices of the set $U_c$ until either $L_c$ is completely connected or M (a hyperparameter) iterations have taken place, whichever is earlier. To grow $T_c$, DensifyLocalGraphs samples a random node within a radius $r$ of $c$, interpolates it to within a distance of step_size of the vertex nearest to this node in $T_c$ and, if possible, joins them with an edge in a similar fashion to RRT growth (hence the name) (Lines 5-10, Algorithm 4). It then tries to connect this new node of $T_c$ with the vertices of $U_c$, and wherever a connection is possible, $U_c$ is updated (Lines 12-15, Algorithm 4). After densification, $L_c$ is added to $G$ and Alg is run on $G$ (Lines 10-12, Algorithm 2). If successful, the feasible path $\xi$ is returned. Otherwise, the radius $r$ is increased by a factor and the process is repeated.
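The bookkeeping of the vertices that are not yet connected to a critical source can be illustrated with a plain breadth-first search. This is a minimal sketch under assumptions: the local graph is stored as a dictionary mapping each vertex to the set of its neighbours, and the helper names `connected_component` and `unconnected_vertices` are hypothetical stand-ins for the ConnComp operation used in the pseudocode.

```python
from collections import deque

def connected_component(adjacency, root):
    """Return the set of vertices reachable from root (breadth-first search)."""
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def unconnected_vertices(adjacency, root):
    """Vertices of the local graph outside the connected component of root."""
    return set(adjacency) - connected_component(adjacency, root)
```

The local graph is completely connected exactly when `unconnected_vertices` returns an empty set, which is the stopping condition of the densification loop besides the iteration cap.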

Probabilistic Completeness: Consider the local graph $L_{start}$ at the start node. As time approaches infinity, so does the radius $r$. Thus, the goal node is eventually present in $L_{start}$. As we try to connect to all nodes not present in the connected component of the start node using $T_{start}$, we are at the least running an RRT from the start node to the goal node. Thus, due to the probabilistic completeness of RRT, LCSRRT is probabilistically complete.

IV-C Critical Source - RRT

Critical Source - RRT (CS-RRT) uses GenerateCriticalSources to get $C$ and adds the critical sources to $G$. It then adds the start and goal vertices to $G$. These nodes subsequently act as the roots of the RRTs we will grow. In each iteration of the outer while loop (Line 3, Algorithm 5), the algorithm iterates through the connected components present in $G$, i.e. the RRTs, in a round robin manner. In lines 5-10, the algorithm samples a random node, interpolates it to within a distance $\eta$ of the vertex nearest to this node in its RRT and loops until it is possible to join them with an edge. This is similar to how an RRT grows (hence the name). In lines 11-13 it then tries to connect the new node of its tree to the nearest vertices of the other trees that lie within a distance $\delta$. If a connection is possible, an edge is inserted into the graph (line 14). Insertion of this edge merges the two trees the nodes belonged to. The algorithm returns a feasible path when the start and goal nodes belong to the same connected component of $G$ (Lines 15-17).

Probabilistic Completeness: As CSRRT grows RRTs rooted at the start and goal nodes along with the critical sources to connect the start and goal nodes, it runs RRT-Connect at the least. Thus, due to the probabilistic completeness of RRT-Connect, CSRRT is probabilistically complete.
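The multi-tree growth described above can be sketched for a 2D point robot as follows. This is a simplified illustration and not the authors' implementation: the square sampling `bounds`, the fixed-resolution `segment_free` check, the default parameter values and all function names are assumptions, and a rejected sample simply hands the turn to the next tree instead of being resampled for the same tree.

```python
import math
import random
from collections import deque

def steer(near, target, step):
    """Move from near towards target by at most step."""
    d = math.dist(near, target)
    if d <= step:
        return target
    t = step / d
    return (near[0] + t * (target[0] - near[0]), near[1] + t * (target[1] - near[1]))

def segment_free(a, b, is_free, resolution=0.05):
    """Check a segment by sampling points along it (assumed collision checker)."""
    n = max(int(math.dist(a, b) / resolution), 1)
    return all(is_free((a[0] + (b[0] - a[0]) * i / n, a[1] + (b[1] - a[1]) * i / n))
               for i in range(n + 1))

def bfs_path(adj, start, goal):
    """Graph search standing in for Alg: shortest path in number of edges."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None

def cs_rrt(start, goal, sources, is_free, bounds=(0.0, 10.0), step=0.5,
           connect_radius=1.0, max_iters=20000, seed=0):
    rng = random.Random(seed)
    roots = [start, goal, *sources]
    trees = [{r} for r in roots]       # one vertex set per connected component
    adj = {r: set() for r in roots}    # undirected adjacency over all trees
    for it in range(max_iters):
        tree = trees[it % len(trees)]  # round robin over the current trees
        rn = (rng.uniform(*bounds), rng.uniform(*bounds))
        nn = min(tree, key=lambda v: math.dist(v, rn))
        new = steer(nn, rn, step)
        if new in adj or not segment_free(nn, new, is_free):
            continue                   # simplification: move on to the next tree
        adj[new] = {nn}
        adj[nn].add(new)
        tree.add(new)
        # Try to bridge to the nearest vertex of every other tree.
        for other in [t for t in trees if t is not tree]:
            onn = min(other, key=lambda v: math.dist(v, new))
            if math.dist(onn, new) < connect_radius and segment_free(onn, new, is_free):
                adj[new].add(onn)
                adj[onn].add(new)
                tree |= other          # merge the two connected components
                trees = [t for t in trees if t is not other]
        if start in tree and goal in tree:
            return bfs_path(adj, start, goal)
    return None
```

Passing an empty list of sources reduces this sketch to a bidirectional RRT with a connection radius, which mirrors the probabilistic completeness argument above.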

Input :  Planning problem Λ,
 Sparse graph G_s
Output :  Path ξ
1 G ← GenerateCriticalSources(Λ, G_s) ;
2 G ← G ∪ {start,goal} ;
3 while True do
4       for connected component T_i ∈ G do
5             repeat
6                   rn ← RandomNode() ;
7                   nn ← NearestVertex(T_i, rn) ;
8                   rn’ ← Interpolate(nn, rn, η) ;
10            until isValid(edge(nn, rn’)) ;
11             G ← G ∪ {rn’, edge(rn’, nn)} ;
12             for connected component T_j ∈ G, T_j ≠ T_i do
13                   onn ← NearestVertex(T_j, rn’) ;
14                   if distance(onn, rn’) < δ and isValid(edge(onn, rn’)) then
15                         G ← G ∪ {edge(rn’, onn)} ;
16                         if Start and Goal belong to same connected component of G then
17                               ξ = Alg(G, Λ) ;
18                               return ξ ;
Algorithm 5 CSRRT

V Experiments

In this section, we compare the performance of LCS-RRT and CS-RRT to sampling based algorithms in multiple domains. We evaluate our algorithms against RRT-Connect[18], a variation of RRT that incrementally builds two trees rooted at the start and goal nodes. Additionally, we compare our algorithms with LEGO, a state-of-the-art learning based sampling algorithm.

Fig. 2: (a) LEGOCVAE fails to cover the local structure of the bottleneck regions even with a large number of samples. (b) LEGO-Global is able and (c) LEGOCVAE is unable to place at least one sample per bottleneck region with a small number of samples. Trained with a preprocessed grid, LEGO-Global is robust to fine changes in local structures.

As LEGO is a graph based approach while others are tree based approaches, we adapted LEGO to an anytime algorithm with incremental densification for testing purposes. We have tuned the parameters of RRT-Connect and LEGO to ensure their best performance.

Evaluation Procedure

For a given planning problem, we run each algorithm with a fixed timeout. For a given problem domain, each of the learning models used by these algorithms is trained on a similar number of planning problems. We evaluate the performance of these algorithms on the time taken to find a feasible path.

Problem Domains

We evaluate our algorithms on $\mathbb{R}^2$ and $\mathbb{R}^7$ problem domains. The $\mathbb{R}^2$ problems have random rectilinear walls with extruded narrow passages that have varying local structures (Fig 3). The $\mathbb{R}^7$ problem is a robotic-arm manipulation problem in a cluttered environment.

Experiment Details

We compare the algorithms LCSRRT, CSRRT, LEGO and RRT-Connect on a test set of 100 planning problems. For the $\mathbb{R}^2$ problems, each learning model is trained on 4000 planning problems for around 30 minutes. The planning timeout for the algorithms is 5 seconds. The size of the occupancy grid used by LEGOCVAE is 50×50. The kernel used by LEGO-Global for preprocessing is of size 5×5 (with stride 5), resulting in an updated occupancy grid of size 10×10 (Fig 5). For the $\mathbb{R}^7$ problems, each learning model is trained on 4000 training problems for 2 hours and the planning timeout for the algorithms is 12 seconds. The code is open sourced and can be found at

Observation 1. LCSRRT and CSRRT outperform the sampling based baselines LEGO and RRT-Connect.
Fig 4 shows the performance of CSRRT, LCSRRT, LEGO and RRT-Connect on the $\mathbb{R}^2$ and $\mathbb{R}^7$ problem domains. LCSRRT performs better than CSRRT in $\mathbb{R}^2$. However, in $\mathbb{R}^7$, where the size of $G_s$ becomes large to ensure coverage of the high dimensional space and collision checking is computationally expensive, CSRRT performs better.
Observation 2. LEGOCVAE is unable to cover the local structure of the bottleneck regions even with a large number of samples. Also, with a small number of samples, LEGO-Global is able to place a sample at each bottleneck region while LEGOCVAE is not (Fig 2).

Fig. 3: (a) Samples predicted by LEGO-Global (green) on the environment. (b) Critical Sources (Pink) selected by GenerateCriticalSources. (c) Path (Red) found by CSRRT (d) Path (Red) found by LCSRRT
Fig. 4: Comparison of CS-RRT, LCS-RRT, RRT-Connect and LEGO for (a) $\mathbb{R}^2$ and (b) $\mathbb{R}^7$ problem domains
Fig. 5: Convolving with a kernel of size 5X5 on (a) an occupancy grid of size 50 by 50, we are able to extract (b) a 10 by 10 occupancy grid with global features. This approach is similar to the method of dilation used in image processing.

VI Conclusion

We show the feasibility of using local sampling algorithms aided by a learning model to rapidly find a feasible path in complex environments containing extended bottleneck regions. These algorithms share the responsibility of generating key samples between the learner and the local sampler. This lets the learning model converge faster to generalizing well over planning problems that have similar global structure for the location of bottleneck regions but different local structures within these regions. As we require the learner to only identify the location of bottleneck regions, we introduce the idea of using a kernel to preprocess the occupancy grids for better learning. In future work, we would like to analyse the relationship between a workspace and the kernel suitable to it. We intend to explore the integration of other sampling based methods to make the approach asymptotically optimal. We would also like to test our algorithms on environments where extended bottleneck regions arise due to differential constraints.