## I Introduction

Motion planning is an integral part of many areas of robotics. Robots operating autonomously need to generate many different motion plans in complex environments. This is true especially in the context of task and motion planning [Dantam2018]. Even a single task, such as stacking blocks, might require querying a motion planner thousands of times. Humans can execute motions instantaneously that robots currently struggle with. To achieve human-level behavior, fast online motion planning is essential.

The widespread success of sampling-based planners lies in their ability to approximate the connectivity of high-dimensional spaces with a small number of samples [Choset2005]. However, in many cases regions necessary for connectivity are unlikely to be sampled by an uninformed sampler. This is known as the narrow passages problem [Hsu2003a, Hsu1998, Amato1998] and greatly limits the performance of sampling-based planners in many scenarios.

Among the possible approaches to solving problems that involve narrow passages is the emerging field of experience-based planning [Coleman2015, Ichter2018, Phillips2012]. During their operation, robots commonly come across similar workspaces, resulting in similar motion plans. Such a case can be seen in Fig. 1, where a robot needs to grasp the red can and put it on the top shelf. In this case, prior knowledge about similar scenarios could expedite the motion planning process. By biasing sampling towards interesting regions [Ichter2018] or by retrieving and reusing old solutions [Coleman2015], experience-based methods try to transfer knowledge obtained from one problem to similar ones. Unfortunately, in many cases small changes in the workspace can drastically change the set of possible solutions, making generalization difficult [Kim2017c, Lehner2018].

This paper presents a sampling strategy for sampling-based planners that aids in discovering the connectivity of the configuration space even for pathological cases. This sampling strategy utilizes a decomposition of the workspace into local primitives. The main insight of our method is that learning to generate important samples in the configuration spaces defined by the robot and the primitives helps approximate the configuration space of the global workspace. This approach can generalize to new environments that contain the workspace primitives used earlier or primitives very similar to them. In this work, we focus on problems with simple geometric features, yet manage to solve a class of problems that were practically unsolvable by modern sampling-based motion planners. Also, we raise the question of whether the notion of decomposition applies in unstructured environments with complex geometries, tackling problems that were previously beyond the reach of sampling-based motion planners.

The contributions of this paper are threefold. First, we propose the decomposition of the workspace into local primitives, and we solve motion planning problems in workspaces that contain only the local primitives. The result of this step is the estimation of local samplers that produce samples in the difficult regions of the configuration spaces of the local primitives. The parameters of these local samplers are stored in a database. Second, we show how to synthesize a global sampling strategy based on these local samplers. Third, we show the effectiveness of our approach in two challenging environments where it achieves significant improvement over existing methods.

## II Background

Sampling-based planners have been widely adopted in robotics due to their ability to scale to high-dimensional problems. The two main categories are graph-based approaches (e.g., prm [Kavraki1996]) for multi-query problems and tree-based approaches (e.g., rrt [Kuffner2000], est [Hsu]) for single-query problems. Although motion planning is in general pspace-complete [Canny1988, Canny1988b], sampling-based algorithms perform remarkably well in practice. However, in challenging environments (e.g., instances with narrow passages), the sampling strategy plays an increasingly important role in planning performance. It is theoretically understood that using non-uniform samplers to generate more samples in low-expansiveness areas [Hsu] can alleviate this problem.

Many approaches use rejection sampling, where the sample is accepted only if it passes a specific geometric test such as Bridge-Sampling [Hsu2003a] or Gaussian-Sampling [Boor1999a]. Other approaches try to guess difficult areas, such as Medial-Axis sampling [Brock2004] or Obstacle-Based Sampling [Amato1998]. Although these approaches ultimately generate samples near or inside narrow passages, they still consider the entire configuration space which is computationally expensive. Nevertheless, the resulting graph is typically much smaller than the one produced by uniform sampling.

Recent approaches similar to our method try to leverage acquired information about the problem to bias the sampling. Reinforcement learning was used by [Zucker2008] to infer important areas of the workspace, which were then transformed into configuration samples. The authors of [Ichter2018] utilize a generative model, called a Conditional Variational Auto-Encoder (cvae), that learns to produce samples that lie in “interesting” areas given a workspace description. Sampling-biasing methods have the advantage that they can be used with many sampling-based planners without any modification. However, both of these methods rely on a model that uses workspace features to infer important samples in the configuration space. In [Zucker2008], a discrete workspace cell was mapped to a configuration through the inverse kinematics of the end-effector. In [Ichter2018], a neural network was used to infer these samples. Our experiments showed that in complex configuration spaces these models do not consistently produce samples in the important areas of the configuration space.

Online-adaptive sampling methods use collision-checking information to infer at runtime which areas of the configuration space are important. Utility-guided sampling [Burns2005] chooses samples with the maximum information gain based on the entropy of the roadmap. The authors of [Arslan2015] formulate the sampling problem as a classification between free and in-collision samples. Toggle-prm [Denny2013] creates two roadmaps, one in the free space and one in the obstacle space, and tries to infer samples in the narrow passages. These methods adapt the sampling online, based on the state of the motion planning algorithm, but do not transfer this knowledge between different planning queries. In contrast, the proposed method biases the sampler based on previous planning queries. Thus, online-adaptive sampling can be used in parallel with the proposed sampling biasing.

Orthogonal to these methods are database approaches. Instead of modifying sampling, these methods leverage previous experiences by storing discrete paths or graphs in a database. This information is later retrieved and repaired/transformed to satisfy the new kinematic constraints. These methods can be thought of as hard-coded experiences compared to biasing the sampling.

The authors of [Berenson2012a] used a library of paths that was queried based on the proximity between the start-goal configurations of a stored entry and those of the query. If a valid path could not be retrieved with these heuristics, birrt [Kuffner2000] was used to repair the invalid parts. This idea was expanded in [Coleman2015], where a graph was used to store the paths, removing redundancy. Although they improve execution time by orders of magnitude, the mentioned approaches do not adapt well to changes and yield improved results mainly in invariant environments. This happens because they do not explicitly include the workspace in their experience representation.

Database approaches that explicitly use workspace information include [Lien2009], which creates a small database of obstacle roadmaps by decomposing the configuration space around individual obstacles. The trajectory prediction method proposed by [Jetchev2013] saves the generated paths in task space and, during execution, transforms them back to the configuration space by optimizing the cost of the trajectory while using inverse kinematics. Although these methods can deal with different environments, [Lien2009] works only for free-flying robots, and [Jetchev2013] works with trajectory-optimization planners that lack the probabilistic guarantees of sampling-based planners.

In this work, we combine the best of both worlds by integrating a biased sampler with a database. This is achieved by decomposing the workspace into local primitives and storing in a database the parameters of an efficient local sampler for each local primitive. The biased sampler uses prior knowledge in a “soft” way, avoiding the hard commitment to complete paths induced by database methods, which may need costly repairing when there are significant changes in the workspace. On the other hand, the database scheme enables the instant mapping of local samplers to local primitives, avoiding the need for a complex parametric model. Additionally, a database has the inherent capability of incrementally improving its experiences by simply adding new entries, whereas the aforementioned sampling-biasing approaches would need to be retrained.

## III Method Overview

In this work, we modify the sampling step that produces configuration samples, which is at the core of all sampling-based planners, as shown in Algorithm 1. Similar to [Ichter2018], we draw samples both from the global (biased) sampler and from the uniform sampler (Algorithm 1), mixing them according to a ratio that is a hyperparameter determined by the application. The data structure G and the update() function (Algorithm 1) differentiate the sampling-based planners. For example, in a prm-like setting, G is a roadmap, and new samples are added to expand the roadmap, which captures the connectivity of the C-Space in a variety of ways [Kavraki1996, Denny2013]. In an rrt-like setting, G is a tree, and the sample is used to expand the tree in different ways [Kuffner2000, Denny2013]. However, regardless of the planner details, performance can be improved by using informed sampling.

The proposed sampling strategy is based on local primitives of the workspace. The key insight of our approach is that different regions of the workspace often create relatively uncorrelated narrow passages in the configuration space. For example, the top and bottom shelf of a bookcase create two different challenging regions that a robot arm must traverse when moving from one shelf to the other. This means that we can bind an effective local sampling distribution to each shelf (local primitive), dealing with the two problems independently. By combining the local samplers, we can synthesize a global sampler that informatively guides the planners. Our experiments showed that even in highly correlated cases, our approach remains effective.
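As a rough illustration of this hybrid sampling step, the following Python sketch draws from a biased sampler with a fixed probability and falls back to uniform sampling otherwise; the names `sample_hybrid`, `global_sampler`, `uniform_sampler`, and `lam` are our placeholders, not the paper's implementation:

```python
import random

def sample_hybrid(global_sampler, uniform_sampler, lam=0.5, rng=random):
    """Draw one configuration: biased with probability lam, uniform otherwise.

    lam is the uniform-to-biased mixing ratio, a hyperparameter chosen per
    application, as described in the text above.
    """
    if rng.random() < lam:
        return global_sampler()   # informed sample from the biased sampler
    return uniform_sampler()      # plain uniform sample over the C-Space
```

Because the choice is made independently per sample, any sampling-based planner that calls such a function in its sampling step can use the biased distribution without other modification.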

We will introduce some notation before describing the main algorithm. Since our method relies on a workspace description, we denote the workspace as $W = \{O_1, \dots, O_n\}$, where each $O_i$ is a workspace obstacle. We also denote the set of local primitives as $P = \{p_1, \dots, p_m\}$, such that each $p_i \subseteq W$. Note that the local primitives do not have to be mutually exclusive. For example, pairs of workspace obstacles could be a valid set of local primitives. The global target distribution is denoted as $\pi(q)$, and the local samplers that are used to approximate it are denoted as $\hat{\pi}_i(q)$.

The main steps of the Global-Local Sampler (GL-Sampler) are outlined in Algorithm 2. The first step (Algorithm 2) is to decompose the workspace by identifying its local primitives $p_i$. In general, this could be a highly sophisticated function; however, in the context of this work, the local primitives are always pairs of workspace obstacles $p_i = \{O_j, O_k\}$. For each of the local primitives, we try to retrieve its corresponding sampler (Algorithm 2) based on a similarity function (Algorithm 2). If no similar local primitive exists in the database, the parameters of the local sampler are calculated (Algorithm 2) and stored in the database (Algorithm 2). Note that the database can also be computed offline if the possible local primitives are known beforehand, which is often the case. The local samplers are combined to synthesize the global sampler (Algorithm 2). The GL-Sampler is created only once, at the beginning of the motion planning query.
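The retrieve-or-compute loop of the GL-Sampler might be sketched as follows; `decompose`, `similarity`, and `compute_sampler` are placeholder callables standing in for the paper's components, and the database is modeled as a simple list of (primitive, sampler) pairs:

```python
def build_gl_sampler(workspace, database, decompose, similarity,
                     compute_sampler, threshold=-3.0):
    """Sketch of Algorithm 2: decompose, retrieve-or-compute, collect samplers."""
    primitives = decompose(workspace)          # e.g., pairs of nearby obstacles
    local_samplers = []
    for p in primitives:
        # find the most similar stored primitive, if any passes the threshold
        best = max(database, key=lambda entry: similarity(p, entry[0]),
                   default=None)
        if best is not None and similarity(p, best[0]) >= threshold:
            local_samplers.append(best[1])     # reuse the stored local sampler
        else:
            sampler = compute_sampler(p)       # estimate a new local sampler
            database.append((p, sampler))      # store it for future queries
            local_samplers.append(sampler)
    return local_samplers
```

On a second query containing a similar primitive, the stored sampler is retrieved instead of being recomputed, which is what makes the database useful across planning queries.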

Fig. 2 shows an illustrative example of the algorithm, applied to a fixed-base, 8-link planar manipulator (non-self-intersecting) in an environment of circular obstacles. The local primitives in this case are pairs of circles, described as $p = (x_1, y_1, r_1, x_2, y_2, r_2)$, where $(x_j, y_j)$ and $r_j$ denote the position and radius of each circle. In the following sections, the creation of the database, the retrieval of local samplers, and the composition of the global sampler are described.

### III-A Creating the Database of Local Samplers

Each local sampler must produce samples that quickly capture the connectivity of the local primitive’s configuration space. In particular, each local sampler should produce samples in the narrow passages of the corresponding C-Space. First, we generate such samples and later fit the local sampler to them. The samples are created by solving motion planning queries that likely traverse difficult regions of the configuration space of the local primitives. For every local primitive, we pre-specify a set of such motion planning queries. For the local primitives (pairs of circles) in Fig. 2, a path that starts with the robot between the circles and ends with the robot entirely outside of them likely traverses a narrow passage. A standard sampling-based planner, e.g., rrt or birrt [Kuffner2000], is used to solve these queries quickly. Additionally, it is imperative to generate multiple paths for the same query in order to be robust to obstacles that are part of the global workspace but not of the local primitive. This is clear in Fig. 2, where several samples that were valid for the local primitives are invalid for the global problem. To deal with this, we run the chosen planner multiple times, creating different paths due to its randomness. Also, by using shortcutting techniques [Geraerts2007], most of the redundant samples can be removed to increase the ratio of useful samples.
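Collecting these useful configurations by repeatedly solving the pre-specified queries and shortcutting the results could look like the sketch below, where `solve_query` stands in for one randomized run of a planner such as birrt and `shortcut` for a path-shortcutting routine; both names are our assumptions:

```python
def collect_local_samples(solve_query, shortcut, queries, runs_per_query=5):
    """Gather configurations from repeated plans through a primitive's passages.

    Running the randomized planner several times per query yields diverse
    paths, making the collected samples robust to obstacles that belong to
    the global workspace but not to the local primitive.
    """
    samples = []
    for q in queries:
        for _ in range(runs_per_query):
            path = solve_query(q)               # one randomized planner run
            if path:
                samples.extend(shortcut(path))  # drop redundant waypoints
    return samples
```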

Due to the complexity of the needed distribution and its natural multi-modality, we choose a Gaussian Mixture Model (gmm) as the local sampler, similar to [Lehner2018]. However, contrary to [Lehner2018], we do not use the traditional expectation-maximization algorithm to calculate the parameters of the gmm. There is no good way to choose the number of mixtures, and more importantly, the distance between configurations does not necessarily relate to C-Space connectivity, which is what sampling-based algorithms need to capture. Instead, for the local sampler we place one mixture at each produced configuration $\mu_k$ and use a fixed covariance $\sigma^2 I$, where $\sigma$ is a hyperparameter and $I$ is the identity matrix. This might create an unnecessarily large number of mixtures, but we present a way to reduce them while respecting the connectivity in section IV-A. The local sampler is:

$$\hat{\pi}_i(q) = \sum_{k=1}^{K} w_k \, \mathcal{N}(q;\, \mu_k, \sigma^2 I) \tag{1}$$

where $K$ is the number of mixtures. We choose a uniform weight vector, $w_k = 1/K$, making all the mixtures equiprobable. In the database, we save the parameters of this distribution and its corresponding local primitive.
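Sampling from such an equal-weight gmm amounts to picking one stored mean uniformly and adding isotropic Gaussian noise; a minimal sketch (the function name is ours, not the paper's):

```python
import random

def sample_local_gmm(means, sigma, rng=random):
    """Draw one configuration from an equal-weight GMM with covariance sigma^2*I.

    means: list of stored configurations (the mixture means mu_k).
    Each mean is equally likely, matching the uniform weights in the text.
    """
    mu = rng.choice(means)                          # pick a mixture uniformly
    return [m + rng.gauss(0.0, sigma) for m in mu]  # add isotropic noise
```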

### III-B Retrieving Local Samplers

To retrieve the local samplers from the database, we need a similarity function to compare local primitives. General workspace descriptors and possible similarity functions have been described in [Jetchev2013]. In our case, where the primitives are simple geometric descriptions, the negative squared Euclidean distance between primitive descriptors is used:

$$s(p_i, p_j) = -\lVert p_i - p_j \rVert^2$$

We retrieve the parameters of all local samplers whose similarity to the query primitive is above a certain threshold. Since the retrieved local samplers will not correspond to the exact local primitives, a similarity error is introduced.
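A small sketch of this similarity metric and the threshold-based retrieval, assuming primitives are stored as flat numeric descriptors (the names and the list-of-pairs database layout are our assumptions):

```python
def similarity(desc_a, desc_b):
    """Negative squared Euclidean distance between primitive descriptors."""
    return -sum((a - b) ** 2 for a, b in zip(desc_a, desc_b))

def retrieve(database, query_desc, threshold):
    """Return every stored sampler whose primitive is similar enough."""
    return [sampler for desc, sampler in database
            if similarity(desc, query_desc) >= threshold]
```

The similarity is always non-positive and equals zero only for identical descriptors, so the threshold directly bounds the similarity error of a retrieved sampler.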

### III-C Synthesizing the Global Sampler

Now we describe how the local samplers approximate the global sampler. Given an arbitrary partition of the configuration space $C$ into disjoint subsets $C_1, \dots, C_m$ with $\bigcup_{i=1}^{m} C_i = C$, the global sampler can be expressed as a sum of other distributions using the law of total probability:

$$\pi(q) = \sum_{i=1}^{m} P(q \in C_i)\, \pi(q \mid q \in C_i) = \sum_{i=1}^{m} \lambda_i \, \pi_i(q) \tag{2}$$

In the last equation we rewrote $P(q \in C_i)$ as $\lambda_i$ and $\pi(q \mid q \in C_i)$ as $\pi_i(q)$. Note that the support of $\pi_i$ is $C_i$. We approximate it in the following way:

$$\pi_i(q) \approx \hat{\pi}_i(q) \tag{3}$$

where $\hat{\pi}_i$ is the local sampler estimated for local primitive $p_i$. Three approximations are used in the derivation above. The first one is that $\hat{\pi}_i$ has its support on $C$ instead of $C_i$. This induces only a small error because $\hat{\pi}_i$ is a distribution that has values close to zero outside $C_i$. The second one is that most of the information in $\pi$ is incorporated in the set of local primitives, which is true if the local primitives are responsible for most of the difficult regions of the configuration space. The final approximation is that each local primitive independently affects only one local distribution. This is not true in general, especially in cases where the local primitives are close together. This is the reason why multiple paths are created for each local primitive in section III-A. We refer to this as the decomposition error. In the experiments section, we show empirically that even when this error is large, our sampling method is much more effective than uniform random sampling. Finally, combining Eq. 1, Eq. 2, and Eq. 3, the global sampler is:

$$\pi(q) \approx \sum_{i=1}^{m} \lambda_i \sum_{k=1}^{K_i} \frac{1}{K_i} \, \mathcal{N}(q;\, \mu_k^{i}, \sigma^2 I)$$

We set $\lambda_i = K_i / \sum_{j=1}^{m} K_j$, which means that the weight of each local sampler is proportional to the number of its mixtures.
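Since every local sampler weights its mixtures uniformly and each local weight is proportional to that sampler's mixture count, drawing from the synthesized global distribution reduces to picking one mean uniformly at random from the pooled means of all local samplers. A sketch under those assumptions (names are ours):

```python
import random

def sample_global(local_means, sigma, rng=random):
    """Draw one configuration from the synthesized global sampler.

    local_means: one list of mixture means per local sampler. With uniform
    in-sampler weights and global weights proportional to mixture counts,
    every pooled mean is equally likely.
    """
    pooled = [mu for means in local_means for mu in means]  # flatten all means
    mu = rng.choice(pooled)
    return [m + rng.gauss(0.0, sigma) for m in mu]
```

This pooling view also explains why a local sampler with more mixtures contributes proportionally more samples to the global distribution.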

## IV Reducing the Database

Since querying the database happens online, it is crucial to keep its size at a minimum for fast retrieval. We propose two such reductions. One is removing mixtures from the database by merging cliques, and the other is transforming the local experiences to account for multiple local primitives.

### IV-A Merging Cliques

This step can be executed after the generation of the useful configurations in section III-A. Using the extracted configurations for a specific local primitive, we create a graph by connecting two configurations with an edge if there is a collision-free straight line between them. From this graph, we identify fully connected subgraphs, also known as cliques. The cliques essentially represent groups of configurations that can be accurately approximated with a single mixture. We are not interested in finding the optimal number of cliques, which is an np-hard problem, so we use a greedy algorithm that finds them sequentially. After finding the cliques, we calculate the mean and the covariance (if enough samples exist) of each one and use them as the parameters of the gmm as in Eq. 1. Our experiments showed that the time performance was similar when using the merged or the unmerged mixtures. An example before merging is shown in (a), where the gmm would have 53 mixtures; after merging, shown in (b), the gmm has only 23 mixtures, which saves considerable space in the database.
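The greedy clique cover described above can be sketched as follows, with `connected` standing in for the collision-free straight-line check between two configurations:

```python
def greedy_cliques(nodes, connected):
    """Greedily cover a visibility graph with cliques.

    connected(u, v) should be True when the straight line between the two
    configurations is collision-free. Finding the minimum clique cover is
    NP-hard, so each clique is grown greedily from the first free node.
    """
    remaining = list(nodes)
    cliques = []
    while remaining:
        clique = [remaining.pop(0)]
        for v in remaining[:]:               # iterate a copy while removing
            if all(connected(v, u) for u in clique):
                clique.append(v)
                remaining.remove(v)
        cliques.append(clique)
    return cliques
```

Each resulting clique would then be summarized by the mean (and, when enough members exist, the covariance) of its configurations, replacing several mixtures with one.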

### IV-B Transforming the Local Experiences

To reduce the size of the database, we employ a transformation scheme similar to [Lien2009] that takes advantage of the inherent symmetries of the robot. This allows a local sampler to be transformed to match a transformed local primitive. Essentially, the inherent symmetries of a robot are changes in its configuration that counter transformations of the local primitives, such as rotation or translation. Although such symmetries do not exist in the general case, the majority of robots, such as mobile manipulators, drones, and robotic arms, have them. In the kinematic chain scenario of Fig. 2, the inherent symmetry is rotational invariance around the fixed base. This is illustrated in (c), where the grey local primitive is the rotated version of the black local primitive around the fixed base. By applying this rotation offset to the first joint in the means of the gmm, the local distribution applies to the new local primitive.

This simple transformation significantly reduces storage requirements. In the studied example, the dimensionality of $p$ is 6, and the database must store representative local primitive/local sampler pairs from this 6D space. However, by using the mentioned transformation, this space effectively becomes 5D, since we have rotational invariance of the local primitives around the fixed base.
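As an illustration of this rotational symmetry, the following sketch canonicalizes a two-circle primitive by rotating it about the fixed base and subtracting the same angle from the first joint of every stored mean; the descriptor layout and names are our assumptions:

```python
import math

def canonicalize(primitive, means):
    """Rotate a two-circle primitive about the fixed base into a canonical frame.

    primitive: (x1, y1, r1, x2, y2, r2), a hypothetical descriptor layout.
    means: configurations whose first entry is the base joint angle.
    The rotation that moves circle 1 onto the x-axis is subtracted from the
    first joint of every mean, exploiting the arm's rotational invariance.
    """
    x1, y1, r1, x2, y2, r2 = primitive
    theta = math.atan2(y1, x1)               # angle of circle 1 around the base

    def rot(x, y):                           # rotate a point by -theta
        c, s = math.cos(-theta), math.sin(-theta)
        return (c * x - s * y, s * x + c * y)

    nx1, ny1 = rot(x1, y1)
    nx2, ny2 = rot(x2, y2)
    canonical = (nx1, ny1, r1, nx2, ny2, r2)
    shifted = [[q[0] - theta] + list(q[1:]) for q in means]
    return canonical, shifted
```

At query time, the stored canonical sampler can be reused for any rotated copy of the primitive by adding the same angle back to the first joint of the means.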

## V Experiments

In the following experiments, we compared different sampling methods for three of the most representative sampling-based planners: rrt, birrt [Kuffner2000], and prm [Kavraki1996]. We benchmarked against uniform sampling and the Conditional Variational Auto-Encoder (cvae) proposed by [Ichter2018]. The cvae is a neural network trained to produce samples that lie on the optimal path given a workspace description. To make the benchmarking fair, we trained the cvae using the same dataset that was used for the estimation of the gmms. We used the ompl [sucan2012] benchmarking tools [moll2015] in their default settings and ran on an Intel i7 Linux machine with 4 4GHz cores and 16GB of RAM. Each query was repeated 20 times, and the timeout was set at 200 seconds. In the figures where rrt is not shown, it timed out in all queries. The uniform-to-biased sampling ratio and the variance parameter $\sigma$ were fixed to the same values across all experiments.

### V-A 8-link Kinematic Chain

Similar to [Zucker2008], we used a fixed-base planar arm in an environment with obstacles of varying sizes to demonstrate the strength of our approach. The kinematic chain had 8 links with lengths varying from 1 to 2 units, and the circles had radii varying from 1 to 2 units as well. The gap between most of the circles was less than 1 unit, making this a very difficult problem. As local primitives, we consider only pairs of circles that are close together. Examples of such local primitives can be seen inside the colored rectangles of Fig. 2. We pre-computed the database such that any local primitive could be queried with a similarity error less than 3. Utilizing the transformation described in section IV-B, only 800 pairs of obstacles were needed, resulting in a small database and a very fast retrieval time (a few milliseconds). The total time for computing the database was around 10 minutes. We tested our method in three scenarios of increasing difficulty. The start configuration is shown in blue, and the goal configuration in red. Note that we use a log scale for the time axis in all figures except Fig. 6.

#### V-A1 Scenario 1

The first scenario (Fig. 4) contains only one local primitive, thus having a decomposition error of zero and a low similarity error, making it ideal for our method, which found all solutions in a fraction of a second. However, the trained cvae did not succeed in approximating an efficient local sampler, requiring an order of magnitude more time for the same problems. Finally, both prm and birrt with uniform sampling did not solve any problems within the 200s timeout.

#### V-A2 Scenario 2

The second scenario (Fig. 5) has a large decomposition error due to the proximity of the local primitives. Each local sampler is trained only on a pair of circles, so the majority of the produced samples are invalid when other local primitives are nearby. Nevertheless, our method still significantly outperformed the others. Most notably, birrt with our method found solutions in 1s, while the cvae needed around 20s and uniform sampling around 200s. Note also that uniform-prm and cvae-prm timed out in all cases. This scenario shows that the proposed method is very robust to approximation errors and can potentially work in very complicated environments that are decomposed into circles.

#### V-A3 Scenario 3

This scenario (Fig. 6) is similar to the one used in [Zucker2008] but much more difficult due to the closeness of the obstacles and the relatively large size of the robot. Both the decomposition error and the similarity error are large. Fig. 2 shows the different local primitives and the local samplers that were used to create the global sampler. Note that one circle is not part of any local primitive because it is not close to any other circle. In this scenario too, our method outperformed the others, with prm succeeding only when using our method and birrt needing more than 10 minutes to find a solution with uniform sampling.

### V-B 8-DOF Robot

We also experimented on a simulated Fetch robot [Wise2016] performing an object manipulation task. The robot has a 7-DOF arm and a movable torso, resulting in an 8-DOF C-Space. The Fetch tries to place its arm between the cylinders, with the start and goal shown in (a); this is a difficult problem, as it requires reaching into a deep shelf.

To demonstrate the practicality of this approach, we constructed a small database using only the local primitives that were present in the test scene. These local primitives were 3 pairs, each consisting of one of the cylinders and the bookcase. Note that since each local primitive contains only one cylinder, it is significantly easier to solve than the full problem. We tested the proposed method on 2 scenarios. The first scenario has the same local primitives that existed in the database, and thus a similarity error of zero but a high decomposition error. The second scenario has thicker cylinders with double the radius, which introduces a similarity error.

We only benchmarked against uniform sampling, since there was no sufficiently rich dataset to train the cvae. In the results, the increased difficulty of the second scenario is clearly visible. Both planners had zero or low success rates with uniform sampling, while our approach succeeded in all queries.

## VI Conclusions

In this work, we proposed a new sampling-biasing framework that is based on a decomposition of the workspace. We considered only simple geometric primitives, yet we solved problems that the other methods we considered either could not solve at all or could not solve efficiently. Although we consider our results preliminary, we believe that this work paves a new way to apply experience in motion planning problems. Future work could include the use of more complicated primitives that are general enough to effectively decompose any workspace. Finally, as the database grows in size, efficient real-time retrieval algorithms should be used.

## Acknowledgments

The authors thank B. Willey, J. Hernandez, J. Abella, and M. Moll for their valuable and interesting conversations. The authors especially thank Z. Kingston for the visualization and benchmarking tools that made the Fetch experiment possible.