Learning-Based Proxy Collision Detection for Robot Motion Planning Applications

02/21/2019 ∙ by Nikhil Das, et al. ∙ University of California, San Diego

This paper demonstrates that collision detection-intensive applications such as robotic motion planning may be accelerated by performing collision checks with a machine learning model. We propose Fastron, a learning-based algorithm to model a robot's configuration space to be used as a proxy collision detector in place of standard geometric collision checkers. We demonstrate that leveraging the proxy collision detector results in up to an order of magnitude faster performance in robot simulation and planning than state-of-the-art collision detection libraries. Our results show that Fastron learns a model more than 100 times faster than a competing C-space modeling approach, while also providing theoretical guarantees of learning convergence. Using the OMPL motion planning libraries, we were able to generate initial motion plans across all experiments with varying robot and environment complexities. With Fastron, we can repeatedly perform planning from scratch at a 56 Hz rate, showing its application toward autonomous surgical assistance tasks in shared environments with human-controlled manipulators. All performance gains were achieved despite using only CPU-based calculations, suggesting further computational gains with a GPU approach that can parallelize tensor algebra. Code is available online.




I Introduction

Motion planning, the task of determining a path for a robot from a start to a goal position while avoiding obstacles, is a requirement for almost all applications in which a robot must move in its environment. Motion planning for a robot is often performed in its configuration space (C-space), a space in which each element represents a unique configuration of the robot [1]. The joint space of a robot manipulator is an example C-space, as a set of joint positions may be sufficient to fully define the pose of a robot manipulator in the workspace. The dimensionality of the C-space matches the number of controllable degrees of freedom (DOF) of the robot.

The workspace may include restrictions on the set of feasible robot configurations. Examples include workspace obstacles (such as the three cubes in Fig. 1(a) restricting the 3 DOF arm), self-collisions, or joint limits. For a given workspace, each configuration in the C-space belongs to one of two subspaces: $\mathcal{C}_{free}$ or $\mathcal{C}_{obs}$. A configuration is in $\mathcal{C}_{free}$ if the robot is not in contact with any workspace obstacle when in that corresponding configuration; otherwise, the configuration is in $\mathcal{C}_{obs}$ [1, 2]. The robot and obstacles in Fig. 1(a) are represented as the point and amorphous bodies in the C-space representation in Fig. 1(b), respectively. The advantage of planning in C-space is that it is easier to plan how to move a point than it is to move bodies with volume [1].

On the other hand, a disadvantage of the concept of C-space is that no simple closed-form expression usually exists to perfectly separate $\mathcal{C}_{free}$ from $\mathcal{C}_{obs}$ [2, 3, 4]. Determining to which subspace a single configuration belongs typically requires discrete collision detection between the robot and the obstacles [4]. For actual robotic systems, in addition to performing tests for intersection of robot links with workspace obstacles, the collision detection cycle also includes performing forward kinematics to determine where robot geometry would be located and using sensor readings to obtain obstacle information. Repeated queries to a collision checker are computationally expensive, taking up to 90% of computation time for sampling-based motion planning [5]. Apart from motion planning [1, 5], applications that may require numerous collision checks include reinforcement learning [6], robot self-collision analysis [7], and robot simulations [8].

(a) Workspace
(b) Configuration space
Figure 1: Example workspace with a 3 DOF robot and multiple cube obstacles (left) and its corresponding C-space model generated using the Fastron algorithm (right). Each workspace obstacle on the left and its corresponding C-space obstacle on the right match in color. C-space obstacles due to the table are excluded to improve clarity. The robot’s current configuration is represented as the blue point on the right.

I-A Contributions

In this paper, we present a novel approach to collision detection using machine learning techniques and apply this approach to motion planning. Building and querying a computationally cheaper model to bypass the collision detection cycle relieves much of the computational burden due to collision checking, allowing faster sampling-based motion planning.

The subspaces of C-space may be modeled using machine learning techniques, allowing the model to serve as a proxy to the collision checking cycle. Moving the obstacles in the workspace will invalidate any C-space model trained on a specific static environment. Our previous work introduced a fast training algorithm, Fastron, that efficiently checks for changes in the C-space due to moving workspace obstacles [9]. The training procedure of Fastron was inspired by the kernel perceptron algorithm [10] due to its simplicity in implementation and its ability to model training points in an online manner; modifications were made to make training fast and allow efficient adaptation to a dataset with labels that change over time.

The original Fastron algorithm worked with a fixed dataset $\mathcal{X}$ of robot configurations and relabeled the collision statuses of the configurations when workspace obstacles moved, which avoided having to generate new configurations and recompute the distance matrix needed for model updates. A disadvantage of retaining a fixed dataset is that a large amount of memory is required to store $\mathcal{X}$ and its associated distance matrix. The amount of memory required was often much larger than necessary because typically only a small subset of $\mathcal{X}$ is needed during configuration classification. Furthermore, using a fixed dataset did not allow augmenting $\mathcal{X}$ with new data, causing the model to be dependent on the original distribution of random samples.

While Fastron originates from our prior work, this paper revamps the algorithm and provides significant theoretical and empirical validation of Fastron for proxy collision detection. In this paper, we address the disadvantages of a fixed dataset by using a dynamically changing dataset to allow resampling and a reduced memory footprint while addressing the scalability of the system to complex environments. We also introduce lazy Gram matrix evaluation and utilize a cheaper kernel to further improve training and classification time. We analytically show our training algorithm will always create a model yielding positive margin for all training points. Finally, we demonstrate the capabilities of Fastron empirically by showing that Fastron allows faster motion planning on a variety of robots and sampling-based motion planners. All results in this paper were obtained using only CPU-based calculations. Components of the algorithm such as query classification and dataset labeling are easily parallelizable, which could yield even further improvements.

I-B Related Work

Previous works have utilized machine learning to model the subspaces of C-space or to bypass or accelerate collision checking in motion planning.

Pan et al. [4] use incremental support vector machines to represent an accurate collision boundary in C-space for a pair of objects and an active learning strategy to iteratively improve the boundary. This method is suitable for changing environments because moving one body relative to another body’s frame is simply represented as translating a point in configuration space. However, a new model is required for each pair of objects, and each model must be trained offline.

Figure 2: Block diagram illustrating the pipeline of the Fastron algorithm. The collision detector assigns true labels $\mathbf{y}$ for the points in the dataset $\mathcal{X}$. The model update determines the weights $\boldsymbol{\alpha}$ and support set $\mathcal{S}$, which may be used for proxy collision detection. The active learning strategy augments $\mathcal{S}$ with a set $\mathcal{A}$ of unlabeled configurations before the cycle repeats.

Huh et al. [11] use Gaussian mixture models (GMMs) to represent in-collision and collision-free regions specifically for use in motion planning using Rapidly-Exploring Random Trees (RRTs) [12]. The GMMs are used both as a proxy collision detector by labeling query points according to Mahalanobis distances from the GMM means and as sampling distributions when finding new nodes for RRT tree expansion. Their results show their GMM-based RRTs can generate motion plans up to 5 times faster than a bidirectional RRT. A downside to this algorithm is the RRT routine must be called repeatedly to generate enough data to achieve consistent proxy collision detector performance, e.g., 3 to 4 RRT calls were required in their algorithm execution time trials. Relying on repeated RRTs for data collection may slow performance, especially when the environment is continuously changing. Furthermore, a downside to GMMs is that the number of components in the mixture is fixed, which suggests it may be more difficult to model all possible C-space obstacles for a given robot than it would be when using a nonparametric model.

k-nearest neighbor (KNN) models have also been used for C-space modeling [3] and proxy collision detection [13], but have only been implemented in static environments. Using KNNs for collision detection in sampling-based motion planning may generate motion plans up to 2 times faster [13]. Disadvantages of the KNN approach include the need to store the entire training set and the difficulty of adapting to a changing environment.

Neural networks have been applied to perform collision detection for box-shaped objects and have achieved sufficiently low error to calculate collision response in physics simulations [14]. A disadvantage of the neural network approach is that there is typically no formulaic method to determine the optimal set of parameters for neural networks, which in this case required training thousands of networks to find the best-performing network. A very large amount of data was required to train and cross-validate the models. Finally, this method has only been tried on box obstacles, suggesting a new network must be trained for other objects.

Qureshi et al. propose a potentially transformative approach to motion planning that bypasses collision checking during motion planning runtime by directly generating waypoints with MPNet, a pair of neural networks that encodes the workspace and generates feasible motion plans [15]. Motion plans may be generated up to 100 times faster with MPNet than with the state-of-the-art BIT* motion planning method. One limitation is the large amount of data needed to train MPNet.

The purpose of the Fastron algorithm is to provide a global C-space model that can adapt to a changing workspace. As an online algorithm, all data collection and all training occur during runtime, enabling adaptability to various or changing environments without requiring a large amount of a priori data or multiple instances of models to account for various scenarios. Additionally, Fastron directly adapts its discriminative model to correct misclassified configurations rather than relying on samples to influence generative models, which allows efficient updates of the decision boundary separating $\mathcal{C}_{free}$ from $\mathcal{C}_{obs}$ in continuously changing environments.

II Methods

In this section, we describe the components of the Fastron algorithm. We begin by describing the binary classification problem before going into the details of how Fastron trains and updates a model. These details also include a proof that the algorithm will eventually find a model that correctly classifies all points in the training dataset. We finally address some practical aspects to consider when implementing the algorithm. The block diagram in Fig. 2 shows the pipeline of the entire algorithm.

II-A Binary Classification

The goal of training a binary classifier is to find a model whose output predicts to which of two classes a query point belongs. The parameters and weights chosen for the model are based on a training set $\mathcal{X}$ and its associated training labels $\mathbf{y}$. In this paper, the elements of $\mathcal{X}$ are $d$-dimensional robot configurations, e.g., the joint angles defining a robot manipulator’s position. We assume further that these joint angles may be scaled to a subspace of $\mathbb{R}^d$. (Footnote 2: We apply joint limits to C-spaces that involve 1-sphere spaces; e.g., for the 3 DOF robot in Fig. 1(a), the base yaw, elbow pitch, and base pitch revolute joints are each restricted to a bounded interval.) We use the label $y_i = +1$ to denote that $x_i$ is in the in-collision class and $y_i = -1$ to denote that $x_i$ is in the collision-free class.

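Since the labels are $\pm 1$, one possible model for prediction may be $\hat{y}(x) = \text{sign}(f(x))$, where $f(x)$ is some hypothesis function. The hypothesis function may be $f(x) = \sum_i \alpha_i\, \phi(x_i) \cdot \phi(x)$, where $\phi(\cdot)$ is a mapping to some (often higher-dimensional) feature space and the weight vector $\boldsymbol{\alpha}$ contains a weight $\alpha_i$ for each $x_i \in \mathcal{X}$. This hypothesis can be interpreted as a weighted sum comparing the query configuration to each training configuration and is used in discriminative models such as kernel perceptrons [10] and support vector machines [16, 17]. $f$ may be written in matrix form for the configurations in $\mathcal{X}$ as $\mathbf{F} = \mathbf{K}\boldsymbol{\alpha}$, where $F_i = f(x_i)$ and $\mathbf{K}$ is the Gram matrix for $\mathcal{X}$ where $K_{ij} = \phi(x_i) \cdot \phi(x_j)$.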

The goal of the Fastron learning algorithm is to find $\boldsymbol{\alpha}$ such that $\text{sign}(\mathbf{K}\boldsymbol{\alpha}) = \mathbf{y}$. Equivalently, the goal is to find $\boldsymbol{\alpha}$ such that the margin $m(x_i) = y_i f(x_i) > 0$ for all $i$. Only the configurations in $\mathcal{X}$ with a nonzero weight are needed to compute a label prediction, and the rest may be discarded after an $\boldsymbol{\alpha}$ satisfying the positive margin condition is found. The configurations with nonzero weights comprise the support set $\mathcal{S}$ of the model. Once the Fastron model is trained, $\text{sign}(f(x))$ may be used as a proxy to performing collision detection for a query configuration $x$.
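As a minimal sketch of proxy collision detection with a trained model, the weighted-sum hypothesis can be evaluated over the support set only. The function names, the placeholder similarity function, and the two-point support set below are illustrative assumptions, not part of the paper's implementation; a zero score is arbitrarily mapped to the collision-free label here.

```python
def hypothesis(query, support_points, weights, kernel):
    """Weighted sum of kernel similarities between the query and each support point."""
    return sum(w * kernel(s, query) for s, w in zip(support_points, weights))


def proxy_collision_check(query, support_points, weights, kernel):
    """Return +1 (predicted in collision) or -1 (predicted collision-free)."""
    return 1 if hypothesis(query, support_points, weights, kernel) > 0 else -1


def inverse_quadratic(a, b):
    # Placeholder similarity (an assumption for this sketch): equals 1 for
    # identical points and decays with squared distance.
    r2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return 1.0 / (1.0 + r2)


# Hypothetical two-point support set: one in-collision point (positive weight)
# and one collision-free point (negative weight).
support = [(0.0, 0.0), (1.0, 1.0)]
weights = [1.0, -1.0]

print(proxy_collision_check((0.1, 0.0), support, weights, inverse_quadratic))
print(proxy_collision_check((0.9, 1.0), support, weights, inverse_quadratic))
```

Because only support points enter the sum, the cost of one query scales with the size of the support set rather than with the size of the full training set.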

II-B Kernel Function

Figure 3: The rational quadratic kernel $K_{RQ}(\cdot,\cdot)$ is an approximation of the more expensive Gaussian kernel $K_G(\cdot,\cdot)$.

A kernel function $K(x, x')$ compares a configuration $x$ to $x'$ by mapping both to some feature space and taking an inner product. $K(\cdot,\cdot)$ should provide a large score for two similar configurations and a low score for dissimilar configurations. The Gram matrix $\mathbf{K}$ where $K_{ij} = K(x_i, x_j)$ is thus a similarity matrix and is a useful tool in machine learning techniques, such as support vector machines [16, 17] and kernel perceptrons [10], where similarity between two samples may suggest they share classification labels.

A popular kernel function is the Gaussian kernel, defined as $K_G(x, x') = \exp(-\gamma \|x - x'\|^2)$, where $\gamma$ is a parameter dictating the width of the kernel. $K_G(\cdot,\cdot)$ is often considered the default kernel choice for kernel-based machine learning applications when there is limited prior knowledge of the underlying structure of the data [18].

$K_G(\cdot,\cdot)$ is known to be a positive semidefinite kernel [19], which means $\mathbf{K}$ is a positive definite matrix when each configuration in $\mathcal{X}$ is unique [20]. According to Mercer’s theorem, for each positive semidefinite kernel, there exists a mapping $\phi: \mathcal{X} \rightarrow \mathcal{H}$, where $\mathcal{H}$ is a Hilbert space, such that $K(x, x') = \langle \phi(x), \phi(x') \rangle_{\mathcal{H}}$ [21]. Thus, there is no need for explicit mapping to a feature space when utilizing the Gaussian kernel function; computing the inner product without explicitly mapping to a feature space is known as the kernel trick. The kernel trick is useful when learning a classifier that requires richer features than provided in the input space but explicit mapping to a higher-dimensional space may be difficult or time-consuming. As the shapes of $\mathcal{C}_{free}$ and $\mathcal{C}_{obs}$ are typically complex for robotic applications, this implicit mapping to a richer feature space allows easier separation of the two classes.

While the Gaussian kernel is popular for machine learning applications, exponential evaluations are computationally expensive operations, which can slow down training and classification, especially for large datasets. Exponentiation can be avoided by representing the Gaussian kernel with a limit:

$$\exp\left(-\gamma \|x - x'\|^2\right) = \lim_{p \to \infty} \left(1 + \frac{\gamma}{p} \|x - x'\|^2\right)^{-p}$$

For finite values of $p$, the above kernel is the rational quadratic kernel [19], which we represent as $K_{RQ}(\cdot,\cdot)$. The rational quadratic kernel may alternatively be derived as a superposition of Gaussian kernels of varying kernel widths: $K_{RQ}(x, x') = \int_0^\infty P(\tau) \exp(-\tau \|x - x'\|^2)\, d\tau$, where $P(\tau)$ is a gamma distribution [19]. The sum of positive semidefinite kernels is also a positive semidefinite kernel [20]. As a sum of Gaussian kernels, the rational quadratic kernel satisfies the positive semidefiniteness property, showing that $K_{RQ}(\cdot,\cdot)$ provides the result of an inner product in a richer Hilbert space via the kernel trick according to Mercer’s theorem and yields positive definite Gram matrices when each configuration in $\mathcal{X}$ is unique.

Better approximations to the Gaussian kernel are achieved with higher values of $p$, as seen in Fig. 3. When $p$ is a power of 2, an efficient implementation of the rational quadratic kernel may take advantage of repeated squaring. In this paper, we use $p = 2$, as the approximation to a Gaussian need not be extremely precise to generate a similarly performing classifier. Using the Eigen C++ library, completely computing a Gram matrix takes 257.0 ms using $K_G(\cdot,\cdot)$, while using $K_{RQ}(\cdot,\cdot)$ takes 171.3 ms, illustrating that kernel evaluations can be performed 1.5 times faster.
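The two kernels can be compared with a short sketch. It assumes the rational quadratic form $(1 + \frac{\gamma}{p} r^2)^{-p}$ with $p$ restricted to powers of 2 so that the power is obtained by repeated squaring; the function and parameter names (`log2_p`, `r2`) are hypothetical and chosen only for this illustration.

```python
import math


def gaussian_kernel(r2, gamma):
    """Gaussian kernel as a function of squared distance r2."""
    return math.exp(-gamma * r2)


def rational_quadratic_kernel(r2, gamma, log2_p=1):
    """(1 + gamma/p * r2)^(-p) for p = 2**log2_p, without calling exp or pow.

    The p-th power is built by squaring the base log2_p times.
    """
    p = 2 ** log2_p
    base = 1.0 + gamma / p * r2
    for _ in range(log2_p):
        base *= base
    return 1.0 / base


# Compare the approximation error at one squared distance for growing p.
r2, gamma = 1.0, 1.0
for log2_p in (1, 2, 4, 6):
    approx = rational_quadratic_kernel(r2, gamma, log2_p)
    error = abs(approx - gaussian_kernel(r2, gamma))
    print(f"p = {2 ** log2_p:3d}  K_RQ = {approx:.4f}  |K_RQ - K_G| = {error:.4f}")
```

With `log2_p=1` (that is, $p = 2$), one kernel evaluation costs one multiplication for the squaring and one division, which is consistent with the motivation of avoiding exponentiation.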

II-C Weight Update

II-C1 Derivation of Update Rule

The Fastron algorithm is inspired by the kernel perceptron in the sense that it iteratively adds or adjusts the weights for configurations that are incorrectly classified by the current model [10]. Unlike the perceptron algorithm, the Fastron algorithm prioritizes the configuration in $\mathcal{X}$ with the most negative margin and adjusts its weight such that the configuration is forced to be correctly classified immediately after the update, which is not guaranteed by the standard perceptron algorithm. These changes result in a model with a large number of weights equal to 0 that generally takes fewer updates to converge to a solution compared to the perceptron algorithm.

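Input: Training dataset of configurations $\mathcal{X}$, collision status labels $\mathbf{y}$
Parameters: Maximum number of iterations $iter_{max}$, maximum number of support points $S_{max}$, conditional bias parameter $\beta$, kernel parameter $\gamma$ for Gram matrix calculations
Output: Updated $\boldsymbol{\alpha}$, support set of configurations $\mathcal{S}$
// Get weights, hypothesis, and Gram matrix from previous update
load $\boldsymbol{\alpha}$, $\mathbf{F}$, $\mathbf{K}$ from the previous update (all zeros if none exists)
// Back up previous $\boldsymbol{\alpha}$ and $\mathbf{F}$
$\boldsymbol{\alpha}_{prev} \leftarrow \boldsymbol{\alpha}$, $\mathbf{F}_{prev} \leftarrow \mathbf{F}$
for $iter = 1$ to $iter_{max}$ do
    // Check for misclassifications
    if $\min_j y_j F_j \le 0$ then
        $i \leftarrow \arg\min_j y_j F_j$
        // Add/adjust support point
        if $\alpha_i \ne 0$ OR $|\mathcal{S}| < S_{max}$ then
            compute column $i$ of $\mathbf{K}$ if it has not been computed yet
            $\Delta\alpha \leftarrow \beta^{(y_i+1)/2} y_i - F_i$
            $\alpha_i \leftarrow \alpha_i + \Delta\alpha$, $\mathbf{F} \leftarrow \mathbf{F} + \Delta\alpha\, \mathbf{K}\mathbf{e}_i$
            continue
    // Back up $\boldsymbol{\alpha}$ and $\mathbf{F}$
    $\boldsymbol{\alpha}_{prev} \leftarrow \boldsymbol{\alpha}$, $\mathbf{F}_{prev} \leftarrow \mathbf{F}$
    // Remove redundant support points
    if $\max_{j:\, \alpha_j \ne 0} y_j (F_j - \alpha_j) > 0$ then
        $i \leftarrow \arg\max_j y_j (F_j - \alpha_j)$ subject to $\alpha_j \ne 0$
        $\mathbf{F} \leftarrow \mathbf{F} - \alpha_i \mathbf{K}\mathbf{e}_i$, $\alpha_i \leftarrow 0$
        continue
    break
// Revert solution if prior was better
if $\mathbf{F}_{prev}$ misclassifies fewer training points than $\mathbf{F}$ then
    $\boldsymbol{\alpha} \leftarrow \boldsymbol{\alpha}_{prev}$, $\mathbf{F} \leftarrow \mathbf{F}_{prev}$
// Remove all elements corresponding to non-support points
discard the entries of $\boldsymbol{\alpha}$, $\mathbf{F}$, $\mathbf{K}$, $\mathcal{X}$, $\mathbf{y}$ for which $\alpha_i = 0$
return $\boldsymbol{\alpha}$, $\mathcal{S}$
Algorithm 1 Fastron Model Updating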

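By prioritizing the training configuration with the most negative margin, we guarantee that the weight we are updating is for a misclassified configuration. In what follows, parenthetical superscripts denote the training iteration upon which the given value depends. If $x_i$ has the most negative margin of all configurations in $\mathcal{X}$ on iteration $n$, $\alpha_i$ must be adjusted by $\Delta\alpha$ such that $y_i f^{(n)}(x_i) > 0$ (thereby ensuring $x_i$ is correctly classified immediately after the weight update):

$$y_i \left( f^{(n-1)}(x_i) + \Delta\alpha\, K(x_i, x_i) \right) > 0$$

As $K(x_i, x_i) = 1$ when using the rational quadratic kernel, the above condition may be simplified to $y_i \left( f^{(n-1)}(x_i) + \Delta\alpha \right) > 0$, which may be enforced by setting

$$\Delta\alpha = y_i - f^{(n-1)}(x_i) \tag{3}$$

This update rule makes $f^{(n)}(x_i) = y_i$. The weight and training hypothesis vector updates are thus (where $\mathbf{e}_i$ is the $i$th standard basis vector):

$$\boldsymbol{\alpha}^{(n)} = \boldsymbol{\alpha}^{(n-1)} + \Delta\alpha\, \mathbf{e}_i \tag{4}$$

$$\mathbf{F}^{(n)} = \mathbf{F}^{(n-1)} + \Delta\alpha\, \mathbf{K}\mathbf{e}_i \tag{5}$$

Note that only one element of $\boldsymbol{\alpha}$ is updated per iteration, while all elements of $\mathbf{F}$ are updated per iteration.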
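The update rule can be sketched as a short training loop. This is a simplified illustration under stated assumptions, not the full model update: it uses a fully precomputed Gram matrix, omits the conditional bias, the support point cap, and redundant support point removal, and all names (`rq_kernel`, `train`, `max_iters`) and the four-point 1-D dataset are hypothetical.

```python
def rq_kernel(a, b, gamma=10.0):
    """Rational quadratic kernel with p = 2; note that rq_kernel(x, x) == 1."""
    r2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return (1.0 + gamma / 2.0 * r2) ** -2


def train(K, y, max_iters=1000):
    """Greedy weight updates: always fix the point with the most negative margin."""
    n = len(y)
    alpha = [0.0] * n   # weights, one per training point
    F = [0.0] * n       # hypothesis values f(x_j) at the training points
    for _ in range(max_iters):
        i = min(range(n), key=lambda j: y[j] * F[j])
        if y[i] * F[i] > 0:
            break                       # every training point has positive margin
        delta = y[i] - F[i]             # weight change that makes f(x_i) = y_i
        alpha[i] += delta               # one element of alpha changes
        for j in range(n):              # every element of F changes
            F[j] += delta * K[j][i]
    return alpha, F


# Hypothetical 1-D dataset: two in-collision points, two collision-free points.
X = [(0.0,), (0.2,), (1.0,), (1.2,)]
y = [1, 1, -1, -1]
K = [[rq_kernel(a, b) for b in X] for a in X]

alpha, F = train(K, y)
print("weights:", alpha)
print("margins:", [yj * fj for yj, fj in zip(y, F)])
```

In this small example only some of the weights become nonzero, which illustrates the sparsity that the greedy selection is intended to promote.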

II-C2 Alternative Derivation of Update Rule

Representing the problem with a loss function provides an alternative method for deriving the update rule and yields interesting insights. Consider the loss function

$$\mathcal{L}(\boldsymbol{\alpha}) = \frac{1}{2} \boldsymbol{\alpha}^T \mathbf{K} \boldsymbol{\alpha} - \mathbf{y}^T \boldsymbol{\alpha} \tag{6}$$

This loss function can be derived using the method of Lagrange multipliers, where we seek to minimize $\frac{1}{2}\boldsymbol{\alpha}^T \mathbf{K} \boldsymbol{\alpha}$ (equivalent to maximizing the distance of the projections of training points in the feature space) subject to the constraint $y_i f(x_i) \ge 1$ for all $i$. The loss function in Eq. 6 is similar to that used in support vector machines [16], but the constraint when minimizing Eq. 6 is now $y_i \alpha_i \ge 0$ for all $i$. See the Appendix for the derivation of Eq. 6 for our definition of the hypothesis function $f$.

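A quadratic programming solver may be used to minimize Eq. 6, but finding the optimal solution is undesirable because of the computational effort required and the lack of sparsity in the result. To improve the training time and sparsity of the model, Fastron takes a greedy coordinate descent approach and terminates when $y_i f(x_i) > 0$ for all $i$. Coordinate descent updates one element of $\boldsymbol{\alpha}$ per iteration while leaving all other elements fixed [22]. On iteration $n$, coordinate descent minimizes $\mathcal{L}$ along the $\alpha_i$ axis by setting $\alpha_i^{(n)}$ to the solution of $\partial \mathcal{L} / \partial \alpha_i = 0$:

$$\alpha_i^{(n)} = \frac{1}{K_{ii}} \left( y_i - \sum_{j \ne i} K_{ij}\, \alpha_j^{(n-1)} \right) \tag{7}$$

Replacing $\alpha_i^{(n)}$ in Eq. 7 with $\alpha_i^{(n-1)} + \Delta\alpha$ and using $K_{ii} = 1$, we realize $\Delta\alpha = y_i - f^{(n-1)}(x_i)$, matching the result in Eq. 3. It follows that $\boldsymbol{\alpha}$ and $\mathbf{F}$ will be incremented as shown in Eq. 4 and 5.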

(a) 2 DOF,
(b) 4 DOF,
Figure 4: TPR and TNR of Fastron for various conditional bias values $\beta$ for the 2 and 4 DOF robots shown in Fig. 5. The ground truth labels are provided using the GJK [23] collision detection method. Increasing $\beta$ improves TPR at the cost of TNR, but the effects taper off as $\beta$ gets large.
(a) 2 DOF
(b) 4 DOF
Figure 5: Example environments with up to 4 cube obstacles and simple 2 and 4 DOF robots. There are half as many links as there are joints, and each pair of joints is overlapping.

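Updating $\alpha_i$ by $\Delta\alpha$ changes $\mathcal{L}$ as follows:

$$\Delta\mathcal{L} = \mathcal{L}\left(\boldsymbol{\alpha}^{(n-1)} + \Delta\alpha\, \mathbf{e}_i\right) - \mathcal{L}\left(\boldsymbol{\alpha}^{(n-1)}\right) \tag{8}$$

$$= \Delta\alpha \left( f^{(n-1)}(x_i) - y_i \right) + \frac{1}{2} \Delta\alpha^2 K_{ii} \tag{9}$$

$$= -\frac{1}{2} \left( y_i - f^{(n-1)}(x_i) \right)^2 \tag{10}$$

The coordinate descent direction is selected greedily, i.e., the $\alpha_i$ to update should be selected such that there is maximal decrease in loss. Additionally, since the algorithm will terminate when $y_i f(x_i) > 0$ for all $i$, the directions considered are restricted to those where $y_i f^{(n-1)}(x_i) \le 0$. Note that this restriction ensures that if $\boldsymbol{\alpha}^{(n-1)}$ satisfies $y_i \alpha_i^{(n-1)} \ge 0$, then $\boldsymbol{\alpha}^{(n)}$ satisfies $y_i \alpha_i^{(n)} \ge 0$, because $y_i \Delta\alpha = 1 - y_i f^{(n-1)}(x_i) > 0$. Initializing $\boldsymbol{\alpha} = \mathbf{0}$ allows these conditions to be satisfied during the updates. Maximizing the decrease in loss given the negative margin restriction shows that greedy coordinate descent selects the $\alpha_i$ to update according to:

$$i = \arg\min_j \Delta\mathcal{L} \quad \text{s.t. } y_j f^{(n-1)}(x_j) \le 0 \tag{11}$$

$$= \arg\max_j \left( 1 - y_j f^{(n-1)}(x_j) \right)^2 \quad \text{s.t. } y_j f^{(n-1)}(x_j) \le 0 \tag{12}$$

$$= \arg\max_j \left( 1 - y_j f^{(n-1)}(x_j) \right) \tag{13}$$

$$= \arg\min_j\, y_j f^{(n-1)}(x_j) \tag{14}$$

Eq. 14 shows that the training point with the most negative margin (intuitively, the point farthest from the separating hyperplane on the wrong side) is prioritized when updating. The combination of the update rule in Eq. 7 and the descent direction in Eq. 14 guarantees $\Delta\mathcal{L} \le -\frac{1}{2}$ because $\left(1 - y_i f^{(n-1)}(x_i)\right)^2 \ge 1$ whenever there exists a misclassified training point, i.e., $y_i f^{(n-1)}(x_i) \le 0$. In other words, each iteration guarantees a decrease in loss by at least $\frac{1}{2}$ whenever there are still training points with nonpositive margin.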

Claim 1.

Minimization of $\mathcal{L}$ with the greedy coordinate descent rule defined in Eq. 7 and 14 will always eventually yield a hypothesis with positive margin for all samples given a nonsingular Gram matrix $\mathbf{K}$.


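Proof. If $\min_i y_i f^{(n)}(x_i) \le 0$, the upper bound on the change in loss per descent step is $-\frac{1}{2}$. A lower bound of $\mathcal{L}$ is $-\frac{1}{2} \mathbf{y}^T \mathbf{K}^{-1} \mathbf{y}$ for nonsingular $\mathbf{K}$. The margin is exactly 1 for all samples when $\mathcal{L} = -\frac{1}{2} \mathbf{y}^T \mathbf{K}^{-1} \mathbf{y}$. A loose upper bound on the number of descent steps required to reach this minimum from initial loss $\mathcal{L}^{(0)}$ is $2\mathcal{L}^{(0)} + \mathbf{y}^T \mathbf{K}^{-1} \mathbf{y}$.

If $\min_i y_i f^{(n)}(x_i) > 0$, the hypothesis at iteration $n$ successfully provides a positive margin for all samples. ∎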

Claim 1 means the weight update algorithm can terminate once all training samples have positive margin or will otherwise work toward achieving positive margin for all samples. As established in Section II-B, $\mathbf{K}$ is a positive definite matrix if each configuration in $\mathcal{X}$ is unique (which is the case when sampling from a continuous space), implying $\mathbf{K}$ is nonsingular in practice. While this upper bound may be large, the total number of iterations required to achieve positive margin for all samples is typically much smaller, as will be shown empirically by the short training times in Section III. In the case that training takes longer than desired, an iteration limit may be defined for early termination (the maximum number of iterations in Algorithm 1) at the cost of yielding a classifier with lower accuracy.

II-D Conditional Bias Parameter

As with the standard kernel perceptron, the Fastron model does not have an additive bias term. On the other hand, the kernel SVM model contains a bias term $b$ in its hypothesis: $f(x) = \sum_i \alpha_i K(x_i, x) + b$. The benefit of including a bias term is that points with the label $\text{sign}(b)$ are more likely to be classified correctly because the model is universally biased toward labeling query points as $\text{sign}(b)$.

As false negatives (mistaking $\mathcal{C}_{obs}$ for $\mathcal{C}_{free}$) are more costly than false positives (mistaking $\mathcal{C}_{free}$ for $\mathcal{C}_{obs}$) in the context of collision detection for motion planning, we would like to bias the model toward the $+1$ label to err on the side of caution. Rather than trying to learn a bias term, we instead conditionally multiply the target values by a user-selected value during training such that $\mathcal{C}_{obs}$ points are more likely to be classified correctly. More specifically, we adjust the rule defined in Eq. 3 to

$$\Delta\alpha = \beta^{(y_i + 1)/2}\, y_i - f^{(n-1)}(x_i)$$

where the exponent $(y_i + 1)/2$ is 1 for $y_i = +1$ and 0 for $y_i = -1$ (assuming the labels are $\pm 1$) and $\beta$ is a user-selected conditional bias parameter. When $\beta > 1$, weight updates are larger when correcting for a $\mathcal{C}_{obs}$ configuration compared to a $\mathcal{C}_{free}$ configuration. Larger weights for $\mathcal{C}_{obs}$ points would ultimately influence a larger neighborhood in C-space, thereby padding C-space obstacles and potentially increasing true positive rate.

Fig. 4 shows the effect of the conditional bias parameter on true positive rate (TPR) and true negative rate (TNR) (averaged over environments with randomly placed obstacles) for the 2 DOF and 4 DOF robots shown in Fig. 5. Increasing $\beta$ improves TPR at the cost of TNR, but the effects taper off as $\beta$ gets large.

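Note that an upper bound on the number of iterations still exists when including the conditional bias parameter. As the target values are now $\beta^{(y_i+1)/2} y_i$ instead of $y_i$, we can consider a modified loss function $\mathcal{L}_\beta(\boldsymbol{\alpha}) = \frac{1}{2} \boldsymbol{\alpha}^T \mathbf{K} \boldsymbol{\alpha} - \mathbf{y}^T \mathbf{B} \boldsymbol{\alpha}$, where $\mathbf{B}$ is a diagonal matrix containing the value $\beta^{(y_i+1)/2}$ for each $i$. Noting that for $\beta \ge 1$, $\Delta\mathcal{L}_\beta \le -\frac{1}{2}$ whenever a training point has nonpositive margin, and $\mathcal{L}_\beta \ge -\frac{1}{2} \mathbf{y}^T \mathbf{B} \mathbf{K}^{-1} \mathbf{B} \mathbf{y}$, the new upper bound on iterations to achieve positive margin for all samples is $2\mathcal{L}_\beta^{(0)} + \mathbf{y}^T \mathbf{B} \mathbf{K}^{-1} \mathbf{B} \mathbf{y}$. Once again, in practice, the total number of iterations needed to achieve positive margin for all samples is typically significantly smaller.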
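The effect of the conditional bias can be illustrated with a two-point toy problem. The sketch below assumes scalar configurations, a rational quadratic kernel with $p = 2$ and $\gamma = 1$, and a stripped-down training loop without a support point cap or redundant point removal; all function names and the dataset are hypothetical.

```python
def rq_kernel(a, b, gamma=1.0):
    """Rational quadratic kernel with p = 2 on scalar configurations."""
    return (1.0 + gamma / 2.0 * (a - b) ** 2) ** -2


def train_biased(X, y, beta, max_iters=100):
    """Greedy updates whose target is beta for +1 points and -1 for -1 points."""
    n = len(X)
    alpha, F = [0.0] * n, [0.0] * n
    for _ in range(max_iters):
        i = min(range(n), key=lambda j: y[j] * F[j])
        if y[i] * F[i] > 0:
            break
        target = beta * y[i] if y[i] > 0 else y[i]   # conditional bias on +1 only
        delta = target - F[i]
        alpha[i] += delta
        for j in range(n):
            F[j] += delta * rq_kernel(X[j], X[i])
    return alpha


def predict(x, X, alpha):
    score = sum(a * rq_kernel(xi, x) for xi, a in zip(X, alpha))
    return 1 if score > 0 else -1


# One in-collision point at 0 and one collision-free point at 1.
X, y = [0.0, 1.0], [1, -1]
for beta in (1.0, 3.0):
    alpha = train_biased(X, y, beta)
    print(f"beta = {beta}: label at midpoint 0.5 -> {predict(0.5, X, alpha)}")
```

In this toy case the midpoint between the two training points is labeled collision-free with `beta = 1.0` and in-collision with `beta = 3.0`, which is the obstacle-padding behavior described above.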

II-E Redundant Support Point Removal

A support point whose margin would be positive even if it were not in the support set is redundant. Redundant support points should be removed from $\mathcal{S}$ (by setting their corresponding weights in $\boldsymbol{\alpha}$ to 0) to promote the sparsity of the model. Redundant support point removal is useful when the workspace obstacles move, causing the collision statuses of the points in $\mathcal{X}$ and the decision boundary to change. Removing outdated support points that no longer contribute to the shifted decision boundary reduces model complexity, which keeps the model sparse and computationally efficient in changing environments.

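To remove $x_i$ from $\mathcal{S}$, the weight and training hypothesis vector updates are computed as follows:

$$\mathbf{F}^{(n)} = \mathbf{F}^{(n-1)} - \alpha_i^{(n-1)}\, \mathbf{K}\mathbf{e}_i$$

$$\boldsymbol{\alpha}^{(n)} = \boldsymbol{\alpha}^{(n-1)} - \alpha_i^{(n-1)}\, \mathbf{e}_i$$

The resultant margin at point $x_i$ if it were removed from the support set is $y_i \left( f(x_i) - \alpha_i \right)$. Considering margin to be a measure of how well a point is classified, points are iteratively removed in decreasing order of positive resultant margin until $\max_{i:\, \alpha_i \ne 0} y_i \left( f(x_i) - \alpha_i \right) \le 0$, i.e., removing an additional support point will cause it to be misclassified. We perform redundant support point removal only after a hypothesis that yields positive margin for all training points is found, thus allowing Claim 1 to remain valid.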
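A minimal sketch of one removal pass follows. It assumes a kernel with unit diagonal ($K_{ii} = 1$) and removes points back to back, whereas the full update procedure rechecks for misclassified points after each removal; the function name and the two-point example values are hypothetical.

```python
def remove_redundant_support_points(K, y, alpha, F):
    """Zero out weights of support points that stay correctly classified without them.

    Assumes K[i][i] == 1, so the hypothesis at x_i after removal is F[i] - alpha[i].
    Modifies alpha and F in place and returns them.
    """
    n = len(y)
    while True:
        support = [i for i in range(n) if alpha[i] != 0.0]
        if not support:
            break
        # Candidate with the largest margin that would remain after its removal.
        i = max(support, key=lambda j: y[j] * (F[j] - alpha[j]))
        if y[i] * (F[i] - alpha[i]) <= 0:
            break                      # any further removal would misclassify x_i
        for j in range(n):
            F[j] -= alpha[i] * K[j][i]
        alpha[i] = 0.0
    return alpha, F


# Two nearby in-collision points that both carry weight; one of them is redundant.
K = [[1.0, 0.5], [0.5, 1.0]]
y = [1, 1]
alpha = [1.0, 1.0]
F = [1.5, 1.5]                         # F = K @ alpha
alpha, F = remove_redundant_support_points(K, y, alpha, F)
print(alpha, F)
```

After the pass, one weight is zero while both training points keep a positive margin, so the model is smaller without changing any training label prediction in this example.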

(a) 2 DOF,
(b) 4 DOF,
Figure 6: TPR, TNR, and model sizes (as a ratio of number of support points to training set size, $|\mathcal{S}| / |\mathcal{X}|$) of Fastron for various $\gamma$ values for the 2 and 4 DOF robots shown in Fig. 5. The ground truth labels are provided using the GJK [23] collision detection method. These curves motivate the choices of $\gamma$ for the 2 DOF and 4 DOF cases, in which large TPR and TNR are desired while keeping support set sizes as small as possible.

Each redundant support point removal step reduces the size of the support set by 1. However, following steps similar to those used to find the change in loss in Eq. 10, the change in loss when removing support point $x_i$ is $\Delta\mathcal{L} = \alpha_i \left( y_i - f(x_i) \right) + \frac{1}{2} \alpha_i^2$. This change in loss may be positive or negative depending on the values of $\alpha_i$, $y_i$, and $f(x_i)$, showing that redundant support point removal can potentially step away from the optimal solution. Furthermore, note that redundant support point removal changes the margin at $x_j$ when removing $x_i$ from $\mathcal{S}$ as follows: $m^{(n)}(x_j) = m^{(n-1)}(x_j) - y_j \alpha_i K(x_i, x_j)$. Clearly, the margin decreases for any $x_j$ where $y_j \alpha_i > 0$, and the number of misclassified training points becomes nonzero if $m^{(n-1)}(x_j) \le y_j \alpha_i K(x_i, x_j)$ for any $j$. If training points become misclassified when a support point is removed, more weight updates are required to correct these points. In practice, a solution is still found quickly, as will be shown in the training speed results in Section III. As a safeguard, if the iteration limit is reached and the solution obtained before support point removal has fewer misclassifications than after removal, the more correct solution will be returned, as shown in the revert step at the end of Algorithm 1.

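Input: Support set $\mathcal{S}$
Parameters: Number of configurations to add to dataset $A$, maximum number of exploitation samples to generate near each support point $\kappa$, Gaussian variance $\sigma^2$
Output: Unlabeled set $\mathcal{A}$
$\mathcal{A} \leftarrow \emptyset$
// Exploitation
for $k = 1$ to $\kappa$ do
    foreach $x$ in $\mathcal{S}$ do
        add a sample drawn from $\mathcal{N}(x, \sigma^2 \mathbf{I})$ to $\mathcal{A}$
        if $|\mathcal{A}| = A$ then break
// Exploration
for $k = 1$ to $A - |\mathcal{A}|$ do
    add a uniformly random C-space sample to $\mathcal{A}$
Algorithm 2 Fastron Active Learning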

II-F Active Learning

When the environment changes, the true decision boundary will change. The labels for the training set must be updated using the collision checker prior to updating the model. Rather than resampling the entire space, which may result in many unnecessary collision checks, we can exploit our previous model to search for new information. Active learning is a methodology to query an oracle for additional information and is intended to reduce the amount of labeling needed to update a model [24]. The oracle in this case is the geometry- and kinematics-based collision detection paradigm (KCD).

Our active learning strategy determines a new set $\mathcal{A}$ of configurations to add to the dataset on which to perform collision checks. Collision checks are to be performed on all previous support points $\mathcal{S}$ and on $\mathcal{A}$. As active learning takes place after sparsifying the trained model, the number of collision checks that will be performed for the next model update is always $|\mathcal{S}| + A$, where $A$ is the user-specified number of configurations to add to the dataset.

Fastron’s active learning strategy is conducted in two stages: exploitation and exploration. In the exploitation stage, up to $\kappa$ Gaussian distributed random samples are generated near each support point and are added to $\mathcal{A}$, as shown in the exploitation loop of Algorithm 2. For simplicity, we sample using isotropic Gaussians with variance $\sigma^2$ in each direction. At most $\kappa |\mathcal{S}|$ unlabeled configurations (and never more than $A$) are added to $\mathcal{A}$ during the exploitation stage. The idea behind the exploitation stage is to search for small changes in or improvements to C-space obstacles, such as when workspace obstacles move an incremental amount or when the C-space obstacle boundary may be defined more precisely.

If the number of points in $\mathcal{A}$ is less than $A$, the exploration stage fills the rest of $\mathcal{A}$ with uniformly random samples generated in the C-space, as shown in the exploration loop of Algorithm 2. For certain robots such as manipulators, newly introduced workspace obstacles can cause new C-space obstacles to materialize in difficult-to-predict locations. The purpose of the random exploration stage is thus to search for these new or drastically different C-space obstacles.
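The two-stage sampling can be sketched as follows. The sketch assumes configurations already scaled to $[-1, 1]$ per dimension and leaves Gaussian samples that fall outside that range untouched, since the handling of such samples is not specified here; the function and parameter names are hypothetical.

```python
import math
import random


def active_learning_samples(support_points, dim, num_to_add, max_exploit, variance, rng):
    """Return num_to_add unlabeled configurations: exploitation first, then exploration."""
    sigma = math.sqrt(variance)        # random.gauss expects a standard deviation
    unlabeled = []

    # Exploitation: up to max_exploit isotropic Gaussian samples near each support point.
    for _ in range(max_exploit):
        for s in support_points:
            if len(unlabeled) >= num_to_add:
                break
            unlabeled.append([rng.gauss(c, sigma) for c in s])

    # Exploration: fill the remainder with uniform samples over the input space.
    while len(unlabeled) < num_to_add:
        unlabeled.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    return unlabeled


rng = random.Random(0)
support = [[0.2, -0.4], [-0.7, 0.1]]   # hypothetical support set in a 2-D C-space
new_points = active_learning_samples(support, dim=2, num_to_add=10,
                                     max_exploit=3, variance=0.01, rng=rng)
print(len(new_points), "unlabeled configurations to send to the collision checker")
```

With 2 support points and at most 3 exploitation samples each, 6 of the 10 configurations here come from exploitation and the remaining 4 from exploration.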

II-G Practical Considerations

II-G1 Algorithm Pipeline

The block diagram in Fig. 2 shows the sequence of steps for the algorithm. Initially, a uniformly random set $\mathcal{X}$ of unlabeled configurations is generated and the KCD labels each configuration in $\mathcal{X}$. The KCD requires a snapshot of the obstacles in the current workspace and knowledge of the robot’s kinematics and geometry. We regard the KCD as a black box encompassing the entire collision detection cycle, including forward kinematics to locate robot geometry, sensor readings to obtain obstacle information, and tests for intersection of the robot’s links with obstacles.

The labeled dataset is used to update the Fastron model. The first step of Algorithm 1 loads the weight vector, hypothesis vector, and partially filled Gram matrix from the previous update, or initializes all elements to 0 if a previous model does not exist. After the support set and the weights are determined, non-support points are discarded and the model is ready to be used for proxy collision detection. The final step of Algorithm 1 removes all elements in the weight vector, hypothesis vector, dataset, and Gram matrix that correspond to non-support points. Active learning augments the previous support set with unlabeled configurations. This new dataset is then fed into the KCD before the cycle repeats.

We expect this entire pipeline to run in parallel with all other processes, and the most recent Fastron model is used for proxy collision detection.

II-G2 Joint Limits

This algorithm is intended to be used with robots with joint limits so that there are bounds within which C-space samples may be generated. As we work with isotropic kernels, configurations must be mapped such that the kernels affect the same proportion of each DOF. In this paper, we choose to map the bounded -dimensional joint space to a -dimensional input space . We choose as the input space because if the upper and lower limits for any joint are symmetric about , a joint position of still maps to in input space. If and are vectors containing the upper and lower joint limits, respectively, then the following formula can be applied to map a joint space configuration to an input space point :


where performs element-wise division.
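A minimal sketch of such a mapping is given below. It assumes the input space is the box [-1, 1] in every dimension; this interval is an assumption made for illustration, since the exact interval and formula are not reproduced in this text. The function names `joint_to_input` and `input_to_joint` are hypothetical. Under this assumption, symmetric joint limits send a joint position of 0 to 0, consistent with the property described above.

```python
def joint_to_input(q, q_upper, q_lower):
    """Map joint positions to the assumed input space [-1, 1]^d, element-wise."""
    return [(2.0 * qi - u - l) / (u - l)
            for qi, u, l in zip(q, q_upper, q_lower)]

def input_to_joint(x, q_upper, q_lower):
    """Inverse mapping from the assumed input space back to joint positions."""
    return [0.5 * (xi * (u - l) + u + l)
            for xi, u, l in zip(x, q_upper, q_lower)]
```

Each joint is scaled by its own range, so an isotropic kernel covers the same proportion of every DOF regardless of how wide the individual joint limits are.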

II-G3 Support Point Cap

Whenever a weight update increases the number of nonzero elements in , another point is added to the support set . To limit the computation time during classification, a support point limit may be defined to prevent from growing too large.

During the update steps, if the worst margin occurs at , the weight update can proceed without consideration of because will not increase when adjusting the weight of a preexisting support point. On the other hand, if the worst margin occurs at , is added to only if . Otherwise, if , there is an attempt to remove a redundant support point before continuing to train. If no point can be removed, training is terminated.
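The branching described above can be sketched as a small decision function. This is an illustrative sketch only: `cap_decision`, `support_limit`, and the callback `can_remove_redundant` are hypothetical names, and the criterion that makes a support point redundant is deliberately left to the caller.

```python
def cap_decision(index, weights, support_limit, can_remove_redundant):
    """Decide how to handle the worst-margin point at `index` under a support cap."""
    n_support = sum(1 for w in weights if w != 0.0)
    if weights[index] != 0.0:
        # Already a support point: adjusting its weight cannot grow the set.
        return "update"
    if n_support < support_limit:
        # Room remains under the cap: the point may become a support point.
        return "add"
    if can_remove_redundant():
        # Cap reached: free a slot first, then continue training.
        return "remove_then_continue"
    # Cap reached and nothing can be removed.
    return "terminate"
```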

II-G4 Lazy Gram Matrix Evaluation

An advantage of prioritizing the most negative margin rather than sequentially adjusting the weight for each point in is that not all values of are utilized. The Gram matrix may thus be evaluated lazily. For example, if the most negative margin occurs at point , then only the column of is required for the update. Once the column of is computed, it does not need to be recomputed if needs to be adjusted in a later training iteration. in Algorithm 1 computes the column of the Gram matrix if it has not been computed yet.
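Lazy evaluation of this kind can be sketched with a per-column cache. The class name `LazyGram` and its attributes are hypothetical, and the kernel shown is one common rational quadratic form, (1 + γ/2 · ‖a − b‖²)⁻², assumed here for illustration rather than taken from this text.

```python
class LazyGram:
    """Gram matrix whose columns are computed only when first requested."""

    def __init__(self, points, gamma):
        self.points = points
        self.gamma = gamma
        self.columns = {}      # column index -> list of kernel values
        self.evaluations = 0   # number of columns actually computed

    def kernel(self, a, b):
        # Assumed rational quadratic form: (1 + gamma/2 * ||a - b||^2)^-2
        sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        return (1.0 + 0.5 * self.gamma * sq) ** -2

    def column(self, j):
        # Compute column j on first use; later requests reuse the cache.
        if j not in self.columns:
            xj = self.points[j]
            self.columns[j] = [self.kernel(x, xj) for x in self.points]
            self.evaluations += 1
        return self.columns[j]
```

If training only ever touches a handful of points, only that handful of columns is computed, so the cost scales with the number of distinct points updated rather than with the square of the dataset size.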

II-G5 Dynamic Data Structures

To avoid training from scratch every time collision status labels are updated, some data structures must be retained and updated throughout the lifetime of the algorithm. The training dataset of configurations and Gram matrix are stored as two-dimensional arrays, while the weight vector , hypothesis vector , and true collision status labels are stored as one-dimensional arrays. The dimensionality of each of these data structures always depends on how many points are currently in .

After training is complete, non-support points are discarded from , and all other data structures are shrunk accordingly. All sparsification happens on Line 1 in Algorithm 1. After active learning, new points are added to , and all data structures must be resized accordingly. If repeated deallocation and reallocation of memory slows performance, a fixed amount of memory can be reserved for each data structure based on .

Typically the values in are determined incrementally. However, elements in corresponding to the new points from active learning cannot be set the same way. Instead, the columns in that have been partially filled through lazy Gram matrix evaluation should first be completely filled. Next, the uninitialized elements of should be calculated directly using the updated Gram matrix and nonzero weights in .
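The direct calculation for new points can be sketched as below. The sketch assumes the hypothesis value of a point is the weighted sum of kernel values against the points with nonzero weight. It evaluates the kernel directly instead of reading a stored Gram matrix, and it marks uninitialized entries with `None`. The names `fill_hypothesis` and `rq_kernel`, and the rational quadratic form used, are illustrative assumptions.

```python
def rq_kernel(a, b, gamma=2.0):
    # Assumed rational quadratic form: (1 + gamma/2 * ||a - b||^2)^-2
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return (1.0 + 0.5 * gamma * sq) ** -2

def fill_hypothesis(points, weights, hypothesis, kernel=rq_kernel):
    """Fill entries of `hypothesis` that are None; leave existing entries alone."""
    # Only points with nonzero weight contribute to the sum.
    support = [(x, a) for x, a in zip(points, weights) if a != 0.0]
    for i, value in enumerate(hypothesis):
        if value is None:
            hypothesis[i] = sum(a * kernel(points[i], x) for x, a in support)
    return hypothesis
```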

III Experimental Results

III-A Performance in Static Environment

III-A1 Descriptions of Alternative Methods

We compare the performance of decision boundary machine learning techniques for static environments. We compare Fastron with two state-of-the-art, kernel-based algorithms: incremental SVM [25] with active learning [4] and sparse SVM [26], which we refer to as ISVM and SSVM, respectively. As with the Fastron model, we use as the input space and as the labels for both ISVM and SSVM. We train and validate all methods with the GJK algorithm [23], a standard for collision detection for convex polyhedra.

Input: Initial training dataset of configurations , initial collision status labels
Parameters: Kernel parameter (for SVM), regularization parameter (for SVM), amount to change exploitation/exploration probability threshold, number of collision checks to perform per active learning update
Output: Weight vector (where first element is model bias), updated dataset
// Train initial SVM model
// Generate up to new samples
3 while  do
       // Active Learning
6       if  then
7             for  to  do
9      else
10             for  to  do
12                         s.t.
       // Update SVM with new information
       // Increase if exploration samples are poorly classified
18       if  AND  then
20      else
      // Exit if all new samples are correctly classified
22       if  then
23             break
Algorithm 3 Incremental SVM with Active Learning Method (Adapted from Original [4])

Pan et al. use ISVM to create an accurate C-space model [4, 27]. Starting by fitting an SVM model to a small set of labeled configurations, Pan et al. iteratively improve upon this initial model using an active learning approach that randomly selects either exploration sampling or exploitation sampling when seeking new points to add to the training set. Exploration generates uniformly random samples, while exploitation generates new points between support points of opposite labels. Note that the purpose of Pan et al.’s active learning strategy is to improve the accuracy of the model, while Fastron’s active learning strategy is meant to search for changes in C-space when the workspace changes.

While SSVM has not yet been applied to C-space approximation, we include SSVM for comparison because it attempts to minimize the -norm of the weight vector, i.e., the number of support points. -norm minimization is advantageous because classification time is directly dependent on the number of support points in the model. SSVM approximates the -norm by using a weighted -norm of model weights , , where is a diagonal matrix. The elements of are set to for and to otherwise, where is a small positive value. Including this approximated -norm into the objective function forces many elements of to approach 0. Points whose approach 0 are removed from the training set before repeating the optimization.

Input: Training dataset of configurations , collision status labels
Parameters: Kernel parameter (for SVM), regularization parameter (for SVM), maximum number of iterations
Output: Weight vector (where first element is model bias)
// Initialize weights and Gram matrix
4 for  to  do
       // Perform optimization
5       ,
6       s.t.
       // Obtain new weights
8       ,
       // Exit if no change in model size
9       if  then
10            break
Algorithm 4 Sparse SVM Method (Adapted from Original [26])

III-A2 Description of Experiment

For simplicity, in this set of tests we implemented all algorithms in MATLAB and assume that each algorithm’s relative standing will remain the same when transferred to a compiled language. For the training portion of ISVM, we use Diehl et al.’s MATLAB implementation of the algorithm [17]. Furthermore, note that Pan et al. [4] train on pairs of objects, which would require multiple models when working with robot arms. As we are interested in comparing the performance of single models, we instead train one incremental SVM model on collision status labels generated for the entire robot arm. For training the SSVM, we use MATLAB to solve the more efficient dual problem described by Huang et al. [26]. To further improve the training time of SSVM, we terminate training when the number of support points stops decreasing. In our experience, other than training time, no other metric seemed largely affected compared to the stopping rule provided by Huang et al. (i.e., terminate when the change between successive solutions falls below a small threshold). Our interpretations of the training and active learning of ISVM and training of SSVM are provided in Algorithms 3 and 4, respectively. Algorithm 3 relies on helper routines that perform the incremental SVM update according to Diehl et al. [17], generate a random number between 0 and 1, select a sample from a set, perform geometry- and kinematics-based collision checks, and determine the proportion of misclassifications of a given set.

Ground truth collision checking is performed using the GJK algorithm [23]. We work with cube obstacles and fit the tightest possible bounding box around each link of the robot. Robot kinematics and visualization are performed using Peter Corke’s MATLAB Robotics Toolbox [28].

We begin with a simple 2 DOF robot whose body is a single rod with controllable yaw and pitch. Up to 4 obstacles are randomly placed in the environment. Fig. 5(a) shows an example test environment. We repeat the experiments using a 4 DOF robot, which is created by concatenating two of the 2 DOF robots as shown in Fig. 5(b).

Fig. 6 provides TPR, TNR, and model sizes (metrics demonstrating classification accuracy and speed) for Fastron for various values in randomly generated environments to motivate our choices for the values. We select for the 2 DOF case and for the 4 DOF case as these choices seem to provide an adequate balance between TPR/TNR and model size. A smaller for the 4 DOF case makes sense as a wider kernel is needed to account for the larger C-space. These choices of values also worked well for ISVM and SSVM.

The rational quadratic kernel is used for Fastron and SSVM, and the Gaussian kernel is used for ISVM as Pan et al. use [4]. Both Fastron and SSVM are trained on a uniformly random set of samples in the input space. The incremental SVM uses the active learning approach to build its training set [4]. Each algorithm is ultimately trained on samples for the 2 DOF case, and samples for the 4 DOF case. The 4 DOF case is undersampled because larger dataset sizes become computationally intractable for the comparison methods using our MATLAB implementations.

Figure 7: Example static environments and C-space approximations using various methods for a simple 2 DOF robot. Panels: (a) Workspace, (b) FK + GJK, (c) Fastron, (d) ISVM [17, 4], (e) SSVM [26].

III-A3 Description of Metrics

As the sizes of the two classes (in-collision and collision-free) may be unbalanced depending on the locations of workspace obstacles, overall accuracy may be skewed when one of the classes dominates the C-space. Thus, we also include within-class accuracy, i.e., TPR and TNR. TPR measures the proportion of in-collision samples correctly classified, and TNR measures the proportion of collision-free samples correctly classified.
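These metrics can be computed as in the sketch below. The label convention (+1 for in-collision, -1 for collision-free) and the function name `tpr_tnr` are assumptions made for illustration.

```python
def tpr_tnr(true_labels, predicted_labels):
    """Return (TPR, TNR, accuracy); assumes +1 = in-collision, -1 = collision-free."""
    tp = fn = tn = fp = 0
    for y, p in zip(true_labels, predicted_labels):
        if y == 1:
            if p == 1:
                tp += 1
            else:
                fn += 1
        else:
            if p == -1:
                tn += 1
            else:
                fp += 1
    # A rate is undefined (NaN) when its class has no samples.
    tpr = tp / (tp + fn) if tp + fn else float("nan")
    tnr = tn / (tn + fp) if tn + fp else float("nan")
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total if total else float("nan")
    return tpr, tnr, accuracy
```

For example, a model that labels every sample collision-free in a workspace with few obstacles can reach high accuracy while its TPR is 0, which is why the within-class rates are reported alongside accuracy.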

As measures of classification performance, we measure the average time required to perform a proxy collision check and the size of the model in terms of the number of support points. As proxy checks using each of these models require a weighted sum of kernel evaluations, proxy check times are directly related to the number of support points and the computational complexity of the kernel. As labeling a given configuration with GJK first requires forward kinematics (FK) to locate the links of the arm, FK is included in the query timing for GJK.
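The dependence of query time on model size can be seen in a sketch of a proxy check, which performs one kernel evaluation per support point. The sign convention (positive score means in-collision), the rational quadratic form, and the names `rq_kernel` and `proxy_collision_check` are assumptions for illustration.

```python
def rq_kernel(a, b, gamma):
    # Assumed rational quadratic form: (1 + gamma/2 * ||a - b||^2)^-2
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return (1.0 + 0.5 * gamma * sq) ** -2

def proxy_collision_check(x, support_points, weights, gamma):
    """Return +1 (assumed in-collision) or -1 (assumed collision-free)."""
    # One kernel evaluation per support point: cost is linear in model size.
    score = sum(a * rq_kernel(s, x, gamma)
                for s, a in zip(support_points, weights))
    return 1 if score > 0 else -1
```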

Method Query Time Training Time
Fastron ns ms
ISVM [17, 4] s s
SSVM [26] ns s
FK + GJK s
(a) 2 DOF
Method Query Time Training Time
Fastron s ms
ISVM [17, 4] s s
SSVM [26] s s
FK + GJK s
(b) 4 DOF
Figure 8: Performance of Fastron in a static environment for the 2 DOF and 4 DOF robots shown in Fig. 5 compared against incremental SVM (ISVM) [17, 4], sparse SVM (SSVM) [26], and the ground truth collision detection method GJK [23]. Lower is better for model size , query time, and model training time. The best results are shown in bold.

Finally, even though we are working with static environments, we compute training time to determine how long it takes for each algorithm to generate a model. We include training time to assess which algorithms would be suitable for changing environments. This training time includes both the time required to learn the weights for the model and the time required to label each point in the training set using a collision checker.

III-A4 Results for 2 DOF Robot

Fig. 9(a) and the table in Fig. 8(a) provide comparisons of the three machine learning methods trained on an environment with up to 4 obstacles for the 2 DOF case. All methods have comparable accuracy, TPR, and TNR. All methods performed significantly faster than GJK, which took on average s per collision check. Of the machine learning based methods, SSVM had the sparsest solution, which results in it providing the fastest proxy collision checks. Furthermore, ISVM has on average more than 6 times the number of support points, probably because more support points are placed near the boundary after its active learning procedure. Due to its larger number of support points and slower kernel, ISVM has the slowest proxy collision check timings.

Fastron is the fastest to train among all tested methods, and SSVM takes nearly 100 times longer. Profiling the code suggests that the slowest part of SSVM and ISVM is repeatedly solving an optimization problem. Fastron trains faster than the SVM methods because it does not optimize an objective function completely and does not require the entire Gram matrix.

Visualization of the C-space is straightforward when working with a 2 DOF robot and allows qualitative comparison. Fig. 7(b) shows example ground truth C-spaces (with axes scaled to the input space) for the workspaces shown in Fig. 7(a), along with approximations provided by the three kernel-based methods in Figs. 7(c)-7(e). Examining the placement of support points, it is apparent that support points are typically placed closest to the boundary for ISVM except when some C-space obstacles are large. On the other hand, support point placement is much farther from the boundary for SSVM.

(a) 2 DOF
(b) 4 DOF
Figure 9: Accuracy, TPR, and TNR (higher is better) for Fastron, ISVM [25] with active learning [4], and SSVM [26] in a static environment using the 2 DOF and 4 DOF robots shown in Fig. 5. GJK [23] is used as the ground truth collision detection method.
Figure 10: Baxter robot with cube obstacles used for collision detection and motion planning experiments.
Method Query Time Update Time
Fastron (FCL) s ms
Fastron (GJK) s ms
FK + GJK [23] s
FK + FCL [29] s
(a) 4 DOF
Method Query Time Update Time
Fastron (FCL) s ms
Fastron (GJK) s ms
FK + GJK [23] s
FK + FCL [29] s
(b) 6 DOF
Method Query Time Update Time
Fastron (FCL) s ms
Fastron (GJK) s ms
FK + GJK [23] s
FK + FCL [29] s
(c) 7 DOF
Figure 11: Performance in a changing environment when actuating the first 4 DOF, the first 6 DOF, and all 7 DOF of the Baxter robot’s right arm, respectively, shown in Fig. 10. Lower is better for model size , query time, and model update time. The best results are shown in bold.
Method Query Time Update Time
Fastron (FCL) s ms
Fastron (GJK) s ms
FK + GJK [23] s
FK + FCL [29] s
Figure 12: Performance in a changing environment with three moving obstacles when actuating all 7 DOF of the Baxter robot’s right arm shown in Fig. 10. Lower is better for model size , query time, and model update time. The best results are shown in bold.
(a) 4 DOF
(b) 6 DOF
(c) 7 DOF
Figure 13: Accuracy, TPR, and TNR (higher is better) for Fastron and GJK [23] in a continuously changing environment using the Baxter’s right arm and one workspace obstacle as shown in Fig. 10. Fastron is trained on both FCL [29] and GJK. FCL is used as the ground truth collision detection method. The three cases correspond to actuating the first 4 DOF, the first 6 DOF, and all 7 DOF of the arm, respectively. Note that the GJK results do not change significantly because the entire arm is always used for geometry-based collision detection while the dimensionality of the space that Fastron models equals the number of DOF.

III-A5 Results for 4 DOF Robot

Fig. 9(b) and the table in Fig. 8(b) provide a comparison of the three machine learning methods trained on an environment with up to 4 obstacles. ISVM and SSVM have comparable accuracy, TPR, and TNR. On the other hand, Fastron has slightly lower accuracy than the other methods, but significantly higher TPR and lower TNR. Higher TPR is advantageous for collision detection where a more conservative prediction is preferred. We anticipate all methods would improve in terms of accuracy, TPR, and TNR if given more training data.

As with the 2 DOF case, all methods performed significantly faster than GJK, which took on average s per collision check. SSVM once again had the sparsest solution, requiring half as many support points as the other methods, which also gives it the fastest predictions. Fastron and ISVM require approximately the same number of support points, but Fastron classifies faster due to its cheaper kernel.

SSVM takes the longest to train, while Fastron trains at least 200 times faster. Once again, Fastron’s training speed is due to not having to optimize an objective function completely and not requiring the entire Gram matrix.

III-B Performance in Changing Environment

III-B1 Description of Experiment and Metrics

We test the Fastron algorithm in changing environments using C++. To our knowledge, Fastron is the only learning-based method that globally models C-space for changing environments, so we do not include comparisons to other machine learning methods. Furthermore, the training timings for ISVM and SSVM from the static case suggest that our implementations of these methods are not suited for changing environments. Instead, we compare Fastron to two collision detection methods: GJK and the Flexible Collision Library (FCL) [29], the default collision library for the MoveIt! motion planning framework. We use FCL to generate the ground truth labels for these tests, and use GJK [23] as an approximate geometry method. We train Fastron on labels from FCL and GJK and include the results for both variations.

The experiments involve moving boxes around the reachable workspace of a Baxter robot’s right arm. For GJK, we use cylinders as approximations for the geometry of each link in the arm. Robot kinematics and visualization are performed using MoveIt! [30] and rviz [31] in ROS [32]. We anticipate that having only one workspace obstacle (such as the workspace shown on the left in Fig. 10) would be the most challenging case for Fastron due to an imbalance of class sizes in C-space. On the other hand, one workspace obstacle would be the easiest case for geometry- and kinematics-based collision detectors whose performance is dependent on the number of objects in the workspace. Thus, most analysis in the following experiments involves one workspace obstacle, but results from multiple-obstacle workspaces (such as that shown on the right in Fig. 10) are also included.

For the experiments with one obstacle, we perform three versions of the tests: using the first 4 DOF of the arm (excluding all wrist motions), the first 6 DOF (excluding gripper rotation), and all 7 DOF. For the 4 DOF and 6 DOF experiments, we perform full collision checking on the arm, but leave the unactuated joints’ positions fixed at 0. We only consider the 7 DOF version when using workspaces with multiple obstacles.

We used a coarse grid search to select the parameters for Fastron. We use for the 4 DOF case and for the higher DOF cases. The conditional bias parameter is for the 4 DOF case and for the higher DOF cases. Our active learning strategy found and labeled 500 additional points for each case.

As with the static case, we compute accuracy, TPR, and TNR to measure the correctness against FCL of each approximation method. Additionally, we include model size for Fastron and the query timing of all methods. FK is included in the query timing for both GJK and FCL. Finally, we include the time required to update the Fastron model given the previously trained model.

III-B2 Results for 4, 6, and 7 DOF Manipulators

Fig. 13 and the tables in Fig. 11 provide comparisons of Fastron (trained on FCL and GJK) to the geometry-based collision detectors for the 4, 6, and 7 DOF cases, respectively, when using one workspace obstacle. In all cases, GJK had the highest accuracy and TNR. The TPR of Fastron trained on FCL almost meets that of GJK, but only beats GJK for the 4 DOF case.

Comparing Fastron trained on FCL to Fastron trained on GJK, we can see a noticeable improvement in update time. The update rate increases from around 24 to 63 Hz for the 4 DOF case, from 16 to 41 Hz for the 6 DOF case, and from 13 to 30 Hz for the 7 DOF case. The reason for the drastic decrease in update time is obvious: as a large portion of time to update the model is spent on performing collision checks, using a cheaper collision detector for labeling would decrease the update time. On the other hand, we also see that the TPR of Fastron (GJK) is lower compared to that of Fastron (FCL), probably because training on the approximate collision detection method causes some missed detections that would only be detected with a more precise detector. These missed detections may also explain why Fastron (GJK) requires fewer support points on average than Fastron (FCL).

As the degrees of freedom increase, we notice the accuracies of Fastron decrease while model sizes, query times, and update times increase. This is because higher dimensional spaces require more data to model correctly. GJK and FCL are unaffected by the change in DOF because we always need to use the entire arm for collision detection.

The table in Fig. 12 provides model sizes, query times, and update times for the full 7 DOF arm when there are three workspace obstacles. Comparing the table in Fig. 12 to that in Fig. 11(c), we can see that Fastron query times increase marginally when the number of workspace obstacles increases because the number of support points required to model the corresponding C-space is higher. We can see that GJK suffers when the number of workspace obstacles increases while FCL does not significantly change. Consequently, while update times increase for both versions of Fastron with more workspace obstacles, the update times increase more for the GJK-based Fastron than for the FCL-based Fastron.

Figure 14: Accuracy, TPR, and TNR (higher is better) for Fastron (trained on FCL [29]), Fastron (trained on GJK), and GJK [23] in a continuously changing environment using one of the Baxter’s 7 DOF arms and three workspace obstacles as shown in Fig. 10. FCL is used for ground truth.
Figure 15: Query times for Fastron, GJK [23], and FCL [29] against the number of workspace obstacles (bottom axis) and the estimated proportion that occupies of the C-space (top axis). GJK’s and FCL’s query times increase linearly with the number of workspace obstacles, while Fastron’s query times increase before decreasing around where occupies of the C-space.
Figure 16: Timings for motion planning for the Baxter robot’s right arm with one workspace obstacle using RRT, RRT-Connect, RRT*, and BIT* using Fastron, GJK, and FCL for collision detection. The three rows correspond to planning using only the first 4 DOF, only the first 6 DOF, and all 7 DOF of the arm, respectively. Forward kinematics (FK) are included in the timings for the geometry- and kinematics-based collision detectors.
Figure 17: Timings for motion planning for the Baxter robot’s 7 DOF right arm with three workspace obstacles using RRT, RRT-Connect, RRT*, and BIT* using Fastron, GJK, and FCL for collision detection. Forward kinematics (FK) are included in the timings for the geometry- and kinematics-based collision detectors.

Fig. 15 shows the query times for Fastron, GJK, and FCL with respect to the number of workspace obstacles for up to 100 obstacles. The proportion of samples (estimated using the average proportion of test samples with the label) is also included as an explanatory variable. As geometry-based methods, GJK’s and FCL’s query times both increase as the number of obstacles increases (though the increase in timing for FCL is much less significant), which makes sense as the number of comparisons required increases with more obstacles in the workspace. On the other hand, as a kernel-based method, Fastron’s query times increase before decreasing with respect to the number of obstacles. For larger numbers of obstacles, Fastron provides collision status results up to an order of magnitude faster than FCL and almost 20 times faster than GJK. The maximum Fastron query time occurs when occupies approximately of the C-space. This result makes intuitive sense because fewer support points are required when one class is significantly more prevalent than the other, but more support points are required when both classes are roughly equally present.

III-C Performance in Motion Planning Application

III-C1 Description of Experiment

As motion planning is one application that requires frequent collision checks, we apply Fastron to motion planning for one of the Baxter arms to see the effect on the computation time required to generate a feasible plan. We use the same environments and methods as used in Section III-B. As proxy collision detection timings were roughly the same for both Fastron (FCL) and Fastron (GJK), we only provide results for Fastron (FCL) in this section.

The experiments involve using the various collision detection methods in standard motion planners. Each trial involved incrementally moving the obstacles in the robot’s reachable workspace and generating a new motion plan (using each collision detection method) from scratch each time the obstacle is in a new position. The start and goal configurations are randomly generated such that the arm must move from one side of the workspace to the other, and are only regenerated after a motion plan has been generated using each collision detection method. The OMPL library [33] is used to handle the motion planning, and we fill in the state validity checking routine with one of the three collision checking methods. We select RRT [12], RRT-Connect [34], RRT* [35], and BIT* [36] to demonstrate the performance of each collision detection method. RRT and its bidirectional variant RRT-Connect are probabilistically complete motion planners [1] that terminate once a path is found. RRT* and BIT* are optimizing planners that usually continue to search C-space for a shorter path after an initial feasible path has been found. As we are interested more in generating feasible plans than optimal plans, we terminate RRT* and BIT* once a feasible plan is found, which means the resulting path may not be close to optimal. We include these results to show how quickly the initial path can be found upon which RRT* and BIT* may improve.
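The idea of swapping collision checkers behind a planner's state validity routine can be illustrated with a toy example. The sketch below is not OMPL and does not use its API. It is a minimal pure-Python RRT, written for illustration only, whose `is_valid` argument plays the role of the state validity checking routine. The function name `rrt`, its parameters, and the box-shaped sampling space are all assumptions.

```python
import math
import random

def rrt(start, goal, is_valid, step=0.1, goal_bias=0.2, max_iters=5000,
        lo=-1.0, hi=1.0, rng=None):
    """Toy RRT; `is_valid(state) -> bool` is the pluggable collision check."""
    if rng is None:
        rng = random.Random()
    nodes, parents = [list(start)], [None]
    for _ in range(max_iters):
        # Sample the goal with probability `goal_bias`, else a uniform state.
        if rng.random() < goal_bias:
            target = list(goal)
        else:
            target = [rng.uniform(lo, hi) for _ in start]
        # Extend the nearest tree node toward the target by at most `step`.
        i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], target))
        d = math.dist(nodes[i], target)
        if d == 0.0:
            continue
        scale = min(1.0, step / d)
        new = [a + scale * (b - a) for a, b in zip(nodes[i], target)]
        # Simplification: only the new state is checked, not the whole edge.
        if not is_valid(new):
            continue
        nodes.append(new)
        parents.append(i)
        if math.dist(new, goal) < 1e-9:
            # Terminate at the first feasible path, walking back to the root.
            path, k = [], len(nodes) - 1
            while k is not None:
                path.append(nodes[k])
                k = parents[k]
            return path[::-1]
    return None
```

Because the planner only ever calls `is_valid`, a geometric checker and a learned proxy can be exchanged without touching the planning code, and wrapping `is_valid` in a counter gives the number of collision checks each plan required.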

Since Fastron and GJK are approximations to true collision detection, we anticipate that the plans that are generated with these approximate methods may actually include some