I Introduction
Robotic manipulation tasks in the presence of noise inevitably suffer from uncertainties and collision risks. One representative example is a manipulator mounted on an underwater vehicle, which faces not only the disturbances from currents and inner waves, but also the base movements caused by the interaction between manipulators and the vehicles on which they are mounted. Due to limited battery life, such tasks usually need to balance the risk of collisions and the costs associated with the trajectory. Another typical scenario is motion planning for human support robots, which are often surrounded by elder people and children and have to be very careful about collision avoidance. In the aforementioned applications, it is insufficient to simply rely on feedback controllers because deviations remain despite their application and there are no guarantees for task success. Therefore, for those uncertaintysensitive planning tasks where safety and accuracy are crucial, it is important that the motion planner can reason over uncertainties during the planning phase, and react intelligently according to different planning scenarios and constraints over the probability of plan failure, i.e. chance constraints [1].
However, most existing riskaware planners are limited to applications with lowDOF robots or simplified environments with convex obstacles, meanwhile available approaches are lacking for realtime highdimensional planning under uncertainties. In this paper we propose Probabilistic Chekov (pChekov) to fill in this gap. PChekov is derived from the previously introduced deterministic Chekov planner [2] and inherits its realtime planning superiority for highDOF robots. PChekov innovatively applies quadraturebased sampling for collision probability estimation, which helps obtain better estimations with a limited number of samples. It also takes advantage of the concept of risk allocation [3], where an allowed probability of failure is divided among individual constraints, to help with conflict extraction and expedite the the search for feasible solutions. Based on the estimated noise models, pChekov can generate feasible trajectories for highDOF robots that satisfy a userspecified chance constraint over collision failures, which is the practical need of many realworld highdimensional planning tasks that operate in unstructured, rapidlychanging environments.
Ii Related Work
Many existing riskaware motion planners are based on Markov Decision Processes (MDPs)
[4, 5, 6] or Partially Observable MDPs (POMDPs) [7, 8]. Despite the wide application of MDPbased approaches, many of them require discretization of the state space. Even for extensions that can handle continuous planning domains, tractability is still a common issue since they typically need partitioning or approximation of the continuous state space [9]. Another class of probabilistic planners formulates the motion planning problem into an optimization problem, for example using Disjunctive Linear Program (DLP)
[10, 11]. The probabilistic Sulu planner (pSulu) in [9] performs goaldirected planning in a continuous domain. However, since pSulu encodes feasible regions with linear constraint approximations, it inevitably suffers from the exponential growth of computation complexity when applied in complicated 3dimensional environments or tasks with multiple agents.Riskaware extensions of samplingbased planners are also popular in the motion planning field [12, 13, 14]. However, their applications are often limited to carlike robots in simplified environments due to their disadvantages in planning speed [2] and collision probability estimation ability for highDOF robots in realworld complex environments. When the robot has high dimensionality, the collision checking happens in the 3D workspace whereas the planning happens in the highdimensional configuration space. Mapping the free workspace into the configuration space is nontrivial, which hence becomes another barrier for highdimensional riskaware motion planning. The methods introduced in [15] and [16]
take advantage of elliptical level sets of Gaussian distributions and the transformation of the environment to estimate waypoint collision probabilities under Gaussian noises. Nevertheless, these methods can not be trivially extended to highDOF applications due to the difficulty of mapping between the workspace and the configuration space.
In order to address these difficulties in highDOF robot motion planning tasks, the pChekov introduced in this paper combines the ideas from the Chekov “roadmap + TrajOpt” planner [2], LinearQuadratic Gaussian motion planning (LQGMP) [15], quadrature theories [17], and risk allocation [3, 1]. PChekov improves upon the isolated Chekov by extracting conflicts from the planning failures in order to guide TrajOpt [18] to find better solutions. It applies the LQGMP approach to estimate the state probability distributions, but differs from LQGMP in that it aims at generating feasible trajectories in realtime that satisfy a specified risk bound and meet a local optimality criterion, instead of choosing the minimum risk trajectory among candidate trajectories. PChekov relies on a quadraturebased sampling method to estimate the collision probability for individual waypoints, which mitigates the inaccuracy of random sampling and avoids the difficulty of mapping between configuration space and workspace.
Iii Problem Statement
PChekov solves the robot motion planning problems under uncertainty with a guaranteed success probability, considering temporal, spatial and dynamical constraints. Under process and observation noises, the collision rate during plan executions should not exceed a user specified chance constraint. Its realtime planning feature is key to providing robots the capability to operate effectively in unstructured, uncertain, fastchanging environments.
Let denote the robot state space and denote the control input space. Consider a discretetime system with a fixed time interval corrupted by Gaussian process noises and observation noises :
(1)  
where and are the robot state and control input at time step . The initial state .
A nominal trajectory is defined as a sequence of robot states and control inputs that satisfy the deterministic dynamics model for , where the number of time steps is a finite integer. An objective will be specified for each planning task. The goal state should fall in a convex goal region . A valid trajectory should satisfy the temporal constraint , where denotes the allowed execution time, and the collision chance constraint:
(2) 
where constrains the trajectory from colliding into each obstacle and is the user specified allowed probability of failure. Then the problem solved by pChekov can be expressed as:
Problem 1.
(3)  
subject to  
In order to guarantee the performance of this approach, we make two key assumptions. First, we assume that process and observation noises are both Gaussian distributed, and the noises on different state components are independent from each other. In real world scenarios, uncertainty due to identical and independent noises is accumulated over time, thus based on the Central Limit Theorem
[19], Gaussian models should be applied. Second, we assume that both the system dynamics and observation models are either linear or can be well approximated locally by their linearizations. In realworld planning tasks, robot motions will be controlled to closely follow the planned trajectory during execution, thus the local linearizations of the nonlinear dynamics are good approximations for robot motions.Iv Approach: Probabilistic Chekov
Figure 1 provides pChekov’s system diagram, which includes a planning phase and an execution phase. The goal in the planning phase is to find an initial feasible trajectory that satisfies the given joint chance constraint . This initial trajectory is not guaranteed to be optimal, thus pChekov will improve it in an anytime manner during the execution phase in order to achieve better utility. However, this paper focuses on the planning phase algorithm, and detailed explanations of the execution phase is beyond the scope of this paper.
PChekov first uniformly distributes the chance constraint into the allowed collision risks for each waypoint along the trajectory. Using the “roadmap + TrajOpt” planner described in
[2], it then generates a nominal trajectory that is feasible and collisionfree under deterministic dynamics. Given the model of controller and sensor noises, it estimates the a priori probability distribution of robot states along this nominal trajectory through the LQGMP approach [15]. The two assumptions we made in Section III ensure that the estimated states from LQGMP are Gaussian distributed around the nominal trajectory. The probability of collision on each waypoint can then be estimated using the quadraturebased approach in Section IVA. After that, we can compare the allocated risk bound and the estimated risk of collision for each waypoint, shown as the “risk test” in Figure 1. If all the risk bounds are satisfied, pChekov will execute this nominal trajectory. Otherwise, new constraints will be added to TrajOpt at the waypoints where the estimated risk of collision exceeds the allocated risk bound, and plan generation will be performed again. The added constraints include an increase on the penalty hitin distance (the minimum distance between objects where collision penalty starts to be nonzero [20]) and penalties on the configurations at those waypoints, so that the plan generator is guided to avoid those conflicts in future iterations. In addition, the risk allocation will also be adjusted based on the approach that will be illustrated in Section IVB, so that the number of iterations it takes to find feasible trajectories could be reduced. This cycle will keep going until the solution trajectory satisfies the chance constraints or the iteration number hits its upper bound.Iva Collision Probability Estimation
Given the state probability distribution around a nominal configuration, the collision probability can be approximated by sampling from this distribution and checking the percentage of configurations that are in collision. However, as with all Monte Carlo methods, this collision probability estimation approach would suffer from inaccuracy when the sample size is small and high computational cost when the sample size is large. Therefore, a method of intelligently finding the samples that can closely approximate the collision probability with only a small number of them is very important.
This collision probability estimation is essentially estimating the expectation of a collision function:
(4) 
along the distribution , which is estimated using the LQGMP approach [15]
. Since expectations can be written as integrals, nonrandom numerical integration methods (also called quadratures) can be applied to this problem. Denote the probability density function of
as , then the expectation of can be expressed as:(5) 
Assume is dimensional and let denote its th component whose distributions are independent from each other, based on the independent noise assumption in Section III. Since is Gaussian distributed, we can write
. Then, based on the conditional distribution rule of multivariate normal distribution
[21], we have:(6)  
(7) 
we can write the expectation of as:
(8) 
Let , then:
(9)  
GaussHermite quadrature approximates the value of integrals by calculating the weighted sum of the integrand function at a finite number of reference points, i.e.
(10) 
where is the number of sampled points, are the roots of the Hermite polynomial and the associated weights are given by [22]:
(11) 
in its form in Equation 9 still doesn’t correspond to the Hermite polynomial, therefore we conduct the following variable change:
(12) 
(13) 
Hence the value of can then be approximated through GaussHermite quadrature rule:
(14) 
where are the Hermite polynomial roots for integrating the component, are the associated weights, and is the number of sampled points.
If we iteratively conduct this procedure from to , we will get an estimation of the collision probability through:
(15)  
IvB Risk Reallocation
Ono and Williams [3] introduced the concept of risk allocation, which decomposes a joint chance constraint by allocating risk bounds to individual constraints, where . The problem is then separated into a risk allocation optimization stage and a control sequence optimization stage. Inspired by the concept of risk allocation and bistage planning, we developed a risk reallocation approach that can reduce the iterations to get feasible solutions and produce less conservative trajectories, as shown in Algorithm 1.
This reallocation relies on the classification of different constraints. Denote the estimated collision risk at waypoint as . When exceeds , we define the chance constraint at the th waypoint as a violated constraint, otherwise it is viewed as satisfied. Satisfied constraints are divided into active constraints and inactive constraints by introducing a risk tolerance parameter . If the difference between and is larger than the risk tolerance, we view this chance constraint as inactive. Otherwise, the constraint is viewed as active. The key idea of this risk reallocation method is to take risk from inactive constraints and give it to those violated constraints. This is different from the Iterative Risk Allocation (IRA) algorithm introduced in [3] which iteratively reallocates risk from inactive constraints to active constraints. IRA requires a trajectory where all the individual chance constraints are satisfied, but it doesn’t help with finding an initial satisfying trajectory. Thus it is only applicable to pChekov’s execution phase but not the planning phase.
IvC Probabilistic Chekov
Algorithm 2 briefly summarizes the pChekov planning phase algorithm. The main difference between pChekov and other existing riskaware motion planners relies on the usage of the “roadmap + TrajOpt” planner (line 1 and 6 in Algorithm 2). This deterministic planner has low planning time requirements for highDOF robots, and can straightforwardly incorporate differential constraints from robot dynamics [2]. In each iteration, pChekov determines TrajOpt’s penalty hitin distances for different waypoints () and the configurations to be penalized () based on the previous conflicts. With this “roadmap + TrajOpt” planner as its core, pChekov uses a LQGbased state estimation approach (line 7) and a quadraturebased collision probability estimation approach (line 8) in order to predict the influence of process noises and observation noises during trajectory executions. This prediction as well as the idea of risk allocation plays the role of extracting conflicts and guiding the deterministic planner to approach to a feasible solution whose execution failure rate is bounded by the chance constraint. This is the main innovation of this pChekov planning and execution system. In addition, the application of risk reallocation (line 14) is key to the speed of pChekov’s convergence to a feasible solution trajectory.
V Experiments and Results
In order to test the performance of pChekov, we conducted simulation experiments on Baxter [23] in two tabletop environments that were previously developed in [2]: a “tabletop with a pole” and a “tabletop with a container” environment. The test cases in the second environment are much more difficult due to the narrow spaces inside the container. In addition, unlike the ones in the first environment, these cases include difficult poses where the joints are close to their limits. 500 pairs of start and goal poses in each environment were selected to compare the performance of pChekov and deterministic Chekov. The test cases where the estimated collision risk of either the start or the goal has already exceeded 150% of the chance constraint were filtered out, since those cases are very likely to be infeasible. We set the chance constraint as 10% and run 100 executions to test the solutions for each of the 500 test cases.
Va Experiment Models
We assume all the joints are fully actuated, and linearize the manipulator dynamics around its nominal trajectory. The control inputs are the accelerations at each time step. We assume the joint dynamics are independent from each other, corrupted by process noise , where denotes the DOF index. We define:
(16)  
as the deviations of states and inputs from the nominal trajectory. Use to denote the state variable of the th joint, then the dynamics for each joint can be linearized as:
(17) 
In manipulation tasks, the relative spatial relationship between the endeffector and the object to be grabbed is important for task success. The transformation matrix between workspace objects and the endeffector can be expressed as:
(18) 
where is the transformation from the workspace object to the camera frame, and is the transformation from the camera frame to the endeffector. Therefore, the noises for observing can be transformed into observation noises for through the transformation matrix . Then can be transformed into through matrix inversion. Therefore, we can approximate the observation noises by corrupted observations of the endeffector pose through the camera.
The observations of the endeffector can be expressed in joint space through the nonlinear relationship:
(19) 
where is the forward kinematics, is the observation noise, and is the covariance matrix of the observation noise. The linearization of this observation model around the nominal trajectory point can be expressed as:
(20) 
where
(21) 
Since is the forward kinematics, is the endeffector Jacobian matrix at the nominal state . In this way, the system observation matrix becomes the Jacobian matrix, which is usually easy to obtain during computation.
If we define deviations from the nominal observations as:
(22) 
then the linearized system observation model shown in Equation 20 can be rewritten as:
(23) 
which is the endeffector observation model we need.
VB Experiment Results
The analysis of experiment results focuses on pChekov’s comparison with deterministic Chekov and its chance constraint satisfaction performance. Within 100 independent executions for a particular solution trajectory which has a 10% probability of collision, the number of failure cases follows a binomial distribution with the number of independent experiments
and the probability of occurrence in each experiment . The cumulative probability distribution function of binomial distributions can be expressed as:(24) 
Then we can calculate that having less than or equal to 10 failures out of 100 executions has a probability of 56%, whereas the probability of having within 15 failures is 94%. Therefore, we decide to define chance constraint satisfied test cases as the ones where the collision rate is lower than or equal to 150% of the chance constraint. In this way, we have much more confidence to say that a solution violates the chance constraint when there are more than the corresponding number of executions end up in collision.
Since theoretically pChekov only has probabilistic guarantees for waypoints in a trajectory, we distinguish between continuoustime and discretetime chance constraint satisfaction performances. If the 100 noisy executions of a test cases shows that the average continuoustime (or waypoint) collision rate is within 150% of the collision chance constraint, then we say this case satisfies the continuoustime (or discretetime) chance constraint. Only the continuoustime satisfaction is the true criterion for success, but we use discretetime performance to show the impact of edge collisions, i.e. the collisions in between waypoints.
Figure 2 shows a statistical breakdown for the results in the “tabletop with a container” environment with a 10% chance constraint and a 50 iteration limit. Here the continuoustime chance constraint satisfaction rate is altogether 82.20%, including 17.00% where the initial deterministic solution has already satisfied the chance constraint. 4.60% of the cases fail for edge collisions, which is an inevitable outcome of the discretization of trajectories. Hence the balance between edge collision and computation complexity is crucial when deciding the number of waypoints.
Table I provides detailed results of pChekov’s performance in both environments. The first six rows compare the performance of deterministic Chekov (roadmap + TrajOpt) and pChekov’s planning phase algorithm, while the remaining fourteen focus on pChekov’s chance constraint satisfaction performance. On average, pChekov takes longer to plan, and returns paths with longer execution trajectories. This is expected because it adjusts the deterministic solution iteratively in order to satisfy the chance constraint and can often push the solution trajectory further away from the locally optimal solution. The eighth and eleventh rows show that the failure cases where pChekov struggles to find feasible solutions cost more time, whereas the success cases have a much shorter average planning time.
Environment  Tabletop with a Pole  Tabletop with a Container  
Planning Time (s)  deterministic Chekov  1.10  1.27  
pChekov  19.34  31.17  
Overall Collision Rate^{1}  deterministic Chekov  27.51%  41.04%  
pChekov  11.39%  16.46%  
Average Path Length (rad)^{2}  deterministic Chekov  0.51  0.60  
pChekov  0.68  0.84  
PChekov Performance  continuous chance constraint satisfaction rate^{3}  87.60%  82.20%  
continuous satisfied cases^{4}  average planning time (s)  15.40  22.95  
average collision rate  0.08%  0.11%  
average risk reduction^{6}  0.25  0.33  
continuous violated cases  average planning time (s)  46.22  67.12  
average collision rate  88.50%  88.02%  
average risk reduction  0.44  0.13  
discrete chance constraint satisfaction rate^{5}  94.40%  86.80%  
discrete satisfied cases  average planning time (s)  18.76  25.15  
average collision rate  0.13%  0.10%  
average risk reduction  0.19  0.28  
discrete violated cases  average planning time (s)  28.08  68.78  
average collision rate  73.39%  86.59%  
average risk reduction  0.39  0.23 

Average collision rate over 100 noisy executions for all 500 test cases.

Average length of execution trajectories.

Percentage of test cases where the average continuoustime collision rate over 100 noisy executions satisfies the chance constraint.

PChekov performance over the test cases where chance constraint is satisfied by continuoustime collision rate (viewed as success cases).

Percentage of test cases where the average waypoint collision rate over 100 noisy executions satisfies the chance constraint.

The difference between the average collision rate of pChekov solutions and that of deterministic Chekov solutions.
The comparison of the overall collision rate, the average over all the 100 executions for 500 cases, shows the superiority of pChekov solutions. The overall collision rate in the “tabletop with a pole” environment is within 150% of the chance constraint, and in the “tabletop with a container” environment it exceeds by only a small amount. Row nine and row twelve show that this excess is mainly from the failure cases. The collision rate of success cases are very low, which means pChekov’s planning phase produces conservative solutions, and that’s why an anytime planning improving approach is in demand in the execution phase.
As shown in the seventh row, the continuoustime chance constraint satisfaction rates are above 80% in both environments. For discretetime chance constraint the percentages are both above 85%. In addition, the tenth row tells us that in the satisfied cases pChekov successfully reduces the average collision rate by over 0.25. The increased collision risk in failure cases is probably caused by pChekov’s struggle to find safe solutions that satisfy the chance constraint in these difficult cases, where the trajectories are pushed into some obstacles while pChekov is trying to get around others.
Vi Discussion
This paper introduced pChekov, a chance constrained motion planning and execution system that can be applied to highDOF robotic systems under motion uncertainty and imperfect state information. Through the “roadmap + TrajOpt” framework as well as the quadraturebased risk estimation and the risk reallocation approaches, it overcame existing riskaware planners’ limitation in realtime motion planning tasks with highDOF robots in 3dimensional nonconvex environments. Simulation tests showed that pChekov had high chance constraint satisfaction rate and showed a much lower collision rate compared with deterministic solutions.
However, solutions from pChekov’s planning phase can be overly conservative due to the suboptimal risk allocations as well as the limited number of quadrature nodes. Although not included in this paper, preliminary experimental results show that applying an IRAbased anytime plan improving algorithm to the execution phase can effectively improve the solution utility. The completion of this IRAbased execution phase algorithm is an important track of our future work. Other future work directions include improving the collision probability estimation algorithm, developing more intelligent constraints that guide pChekov to avoid conflicts, and incorporating risk information into the roadmap nodes as a heuristic to search for low risk seed trajectories.
References
 [1] M. Ono and B. C. Williams, “Iterative risk allocation: A new approach to robust model predictive control with a joint chance constraint,” in Decision and Control, 2008. CDC 2008. 47th IEEE Conference on. IEEE, 2008, pp. 3427–3432.
 [2] S. Dai, M. Orton, S. Schaffert, A. Hofmann, and B. C. Williams, “Improving trajectory optimization using a roadmap framework.” in Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018, (accepted).
 [3] M. Ono and B. Williams, “An efficient motion planning algorithm for stochastic dynamic systems with constraints on probability of failure.” 2008.
 [4] S. Thrun, W. Burgard, and D. Fox, Probabilistic robotics. MIT press, 2005.
 [5] J. Burlet, O. Aycard, and T. Fraichard, “Robust motion planning using markov decision processes and quadtree decomposition,” in Robotics and Automation, 2004. Proceedings. ICRA’04. 2004 IEEE International Conference on, vol. 3. IEEE, 2004, pp. 2820–2825.
 [6] R. Alterovitz, T. Siméon, and K. Y. Goldberg, “The stochastic motion roadmap: A sampling framework for planning with markov motion uncertainty.” in Robotics: Science and systems, vol. 3, 2007, pp. 233–241.
 [7] H. Kurniawati, D. Hsu, and W. S. Lee, “Sarsop: Efficient pointbased pomdp planning by approximating optimally reachable belief spaces.” in Robotics: Science and systems, vol. 2008. Zurich, Switzerland., 2008.
 [8] J. Van Den Berg, S. Patil, and R. Alterovitz, “Motion planning under uncertainty using iterative local optimization in belief space,” The International Journal of Robotics Research, vol. 31, no. 11, pp. 1263–1278, 2012.
 [9] M. Ono, B. C. Williams, and L. Blackmore, “Probabilistic planning for continuous dynamic systems under bounded risk,” Journal of Artificial Intelligence Research, vol. 46, pp. 511–577, 2013.
 [10] L. Blackmore, H. Li, and B. Williams, “A probabilistic approach to optimal robust path planning with obstacles,” in American Control Conference, 2006. IEEE, 2006, pp. 7–pp.
 [11] L. Blackmore, M. Ono, A. Bektassov, and B. C. Williams, “A probabilistic particlecontrol approximation of chanceconstrained stochastic predictive control,” IEEE transactions on Robotics, vol. 26, no. 3, pp. 502–517, 2010.
 [12] B. Luders, M. Kothari, and J. How, “Chance constrained rrt for probabilistic robustness to environmental uncertainty,” in AIAA guidance, navigation, and control conference, p. 8160.
 [13] W. Liu and M. H. Ang, “Incremental samplingbased algorithm for riskaware planning under motion uncertainty,” in Robotics and Automation (ICRA), 2014 IEEE International Conference on. IEEE, 2014, pp. 2051–2058.
 [14] A. Bry and N. Roy, “Rapidlyexploring random belief trees for motion planning under uncertainty,” in Robotics and Automation (ICRA), 2011 IEEE International Conference on. IEEE, 2011, pp. 723–730.
 [15] J. Van Den Berg, P. Abbeel, and K. Goldberg, “Lqgmp: Optimized path planning for robots with motion uncertainty and imperfect state information,” The International Journal of Robotics Research, vol. 30, no. 7, pp. 895–913, 2011.
 [16] S. Patil, J. Van Den Berg, and R. Alterovitz, “Estimating probability of collision for safe motion planning under gaussian motion and sensing uncertainty,” in Robotics and Automation (ICRA), 2012 IEEE International Conference on. IEEE, 2012, pp. 3238–3244.
 [17] F. B. Hildebrand, Introduction to numerical analysis. Courier Corporation, 1987.
 [18] J. Schulman, Y. Duan, J. Ho, A. Lee, I. Awwal, H. Bradlow, J. Pan, S. Patil, K. Goldberg, and P. Abbeel, “Motion planning with sequential convex optimization and convex collision checking,” The International Journal of Robotics Research, vol. 33, no. 9, pp. 1251–1270, 2014.

[19]
W. Hoeffding, H. Robbins, et al.
, “The central limit theorem for dependent random variables,”
Duke Mathematical Journal, vol. 15, no. 3, pp. 773–780, 1948.  [20] J. Schulman, J. Ho, A. X. Lee, I. Awwal, H. Bradlow, and P. Abbeel, “Finding locally optimal, collisionfree trajectories with sequential convex optimization.” in Robotics: science and systems, vol. 9, no. 1. Citeseer, 2013, pp. 1–10.

[21]
M. L. Eaton, “Multivariate statistics: a vector space approach.”
JOHN WILEY & SONS, INC., 605 THIRD AVE., NEW YORK, NY 10158, USA, 1983, 512, 1983.  [22] M. Abramowitz and I. A. Stegun, Handbook of mathematical functions: with formulas, graphs, and mathematical tables. Courier Corporation, 1964, vol. 55.
 [23] RethinkRobotics, “Baxter,” http://www.rethinkrobotics.com/baxter/.