A change in task specification is often unavoidable in real-world manipulation problems. For example, consider a scenario where a manipulator is handing over an object to a human. The robot's estimate of the goal position can change as it executes a previously computed trajectory. Consequently, it needs to quickly adapt its joint motions to reach the new goal position. In this paper, we model motion planning as a parametric optimization problem wherein the task specifications are encoded in the parameters. In this context, adaptation to a new task requires re-computing the optimal joint trajectories for the new set of parameters. This is a computationally challenging process because the underlying cost functions in typical manipulation tasks are highly non-linear and non-convex [dimitry_manifold_planning]. Existing works leverage the so-called warm-starting technique, where previously computed trajectories are used as initialization for the optimization solvers [memory_of_motion_mansard]. However, our extensive experimentation with off-the-shelf optimization solvers such as Scipy-SLSQP [scipy] shows that warm-starting alone is not sufficient for real-time adaptation of joint trajectories to task perturbations.
I-A Main Idea
The proposed work explores an alternate approach based on differentiating the optimal solution with respect to the problem parameters, hereafter referred to as Argmin differentiation [argmin_paper]. To understand this further, consider the following constrained optimization problem over a decision variable (e.g., joint angles) and a parameter vector p (e.g., end-effector position).
The optimal solution satisfies the following Karush-Kuhn-Tucker (KKT) conditions.
The gradients in (4a) are taken with respect to the decision variable, and the remaining variables are the Lagrange multipliers. Now, consider a scenario where the optimal solution for the parameter p needs to be adapted to a perturbed parameter set. As mentioned earlier, one possible approach is to re-solve the optimization with the prior solution as warm-start initialization. Alternately, for a perturbation of small magnitude, an analytical perturbation model can be constructed. To be precise, we can compute the first-order differential of the r.h.s. of (4a)-(4d) to obtain analytical gradients in the following form [hauser_opt_map], [hauser_bilevel], [andreas_sensolve_1].
Multiplying these gradients with the parameter perturbation gives us an analytical expression for the new solution and Lagrange multipliers corresponding to the perturbed parameter set [andreas_sensolve_1].
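As an illustration, in the unconstrained special case this perturbation model reduces to the implicit-function-theorem formula dx*/dp = -(d²f/dx²)⁻¹ (d²f/dx dp). The following minimal NumPy sketch checks this on a toy quadratic cost; the function name and the example cost are our own illustrative choices, not part of the paper's formulation.

```python
import numpy as np

def argmin_jacobian(hess_xx, hess_xp):
    """Sensitivity dx*/dp of an unconstrained optimum.

    Differentiating the stationarity condition grad_x f(x*, p) = 0 gives
    dx*/dp = -(d2f/dx2)^{-1} (d2f/dx dp), evaluated at (x*, p)."""
    return -np.linalg.solve(hess_xx, hess_xp)

# Toy quadratic f(x, p) = 0.5 * ||x - A p||^2, whose optimum is x*(p) = A p,
# so the exact sensitivity is dx*/dp = A.
A = np.array([[2.0, 0.0], [0.0, 3.0]])
hess_xx = np.eye(2)   # d2f/dx2 = identity
hess_xp = -A          # d2f/dx dp = -A
J = argmin_jacobian(hess_xx, hess_xp)  # recovers A
```

For general non-convex costs, the two Hessian blocks would be evaluated numerically at the current optimum (the paper uses automatic differentiation for this).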
Algorithmic Contribution: A critical bottleneck in using a gradient map of the form (5) to compute perturbed solutions is that the mapping between the parameters and the optimal solution is highly discontinuous. In other words, even a small perturbation can lead to large changes in the so-called active set of the inequality constraints. Thus it becomes necessary to develop additional active-set prediction mechanisms [andreas_sensolve_1]. In this paper, we bypass this complication by instead focusing on parametric optimization with only bound constraints on the variable set. Argmin differentiation of such a problem has a simpler structure, which we leverage to develop a line-search based algorithm that incrementally adapts joint trajectories to larger changes in the parameters/tasks. (To give an example of a "large perturbation": our algorithm can adapt the joint trajectories of the Franka Panda arm to perturbations of up to 30 cm in the goal position, a substantial fraction of the workspace of the Franka arm.)
Application Contribution: For the first time, we apply the Argmin differentiation concept to the problem of joint trajectory optimization for manipulators under end-effector task constraints. We consider a diverse class of cost functions to handle (i) perturbations in joint configurations or (ii) perturbations in end-effector way-points of orientation-constrained end-effector trajectories. We present an extensive benchmarking of our algorithm's performance as a function of the perturbation magnitude. We also show that our algorithm outperforms the warm-start trajectory optimization approach in computation time by several orders of magnitude while achieving similar solution quality, as measured by task residuals and the smoothness of the resulting trajectory.
I-C Related Works
The concept of Argmin differentiation has been around for a few decades, although often under the name of sensitivity analysis [sensitivity_nlp], [sensitivity_ipopt]. However, of late it has seen a resurgence, especially in the context of end-to-end learning of control policies [brandon_diffmpc], [boyd_l4dc]. Our proposed work is most closely related to those works that use Argmin differentiation for motion planning or feedback control. In this context, a natural application of Argmin differentiation is in bi-level trajectory optimization, where the gradients of the optimal solution from the lower level are propagated to optimize the cost function at the higher level. This technique has been applied to both manipulation and navigation problems in existing works [hauser_bilevel], [bi_level_man]. Alternately, Argmin differentiation can also be used for the correction of previously computed trajectories [andreas_sensolve_1], [l_1_mpc_sensitivity].
To the best of our knowledge, no existing work uses Argmin differentiation for the adaptation of task-constrained manipulator joint trajectories. The closest to our approach is [hauser_opt_map], which uses it to accelerate the inverse kinematics problem. Along similar lines, [andreas_sensolve_1] considers a very specific example of perturbation in the end-effector goal position. In contrast to these two cited works, we consider a much more diverse class of task constraints. Furthermore, our formulation also has important distinctions from [andreas_sensolve_1] at the algorithmic level. The authors in [andreas_sensolve_1] use the log-barrier function to include inequality constraints as penalties in the cost function. In contrast, we note that in the context of the task-constrained trajectory optimization considered in this paper, the joint-angle limits are the most critical constraints; the velocity and acceleration constraints can always be satisfied through time-scaling based pre-processing [pham_timescaling]. Thus, by choosing a way-point parametrization for the joint trajectories, we formulate the underlying optimization with just box constraints on the joint angles. This, in turn, allows us to treat this constraint through a simple projection (Line 4 in Algorithm 1) without disturbing the structure of the cost function and the resulting Jacobian and Hessian matrices obtained by Argmin differentiation.
II Proposed Approach
II-A Symbols and Notations
We will use lower-case normal-font letters to represent scalars, while bold-font variants will represent vectors. Matrices are represented by upper-case bold fonts. A subscript will be used to denote the time stamp of a variable or vector, and a superscript will represent the transpose of a matrix.
II-B Argmin Differentiation for Unconstrained Parametric Optimization
We consider the optimal joint trajectories to be the solution of the following bound-constrained optimization with parameter p.
We are interested in computing the Jacobian of the optimal solution with respect to p. If we ignore the bound constraints for now, we can follow the approach presented in [argmin_paper] to obtain it in the following form.
Using (7), we can derive a local model for the optimal solution corresponding to a perturbation of the parameters as
Intuitively, (8) signifies a step along the gradient direction whose length is set by the perturbation magnitude. However, for (8) to be valid, the step length needs to be small. In other words, the perturbed parameter needs to be in the vicinity of p. Although it is difficult to mathematically characterize the notion of "small", in the following we attempt a practical definition based on the notion of optimal cost.
A valid perturbation is one that satisfies the following relationship.
The underlying intuition in (9) is that the perturbed solution should lead to a lower cost for the perturbed parameter than the prior solution does for that same parameter.
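Continuing the toy quadratic example, the sketch below applies the first-order update of (8) and then evaluates the cost-based validity condition (9); the numerical values are arbitrary illustrations, not data from the paper.

```python
import numpy as np

# Toy quadratic f(x, p) = 0.5 * ||x - A p||^2 with known sensitivity dx*/dp = A.
A = np.array([[2.0, 0.0], [0.0, 3.0]])
f = lambda x, p: 0.5 * np.sum((x - A @ p) ** 2)

p = np.array([1.0, -1.0])
x_star = A @ p                        # prior optimal solution x*(p)
J = A                                 # argmin Jacobian dx*/dp at (x*, p)

delta_p = np.array([0.05, -0.02])     # small task perturbation
x_new = x_star + J @ delta_p          # local model (8)

# Validity condition (9): the updated solution should cost less at the
# perturbed parameter than the stale solution does.
valid = f(x_new, p + delta_p) <= f(x_star, p + delta_p)
```

For this quadratic, the first-order update is exact, so the condition holds with a large margin; for the non-convex costs of Sec. III it only holds for sufficiently small perturbations, which motivates the line search of the next subsection.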
II-C Line Search and Incremental Adaptation
Algorithm 1 couples the local model (8) with a basic line search to incrementally adapt the solution to a large perturbation. The algorithm begins by initializing the optimal solution and the parameter with their prior values. These variables are then used to initialize the Hessian and Jacobian matrices. The core computation takes place in line 2, wherein we compute the least scaling of the step length needed to guarantee a reduction in the cost. At line 3, we update the optimal solution based on the step length obtained in line 2, followed by a simple projection at line 4 to satisfy the minimum and maximum bounds. At line 5, we perform the so-called forward roll-out of the solution to update the parameter set. For example, if the parameter p models the position of the end-effector at the final time instant of a trajectory, then line 5 computes how close the updated solution takes the end-effector to the perturbed goal position. In the remaining lines, we update the Hessian and the Jacobian matrices based on the updated parameter set and optimal solution.
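The loop described above can be sketched as follows. This is a hypothetical reconstruction, not the paper's exact algorithm: the function names, the discretized step-length grid, and the pick-the-cheapest-candidate rule (standing in for the least-scaling rule of line 2) are our assumptions.

```python
import numpy as np

def adapt(f, argmin_jac, rollout, x, p, p_target, x_min, x_max, iters=20):
    """Incrementally adapt a prior solution x toward the target parameters.

    f(x, p)       -> scalar cost
    argmin_jac    -> Jacobian dx*/dp evaluated at the current (x, p)
    rollout(x)    -> parameters realized by solution x (forward roll-out)"""
    alphas = np.linspace(0.05, 1.0, 20)            # discretized step lengths
    for _ in range(iters):
        J = argmin_jac(x, p)
        delta_p = p_target - p
        # Line search: evaluate every scaled, projected step, keep the cheapest.
        candidates = [np.clip(x + a * (J @ delta_p), x_min, x_max)
                      for a in alphas]
        costs = [f(c, p_target) for c in candidates]
        x = candidates[int(np.argmin(costs))]      # update + box projection
        p = rollout(x)                             # forward roll-out
    return x, p
```

For a simple quadratic cost with identity sensitivity and rollout(x) = x, this loop drives the solution to the target inside the box bounds; in the paper's setting, f, argmin_jac, and rollout come from the trajectory costs and forward kinematics of Sec. III.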
III Task Constrained Joint Trajectory Optimization
This section formulates various examples of the task-constrained trajectory optimization problem and uses the previous section's results for optimal adaptation of joint trajectories under task perturbation. To formulate the underlying costs, we adopt the way-point parametrization and represent the trajectory by the joint angles at each time instant. Furthermore, the end-effector position and orientation are described in terms of Cartesian coordinates and Euler angles, respectively.
III-A Orientation Constrained Interpolation Between Joint Configurations
The task here is to compute an interpolation trajectory between a given initial and a final joint configuration while maintaining a specified orientation for the end-effector at all times. We model it through the following cost function.
The first term in the cost function models smoothness of the joint trajectory through finite differences [toussaint_GN]. For example, first-order smoothness is defined through the finite difference of the joint positions at subsequent time instants, while the second- and third-order finite differences model higher-order smoothness. We consider all three finite differences in our smoothness cost term. The second term ensures that the interpolation trajectory is close to the given initial and final points. The final term in the cost function maintains the required orientation of the end-effector.
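A minimal sketch of such a finite-difference smoothness term over a way-point matrix is shown below; the weights and array shapes are illustrative assumptions, not values from the paper.

```python
import numpy as np

def smoothness_cost(Q, w1=1.0, w2=1.0, w3=1.0):
    """Sum of squared first-, second-, and third-order finite differences of
    the way-point matrix Q (rows: time instants, columns: joints)."""
    vel = np.diff(Q, n=1, axis=0)    # ~ joint velocity
    acc = np.diff(Q, n=2, axis=0)    # ~ joint acceleration
    jerk = np.diff(Q, n=3, axis=0)   # ~ joint jerk
    return (w1 * np.sum(vel ** 2) + w2 * np.sum(acc ** 2)
            + w3 * np.sum(jerk ** 2))

# A straight-line way-point trajectory incurs only the first-order term:
# its acceleration and jerk differences vanish.
T, n_dof = 50, 7
line = np.linspace(np.zeros(n_dof), np.ones(n_dof), T)
```

Because the term is a fixed quadratic form in the stacked way-points, its Hessian is constant, which is what keeps the Argmin differentiation of Sec. II cheap.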
We can shape (13) in the form of (6a) by stacking the joint way-points into a single decision variable. The bounds then correspond to the maximum and minimum limits on the joint angles at each time instant. The parameter set consists of the initial and final joint configurations; that is, we are interested in computing the adaptation when either or both of these boundary configurations get perturbed.
Adaptation of the solution of (13) to different boundary configurations has applications in learning-from-demonstration settings, where the human provides only the initial and/or final joint configuration, and the manipulator then computes a smooth interpolation trajectory between the boundary configurations by adapting a previously computed trajectory.
Fig. 1 presents an example of the adaptation discussed above. The prior computed trajectory is shown in blue. It is then adapted to two different final joint configurations. The trajectory computed through Algorithm 1 is shown in green, while that obtained by re-solving the optimization problem (with warm-starting) is shown in red.
III-B Orientation-Constrained Trajectories Through Way-Points
The task in this example is to make the end-effector move through given way-points while maintaining a specified orientation at all times. Each time instant is associated with a desired way-point for the end-effector. Thus, we can formulate the following cost function for the current task.
The first two terms in the cost function are the same as in the previous example. The change appears in the final term, which minimizes the norm of the distance of the end-effector from the desired way-point. The definition of the decision variable remains the same as before; however, the parameter set is now the sequence of desired way-points.
Collision Avoidance: As shown in Fig. 2, a key application of the adaptation problem discussed above is collision avoidance. A reactive planner such as [reactive_torsten] can provide new via-points for the manipulator to avoid collision. Algorithm 1 can then use the cost function (14) to adapt the prior trajectory (blue) to the one shown in green. For comparison, the trajectory obtained by re-solving the trajectory optimization is shown in red.
Human-Robot Handover: Algorithm 1 with cost function (14) also finds application in human-robot handover tasks. An example is shown in Fig. 3, where the manipulator adapts the prior trajectory (blue) to a new estimate of the handover position. As before, the trajectory obtained with Algorithm 1 is shown in green, while the one shown in red corresponds to a re-solve of the trajectory optimization with warm-start initialization.
IV-A Implementation Details
The objective of this section is to compare the trajectories computed by Algorithm 1 with those obtained by re-solving the trajectory optimization for the perturbed parameters with warm-start initialization. We consider the same three benchmarks presented in Fig. 1-3, implemented on a 7-DoF Franka Panda arm, but over a diverse range of perturbation magnitudes. For each benchmark, we created a data set of 180 trajectories by generating random perturbations in the task parameters. For the benchmark of Fig. 1, the parameters are the joint angles, but in the following we use forward kinematics to derive an equivalent representation of the parameters in terms of end-effector position values.
Each joint trajectory is parameterized by a 50-dimensional vector of way-points. Thus, the underlying task-constrained trajectory optimization involves a total of 350 variables. We use Scipy-SLSQP [scipy] to obtain the prior trajectory and also to re-solve the trajectory optimization for the perturbed parameters. We did our implementation in Python using Jax-Numpy [jax] to compute the necessary Jacobian and Hessian matrices. We also used the just-in-time compilation ability of JAX to create an on-the-fly compiled version of our code. The line search in Algorithm 1 (line 2) was done through a parallelized search over a set of discretized step-length values. The entire implementation was run on a desktop with 32 GB RAM, an i7-8750 CPU, and an RTX 2080 GPU (8 GB). To foster further research in this field and ensure reproducibility, we open-source our implementation for review at https://rebrand.ly/argmin-planner
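The parallelized line search mentioned above can be sketched as follows. The paper's implementation uses JIT-compiled JAX; here we approximate the same idea with NumPy broadcasting, and the grid size and example cost are assumptions for illustration.

```python
import numpy as np

def parallel_line_search(batched_cost, x, step, alphas):
    """Evaluate every discretized step length in one batch and keep the best.

    batched_cost maps a (k, n) batch of candidate solutions to (k,) costs."""
    candidates = x[None, :] + alphas[:, None] * step[None, :]  # (k, n) batch
    costs = batched_cost(candidates)
    k = int(np.argmin(costs))
    return alphas[k], candidates[k]

# Illustrative quadratic pull toward a target point.
alphas = np.linspace(0.0, 1.0, 64)
target = np.array([0.3, -0.2, 0.1])
batched_cost = lambda X: np.sum((X - target) ** 2, axis=1)
alpha, x_new = parallel_line_search(batched_cost, np.zeros(3), target, alphas)
```

In JAX, the same batch evaluation would typically be expressed with vmap and wrapped in jit, so all candidate step lengths are scored on the GPU in a single fused call.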
IV-B Quantitative Results
Orientation Metric: For this analysis, we compared the pitch and roll angles at each time instant along the trajectories obtained with Algorithm 1 and the re-solving approach. Specifically, we computed the maximum of the absolute difference (i.e., the infinity norm) between the two orientation trajectories. The yaw orientation in all these benchmarks was a free variable and is thus not included in the analysis. The results are summarized in Fig. 4-6. The histogram plots in these figures are generated for the medium perturbation ranges (note the figure legends). For the Fig. 1 benchmark related to the cost function (13), Fig. 4 shows that all the trajectories obtained by Algorithm 1 have an orientation difference of less than 0.1 rad. For the benchmark of Fig. 2, which we recall involves perturbing the via-points of the end-effector trajectory, the histograms of Fig. 5 show similar trends: all the trajectories computed by Algorithm 1 achieved a similar orientation difference. For the benchmark of Fig. 3, pertaining to perturbation of the final position, the trajectories obtained by Algorithm 1 also maintained an orientation difference of 0.1 rad with respect to the re-solving approach.
Task Residual Ratio Metric: For this analysis, we compare the task residuals between trajectories obtained from Algorithm 1 and the re-solving approach. For example, for the benchmark of Fig. 1, we require the manipulator's final configuration to be close to the specified value (recall the cost (13)) while maintaining the desired orientation at each time instant. Thus, we compute this residual for Algorithm 1 and compare it with that obtained from the re-solving approach. As before, and to be consistent with the other benchmarks, we convert the residual in joint angles to position values through forward kinematics. A similar analysis follows for the other benchmarks as well. For ease of exposition, we divide the task residual of Algorithm 1 by that obtained with the re-solving approach. A ratio greater than 1 implies that the former led to a higher task residual than the latter, and vice versa. Similarly, a ratio close to 1 implies that both approaches performed equally well.
The results are again summarized in Fig. 4-6. From Fig. 4, we notice that the majority of the trajectories have a residual ratio below a small threshold. For the experiment involving via-point perturbation in Fig. 5, the fraction of trajectories below the same residual-ratio threshold drops. Meanwhile, as shown in Fig. 6, a large share of the trajectories also maintains a low residual ratio in the final-position perturbation benchmark of Fig. 3.
Velocity Smoothness Metric: For this analysis, we computed the difference in the velocity smoothness cost (the norm of the first-order finite differences) between the trajectories obtained with Algorithm 1 and the re-solving approach. The results are again summarized in Fig. 4-6. For all the benchmarks, in a large majority of the examples the difference was less than 0.05, which is a small fraction of the average smoothness cost observed across all the trajectories from both approaches.
The figures also report the first quartile, median, and third quartile of the three metrics discussed above for different perturbation ranges.
For the benchmark of Fig. 1, trajectories from Algorithm 1 maintain an orientation difference of less than 0.1 rad with respect to the trajectories of the re-solving approach for perturbations as large as 40 cm. The difference in smoothness cost over the same range is also small, with a small median value. The median task residual achieved by Algorithm 1 is only marginally higher than that obtained by the re-solving approach. For the benchmark of Fig. 2, the performance on the orientation metric remains the same, but the median difference in smoothness cost increases to 0.04 and the task residual ratio also increases for the largest perturbation range. The benchmark of Fig. 3 follows a similar trend in the orientation and smoothness metrics but performs significantly worse in task residuals. For the largest perturbation range, Algorithm 1 leads to substantially higher median task residuals. However, importantly, for perturbations up to 30 cm, the task residual ratio is close to 1, suggesting that Algorithm 1 performed as well as the re-solving approach for these perturbations.
Computation Time: Table I contrasts the average timing of our Algorithm 1 with the approach of re-solving the trajectory optimization with warm-start initialization. As can be seen, our Argmin differentiation based approach provides a worst-case speed-up of 160x on the benchmark of Fig. 3. For the rest of the benchmarks, this number varies between 500x and 1000x. We believe that this massive gain in computation time offsets the slight degradation in the orientation, smoothness, and task residual metrics that Algorithm 1 incurs compared to the re-solving approach. Note that the high computation time of the re-solving approach is expected, given that we are solving a difficult non-convex problem with 350 decision variables. Even highly optimized planners like [dimitry_manifold_planning] show similar timings on closely related benchmarks [const_x].
V Conclusions and Future Work
We presented a fast, near real-time algorithm for adapting joint trajectories to task perturbations as large as 40 cm in the end-effector position, almost half the radius of the Franka Panda arm's horizontal workspace used in our experiments. By consistently producing trajectories similar to those obtained by re-solving the trajectory optimization problem, but in a small fraction of the time, Algorithm 1 opens up exciting possibilities for reactive motion control of manipulators in applications like human-robot handover.
In future work, we will extend our formulation to problems with dynamic constraints such as torque bounds. We conjecture that by coupling the way-point parametrization with a multiple-shooting-like approach, we can retain the constraints as simple box bounds on the decision variables and consequently preserve the computational structure of Algorithm 1. We are also currently evaluating our algorithm's performance in applications such as autonomous driving.