Optimizing Dynamic Trajectories for Robustness to Disturbances Using Polytopic Projections

03/01/2020 ∙ by Henrique Ferrolho, et al. ∙ 0

This paper focuses on robustness to disturbance forces and uncertain payloads. We present a novel formulation to optimize the robustness of dynamic trajectories. A straightforward transcription of this formulation into a nonlinear programming problem is not tractable for state-of-the-art solvers, but it is possible to overcome this complication by exploiting the structure induced by the kinematics of the robot. The non-trivial transcription proposed allows trajectory optimization frameworks to converge to highly robust dynamic solutions. We demonstrate the results of our approach using a quadruped robot equipped with a manipulator.



There are no comments yet.


page 1

page 6

page 7

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

When an external force is applied to a legged robot with a manipulator it may cause the robot to slip, or to fail to track a path with its end-effector. Similarly, the performance degrades when the robot poorly estimates how slippery the ground is or how heavy is its payload. In either case the motion fails because completing the task while compensating for the external force requires the robot to either command more torque to its actuators than they are capable of delivering, to produce unrealistic contact forces, or both. These limitations impose constraints that the robot motion has to satisfy. Therefore, one way to look at robustness is to define it as some metric of distance to these constraints, for instance, as the force the robot can compensate for before violating the motion constraints. This kind of robustness could be optimized over by the robot controller, however, considering robustness during motion planning would allow us to avoid difficult to execute motions altogether.

We tackle the problem of robustness against external perturbations and unmodeled payloads, for complex legged robots with manipulation capabilities, at the planning level. In our previous work, we used the SUF applied at some link of the robot as a robustness metric for improving single configurations via convex conic optimization. In this work, we propose a novel formulation to make the computation more tractable and versatile, allowing us to consider the optimization of entire trajectories with nonlinear dynamics. Our new computational framework enables us to combine TO with the SUF metric to produce highly robust and dynamic trajectories.

Fig. 1: A legged loco-manipulation system: ANYmal [1] is a fully torque-controlled quadruped robot. We equipped it with a Kinova Jaco [2] robot arm. An accompanying video is available at https://youtu.be/KK-ZCWsEGGg.

I-a Related Work

[3] presented a motion planning and control framework for a platform similar to ours (see Fig. 1). The authors demonstrated successful execution of tasks such as opening a door and carrying a box alongside a human. The authors addressed robustness to external disturbances with an inverse dynamics-based whole-body controller and by re-planning locomotion continuously in a receding-horizon fashion. However, contrary to our approach, they did not take into account robustness explicitly at the planning-level.

[4] proposed a solution to improve the robustness to joint-torque tracking errors at the control stage. The authors modeled deterministic and stochastic uncertainties in joint torques within their control framework optimization. Their idea is similar to what we present in this paper, but we maximize the upper-bound force magnitude the system can withstand from any possible direction—and we do this during planning.

The authors of [5] included external forces estimation directly into their hierarchical controller. Their objective was to minimize actuator torques while enforcing constraints for the contact forces. However, contrary to our work, they did not enforce actuator limitations explicitly.

Modeling the capabilities of the system explicitly using polytopes has recently become more popular than using simplified metrics for robustness. In [6], the authors derived the equations of a so-called GIWC. It is a feasible region used as a general stability criterion. This representation is very efficient for testing robust static equilibrium of a legged robot, but it fails to take into account any actuation limits. [7] proposed to incorporate the properties of [6] with system torque limits. They use the resulting polytopes to optimize the com trajectory in the -plane for the base-transfer motion of a quadruped. Despite the reduced size of this problem, the technique used to compute polytopes was prohibitively expensive, and as a workaround they computed polytopes once at the beginning and used them as an approximation for the remaining motion.

We have followed this line of research in our previous work [8] and we proposed a force polytope representation considering system dynamics: the residual force polytope. The polytope is computed from the forces and torques remaining after accounting for Coriolis, centrifugal, and gravity terms, as well as nominal motion feed-forward torques.

All the polytope calculations proposed in the literature [9, 7, 8] require a significant amount of computation time. In general, deriving an explicit description of a projected polytope is NP-hard [10]. As a result, prior work using polytopes in trajectory optimization, e.g., [7], resorted to the approximation of fixing the polytope for an entire trajectory.

[11] recently formulated a computationally tractable approach for finding maximally sized convex bodies inscribed in a projected polytope. Their scheme does not require an explicit description of the projection and works by combining Fourier-Motzkin elimination with techniques from adjustable robust optimization. The scheme was adapted for robustness computations in robotics in [12], where the SUF were estimated for static configurations. However, despite an improvement over exact computation, due to the computational complexity of their formulation it was not previously possible to consider trajectory optimization of full system dynamics and maximization of robustness based on dynamic polytopes at the same time. We further adapt the technique of [11] to reformulate the problem of computing the SUF. The resulting reformulation allows to consider trajectory optimization and robustness maximization in a bilevel optimization setting.

Bilevel optimizations are mathematical programs that include the solution to other programs in their constraints or objectives. They are common in robustness settings, and have been used for robust control of robots. [13] optimized trajectories with full dynamics for robustness as a bilevel problem; it is particularly related to our work, but with some key differences: they considered robustness to noise and dealt with a fixed base manipulator—both differences allowed for simplifications in their optimization problem.

I-B Statement of Contributions

We present a TO framework capable of planning robust and dynamic manipulation tasks for legged robots, such as the one shown in Fig. 1. Our main contributions are:

  1. Proposal of a novel solution to a bilevel optimization problem that marries dynamic trajectory optimization with maximization of robustness against disturbances.

  2. Explanation of the non-trivial transcription and reformulation required to make this problem tractable for a NLP solver.

  3. Comparison of our method’s results against a traditional optimization objective across different scenarios.

  4. Validation of the planned motions using both full-physics simulation and real-life hardware experiments.

We will also make our framework/implementation available upon acceptance/publication of this paper.

Ii Trajectory Optimization

Trajectory Optimization (TO) is a well-known and powerful framework for planning locally-optimal trajectories of dynamic systems such as legged robots subject to constraints. TO falls under the broader category of optimal control problems. In general, TO aims to design a finite-time control trajectory as a function of time, , which drives the system from an initial state towards a final state , and given the system dynamics which must be satisfied over the entire interval . Optimal control problems can be solved using dynamic programming or by means of transcription (see [14]).

In this work, we employ a technique called direct transcription because it readily handles strict constraints on states and controls. Such constraints take a key role in computing the SUF. The main alternative, DDP, offers faster computation and provides a linear controller next to the optimized trajectory. However, handling constraints with DDP is a challenging subject of research [15, 16]. This currently makes DDP less applicable to our case. A second alternative, shooting methods, have been reported to result in slower computations and higher susceptibility to local optima [14]. Using direct transcription, we formulate the continuous optimization problem by explicitly discretizing the system state and control trajectories. This method results in the formulation of a large and sparse NLP problem [14]. The resulting constrained nonlinear optimization problem can then be solved using a sparse, large-scale nonlinear programming solver such as Knitro [17].

Iii Model Formulation

The model of a legged robot can be formulated as a free-floating base to which limbs are attached. For the specific case of the robot shown in Fig. 1, the kinematic tree stemming from the base branches into four legs and one robotic manipulator with six DoF. The motion of the system can be described wrt a fixed inertial frame . Let us express the position of the base wrt the inertial frame, expressed in the inertial frame, as . Let be a Hamiltonian unit quaternion defining the orientation of the free-floating base wrt the inertial frame, and let be the MRP [18, 19] of the unit quaternion .111 is the affinely extended set of real numbers. We use the same notation as [19]. We use to parameterize the orientation of the free-floating base.222 The MRP encode a 3D rotation with the stereographic projection of a Hamiltonian unit quaternion. The derivatives of the rotation matrix wrt the MRP parameters are rational functions, making this representation a particularly good choice for purposes of differentiation and optimization.

The joint angles describing the configuration of the 6-DoF arm and the four 3-DoF legs are stacked in a vector

, where . The generalized coordinates vector and the generalized velocities vector of this floating-base system may therefore be written as


where and the twist encodes the linear and angular velocities of the base w.r.t. the inertial frame expressed in the and frames, respectively. The equations of motion of a floating base system that interacts with the environment are written as


where is the mass matrix and is the vector of Coriolis, centrifugal, and gravity terms. The selection matrix selects which DoF are actuated. Here, as all limb joints are actuated. The vector of ground-feet contact forces and contact torques is mapped to joint-space torques through the support Jacobian , which is obtained by stacking the Jacobians which relate generalized velocities to limb end-effector motion as , with being the number of limbs in contact and the total dimensionality of all contact wrenches. We assume ANYmal has point-feet and thus we only model linear contact forces at the feet. Finally, represents any external force applied to the end-effector. This force may be the result of a push or some unpredicted disturbance. In a nominal scenario, this force is zero, i.e., . The Jacobian maps a linear force applied at the end-effector to joint-space torques.

Iv Problem Formulation

We transcribe the continuous optimization problem by explicitly discretizing the system state and the control trajectory using a direct transcription technique. We divide the trajectory into equally spaced segments or intervals


where the points are referred to as mesh points.333Some authors also refer to these mesh points as nodes, knots, way points, or grid points. The number of mesh points is given by . Henceforth, we use and to indicate the value of the state and control variables, respectively, at mesh point . We treat the values of and as a set of NLP variables, and we finally formulate the general TO problem as:


where and form an optional cost function, is a fixed integration step size, and and are the sets of feasible states and inputs, respectively. We use the explicit Euler method to integrate the differential equations of the system dynamics, but other K-stage Runge-Kutta schemes could be used, e.g., the Trapezoidal method (implicit, ) or the Hermite-Simpson method (implicit, ).

Iv-a Parameterization

Similarly to [20], we directly optimize over the space of feasible states, control inputs, and constraint forces, i.e., for each discretized mesh point , the vectors of generalized coordinates , generalized velocities , control inputs , and contact forces form the vector of decision variables . The entire vector of NLP decision variables is:


Treating the contact forces as optimization parameters increases the number of decision variables and constraints but the problem becomes more tractable and better conditioned.

Iv-B Objectives

We consider three different optimization objectives . The first objective corresponds to the feasibility problem, i.e., a problem with constraints but without any cost to minimize.

The second objective achieves the minimization of actuator torques and is defined as


Finally, objective corresponds to the maximization of the SUF at the end-effector. involves a problem reformulation which is explained in Section V-A—the main contribution presented in this manuscript.

Iv-C Constraints

We now analyze the constraints formulated in the NLP in detail. TABLE I shows a summary of these constraints.

Iv-C1 Bounds on Decision Variables

We constrain the joint positions, velocities, and torques to be within their respective lower and upper bounds with (7)–(9).


We further fix the initial and final velocities to zero:


Iv-C2 Friction Cones

Similarly to [6], we model friction at the contact points using an inner linear approximation with a four-sided friction pyramid. Consider the set of points where the robot is in contact with its environment. Let and be the unit normal and the friction coefficient of the support region at each contact, respectively. A point contact remains fixed as long as its contact force lies inside the linearized friction cone directed by :


where form the basis of the tangential contact plane such that is a direct frame.

Iv-C3 System Dynamics

We enforce the nonlinear system dynamics () with a finite set of defect (or gap) constraints in our NLP formulation:


Iv-C4 Stationary Feet

Let the forward kinematics function for a foot-point contact be given by .


Constraining the feet positions with equality constraints works for a 1-stage Runge-Kutta integration scheme, but for higher-order schemes constraining the feet acceleration is preferred.

Iv-C5 Gripper Task

The gripper is constrained at the initial and final instants of the trajectory ( and ). For both of these mesh points, there exist five constraints: three to constrain the placement of the end-effector, and two for constraining the pitch and roll describing the orientation of the end-effector. This enforces a specific location for the pick and placing of a bottle (e.g., see Fig. 2), as well as the correct orientation of the fingers to embrace it, while leaving the grasp yaw

as a degree of freedom to the solver.

Fig. 2: The figure shows the robot at the beginning of a pick-and-place task. The ground-feet contact forces are shown in yellow and the friction cones are shown in red.
Constraint Structure Relation
Bounds on Linear Mixed
Friction Cones Linear Inequality
System Dynamics Nonlinear Equality
Stationary Feet Nonlinear Equality
Gripper Task Nonlinear Equality
TABLE I: Summary of the formulated NLP constraints.

Building on the formulation above, we now want to implement additional terms to model the external disturbances. We are interested in maximizing the forces applied at the end-effector that the robot can compensate for while still satisfying all the NLP constraints from TABLE I.

V Robustness to Disturbances

External forces applied to the robot can cause the robot to slip, lose contact between a foot and the environment, or to fail to track the desired end-effector path. In each of these cases, the motion fails because the external force causes a violation of one of the motion constrains. We therefore define robustness as some metric of distance to the constraints. More specifically, we consider the friction cone constraints on the contact forces and the actuator torque bounds limiting the control commands that can be used to compensate for external forces. As pointed out by [7], when transformed into a common reference frame, these constraints form a convex polytope bounding the volume of all admissible external wrenches applied to the robot. In [8] we proposed a geometric method to inscribe a ball into the polytope and use its radius as a metric of robustness. This approximation is especially useful since the radius of the maximum volume inscribed ball gives a bound for the magnitude of forces from any direction that the system can compensate for without violating the constraints.

V-a Maximum Volume Inscribed Ball of a Polytopic Projection

In order to reject a disturbance force, additional motor torques and ground reaction forces are needed. Our robustness metric, the SUF, is the smallest force for which no reaction forces/torques exist that also satisfy friction-cone constraints and motor limitations. In previous work [8], the SUF was computed via a LP problem. The trajectory optimization would have this LP problem inside its objective, which is not desirable. Hence, we propose a new way to compute the SUF. The key idea in this robustness analysis is to approximate the relationship between the disturbance force and reaction forces/torques as affine. Adapting the results from [11], we find a practically efficient way to simultaneously optimize the robustness metric and the affine relationship prescribing it. These results go beyond earlier adaptations in robotics by [12], as those, like our work relying on LP, were not suitable for use in a trajectory optimization setting.

Let us define the extended torques and ground-feet contact forces as and , respectively:


where, and are the nominal torques and ground-feet contact forces, and are some gain matrices which map a force expressed in end-effector space to joint-torque space and ground-feet contact space, respectively, and is some external force applied at the end-effector. In a nominal situation there are no forces applied to the end-effector, i.e., and thus and . Replacing and in the equations of motion (2) with the right-hand side of (16) and (17) leads to:


Alternatively to constraints (9) and (11)–(12), we can represent the actuator torque bounds and friction cones constraints using and as:


We then substitute (16)–(17) into (19)–(20) and for each row of matrix we write the constraint as:


where is the radius of the maximum volume inscribed ball of a polytopic projection. We then define , where is a vector with unit length, which allows us to find the greatest with:


The objective function of the left-hand side of equation (22) can be seen as a scalar product of the vectors and , which is greatest when these vectors are collinear:


Simplifying (22) with the right-hand side of (23) leads to:


Equations (21)–(24) address the constraints on actuation limits. We repeat the same process for the ground-feet contact forces to obtain:


Next, we substitute and into (18), (24) and (25) and write:


This substitution removes the bilinear products between , and while keeping the equality and inequalities valid. For fixed trajectories, the above constraints are convex.

V-B Constraints’ Structure Exploitation

We now extend the problem formulation and transcribe the constraints (26)–(28) directly into NLP constraints and we extend the vector of decision variables with , , and . However, by trying this, one will soon realize we face an NP-hard problem. Additionally, there is a significant increase in the amount of decision variables, and the nonlinear constraint (26) introduces non-convex terms. Ultimately, this quickly renders any efforts of a naïve transcription ineffective, as the solver would be unable to digest it.

In order to solve this issue, we have to analyze the inherent structure of the problem and its constraints. Let be the unknown in constraint (26). Splitting the structure of the constraint as


highlights that can be obtained as a function of and without performing any inversions. Doing this satisfies the bottom equality implicitly. The top part of the equality affecting the floating base still needs to be enforced.

This key-insight into the structure of the constraints allows us to transcribe the problem so that the solver will converge successfully.

(a) “Steps” terrain.
(b) “Ramp” terrain.
(c) “Valley” terrain.
(d) “Handstand” scenario.
Fig. 3: We set up a multitude of terrains for testing a pick-and-place task: flat ground, slabs at different heights (a), and inclined supports (b)–(c). An extreme scenario where the robot performs a “handstand” is shown in (d). The plots underneath each scene show the SUF (in newtons) for the trajectories computed using different optimization objectives.

V-C NLP Reformulation

V-C1 Parameterization

We extend the previous definition of to accommodate for the extra decision variables required. Recall that need not be discretized.


V-C2 Objective

is the sum of all the in :


V-C3 Constraints

We bound all the in to with a linear one-sided inequality:


We enforce the top part of constraint (29) explicitly:


Finally, (27) is rewritten as:


A summary of the constraints added to the NLP with the reformulation is shown in TABLE II.

Constraint Structure Relation
Bounds on Linear Inequality
Equation (33) Nonlinear Equality
Equation (34) Nonlinear Inequality
Equation (28) Conic Inequality
TABLE II: Summary of the reformulated NLP constraints.

Vi Performance evaluation

In order to evaluate our work, we compared the robustness of the three objective functions proposed in Section IV-B: feasibility (), minimum torques (), and maximum SUF (). We ran the comparison across different scenarios for a pick-and-place task. Furthermore, we benchmarked the times taken to evaluate all NLP constraints and the convergence times for problems of different sizes.

Fig. 3 shows four different settings for a pick-and-place task of a bottle on a table. We set up scenarios with challenging terrain, where the robot stands on steps with different heights or inclined slabs. The trajectories optimized with our method () demonstrated greater robustness, as shown in plots (a)–(c) in Fig. 3.

Fig. 4:

Mean and standard deviation of the SUF at the end-effector for varying inclinations on the “ramp” scenario.

We also verified the robustness of trajectories for different inclines. For this, we varied the grade of the slopes in the “ramp” scenario (Fig. 2(b)) from to . The trajectories computed with our metric consistently showed a greater SUF for all the tested slopes (see Fig. 4).

(a) Position
(b) Velocity
(c) Effort
Fig. 5: Joint positions, velocities, and torques of ANYmal for a 2-seconds long trajectory on flat ground. The dotted lines correspond to the planned trajectory. The solid lines show the data collected by the state estimation on the real robot.

As an extreme example, we created a scenario where the robot has to perform a “handstand”, i.e., support its own weight on two of its legs (see Fig. 2(d)) while using the remaining two for keeping its balance. In this scenario, it is especially important for the robot to press downwards against the floor and upwards against the ceiling to maintain stability. Because of this, torque minimization () is not an appropriate objective for this scenario, as confirmed by the degraded SUF when compared to the initial seed in plot (d). On the other hand, our method is able to increase the robustness of the initial seed by a small amount, because it allows to trade off torque expenditure for more stable ground/ceiling-feet contact forces. We would like to emphasize that the motion in all of the scenarios is within the actual physical capabilities of the robot, even the extreme “handstand” scenario.

Constraint Function () Jacobian ()
Gripper Task
Stationary Feet
System Dynamics
SUF Constraints
TABLE III: Times taken to evaluate the NLP constraints.

TABLE III shows the computation times for function and Jacobian evaluations of the NLP problem constraints. The longest time is spent computing the Jacobian of the system dynamics. Evaluating the Jacobian of the SUF—which is involved when optimizing —takes the second longest time.

() () ()
TABLE IV: Convergence times of objectives for problems with different size: , , and mesh points ().

TABLE IV shows the total time it takes for the solver to converge for problems of different size. We benchmarked problems with , , and mesh points (for a 1-second trajectory, this is the equivalent of a discretization at , , and ). Each average and standard deviation are taken from five samples. is the fastest to solve as it is a feasibility problem and does not consider any optimality function. In our tests, the overall best robustness of (shown in Fig. 3) also comes with the trade-off of the longest times required until convergence.

All evaluations in this section were carried out in a single-threaded process on an Intel i7-6700K CPU with and memory. The proposed optimization framework has been implemented using Julia [21] and the optimization library Knitro [17]. The chosen solving algorithm was the interior-point method555 Interior-point methods (also known as barrier methods) replace the NLP problem by a series of barrier subproblems controlled by a barrier parameter. They are generally preferable for large-scale problems.  presented by [22].

Vii Experiments

We conducted hardware experiments on an ANYmal [1] quadruped equipped with a Kinova Jaco [2] robot arm. The motion planning is performed a priori and the optimized trajectories are then sent to the controller for playback.

Vii-a Robot Control

To execute our whole-body motions, we tracked the joint position with feed-forward velocity and torque. We updated the references for joint position, joint velocity, and joint torque along with proportional and derivative gains at . The decentralized motor controller at every joint closes the loop compensating for friction effects. During our experiments, we used as proportional and as derivative joint space gains for each leg, respectively. The arm used Kinova’s driver for joint trajectory control. We synchronized the execution of both controllers.

In order to evaluate how well the real robot tracks motions using our controller, we compared the planned joint states over time with the state estimation data from the robot. We computed a 2-seconds long trajectory using our framework for a pick-and-place task sampled at and commanded the robot at the same frequency. Fig. 5 shows the plots of the planned trajectory against the data collected during our experiments. The plots show that joint positions are within acceptable tolerances and joint velocities are tracked well, but joint efforts are significantly different. This validates that the motions generated by our trajectory optimization are dynamically consistent. The mismatch in joint efforts is expected due to differences between the real robot and our model, and also due to signal delays. Additionally, as we executed our trajectory open-loop without re-planning, the errors accumulated. Nonetheless, the tracking controller can execute the dynamic motions we planned.

Vii-B Description of the Experiments

We executed the pick-and-place of a bottle on a table for different terrain: on flat ground and on a “ramp” (Fig. 5(a)). The object being grasped was not modeled and it is therefore an external disturbance. We also tested the trajectories for ground-feet friction coefficient mismatches by placing a skateboard underneath the feet of the robot (Fig. 5(b)). A video is available at https://youtu.be/KK-ZCWsEGGg.

(a) “Ramp” terrain
(b) “Skateboard” scenario
Fig. 6: Snapshots of the real robot executing the planned motions on a ramp (see Fig. 2(b)) and on a skateboard.

For the motions shown in the video, we optimized the trajectories at

and then linearly interpolated them to

. It was the interpolation result that was then tracked by the controller. We did this because computing optimal trajectories with gets more computationally expensive as the problem discretization increases (see TABLE IV).

To select the frequency of the trajectory before interpolation, we computed the root-mean-square error (RMSE) of the SUF over time for the same trajectory using different discretization resolutions, with a resolution as a baseline. As shown in Fig. 7, for a trajectory discretized at its SUF , which is acceptable for our purposes.

Fig. 7: Root-mean-square error (RMSE) in newtons of the SUF given different discretizations, for a baseline.

Viii Future Work

Robustness Through Environment Exploitation: This work has focused on isotropic robustness, but for some tasks it may be more appropriate to be able to resist disturbances along particular directions, e.g., as in Fig. 8. Based on our work [8], it should be straightforward to adapt Section V-A to maximize either the volume of an ellipsoid or the magnitude of a specific vector inscribed in the projection of the dynamic force polytope.

Fig. 8: Left: Door rotating over its hinges as a pivot point. Right: Door sliding back and forth on a frame.

Robust Dynamic Loco-manipulation: In this work, we did not study scenarios involving the making or breaking of support contacts with the environment. It would be interesting to adapt the trajectory optimization transcription to allow for multi-contact tasks or for dynamic loco-manipulation.

Minimization of Robustness Loss: Our framework maximizes the robustness of a trajectory assuming reliable execution. However, the system dynamics around a nominal trajectory are not linear, and given large enough perturbations the robot may be driven into areas of low robustness. This can be alleviated by accounting for neighboring states and taking the control noise into account.


We would like to thank Twan Koolen, João Moura, Romeo Orsolino, François Pacaud, Theodoros Stouraitis, and Matt Timmons-Brown for their valuable help and feedback.