Bipedal locomotion software for the humanoid robot platform iCub.
This paper contributes towards the development and comparison of Divergent-Component-of-Motion (DCM) based control architectures for humanoid robot locomotion. More precisely, we present and compare several DCM based implementations of a three layer control architecture. From top to bottom, these three layers are here called: trajectory optimization, simplified model control, and whole-body QP control. All layers use the DCM concept to generate references for the layer below. For the simplified model control layer, we present and compare both instantaneous and Receding Horizon Control controllers. For the whole-body QP control layer, we present and compare controllers for position and velocity control robots. Experimental results are carried out on the one-meter tall iCub humanoid robot. We show which implementation of the above control architecture allows the robot to achieve a walking velocity of 0.41 meters per second.READ FULL TEXT VIEW PDF
This paper contributes towards the benchmarking of control architectures...
Last decades of humanoid research has shown that humanoids developed for...
A common approach to the generation of walking patterns for humanoid rob...
Inflated continuum robots are promising for a variety of navigation task...
Whole Body Operational Space Control (WBOSC) is a pioneering algorithm i...
In this paper, a legless capsule robot (capsubot) comprised of a sealed
Achieving short-distance flight helps improve the efficiency of humanoid...
Bipedal locomotion software for the humanoid robot platform iCub.
Bipedal locomotion of humanoid robots remains an open problem despite decades of research in the subject. The complexity of the robot dynamics, the unpredictability of its surrounding environment, and the low efficiency of the robot actuation system are only a few problems that complexify the achievement of robust robot locomotion. In the large variety of robot controllers for bipedal locomotion, the Divergent-Component-of-Motion (DCM) is an ubiquitous concept used for generating walking patterns. This paper presents and compares different DCM based control architectures for humanoid robot locomotion.
During the DARPA Robotics Challenge, a common approach for humanoid robot control was that of defining an hierarchical architecture composed of several layers . Each layer generates references for the layer below by processing inputs from the robot, the environment, and the outputs of the layer before. From top to bottom, these layers are here called: trajectory optimization, simplified model control, and whole-body quadratic programming (QP) control.
The trajectory optimization layer often generates desired foothold locations by means of optimization techniques. To do so, both kinematic and dynamical robot models can be used [2, 3]. When solving the optimization problem associated with the trajectory optimization layer, computational time may be a concern especially when the robot surrounding environment is not structured. There are cases, however, where simplifying assumptions on the robot environment can be made, thus reducing the associated computational time. For instance, flat terrain allows one the view the robot as a simple unicycle [4, 5], which enables quick solutions to the optimization problem for the walking pattern generation .
The simplified model control layer is in charge of finding feasible center-of-mass (CoM) trajectories and is often based on simplified dynamical models, such as the Linear Inverted Pendulum Model (LIPM)  and the Capture Point (CP) 
. These models have become very popular after the introduction of the Zero Moment Point (ZMP) as a stability criterion. To obtain feasible CoM trajectories, the simplified model control layer often combines the LIPM with Model Predictive Control (MPC) techniques, also known as the Receding Horizon Control (RHC) [10, 11]. Another model that is often exploited in the simplified model control layer is the Divergent Component of Motion (DCM) . The DCM can be viewed as the extension of the capture point (CP) to the three dimensional case, however always under the assumption of a constant Virtual Repellent Point (VRP) to Enhanced Centroidal Moment Pivot point (eCMP) height difference . Attempts at loosening this latter assumption and extending the DCM to more complex models have also been presented .
The whole-body QP control layer generates robot positions, velocity or torques depending on the available control modes of the underlying robot. These outputs aim at stabilizing the references generated by the layers before. It uses whole-body kinematic or dynamical models, and very often instantaneous optimization techniques, namely, no MPC methods are here employed. Furthermore, the associated optimisation problem is often framed as an hierarchical stack-of-tasks, with strict or weighted hierarchies [14, 15].
This paper presents and compares several DCM based implementations of the above layered control architecture. In particular, the trajectory optimization layer is kept fixed with a unicycle based planner that generates desired DCM and foot trajectories. The simplified model control layer, instead, implements two controllers for the tracking of the DCM: an instantaneous and an MPC controller. In the same layer, we also present a controller that ensures the tracking of the CoM and the ZMP, which exploits 6-axes Force Torque sensors (F/T). Finally, the whole-body QP control ensures the tracking of the desired CoM and feet trajectories. It achieves this by presenting velocity and inverse kinematic based controllers. The several combinations of the control architecture are tested on the iCub humanoid robot  .
The paper is organized as follows. Sec. II introduces notation, the humanoid robot model, and some simplified models used for locomotion. Sec. III describes each layer of the control architecture, namely the trajectory optimization, the simplified model control and and the whole-body QP control layer. Sec. IV presents the experimental validation of the proposed approach, and shows an explanatory table comparing the different control approaches. Finally, Sec. V concludes the paper.
and are used to denote respectively the identity and zero matrices;
denotes an inertial frame;
is the canonical vector, consisting of all zeros but the-th component that is equal to one;
given two frames, and , represents the rotation matrix between the frames, i.e. given two vectors respectively expressed in and , the rotation matrix is such that ;
given a skew-symmetric matrixthe vee operator is ;
denotes the angular velocity between the frame and the frame expressed in the frame ;
given a square matrix the skew operator is , ;
the subscripts , , and indicates the frames attached to the torso, left foot, right foot and CoM.
A humanoid robot is an example of floating base multibody systems composed of links connected by
joints with one degree of freedom. Since none of the links of the robot has an a priori constant position and orientation with respect to a global frame, its configuration can be determined by the position and the orientation of the base frame (e.g. the stance foot during the locomotion) and the joints values. Thus the configuration space is defined by . An element of is then a triplet , where is used to represent the position and the orientation of the base frame, , expressed with respect to the inertial frame ; and represents the joint angles. Furthermore, according to the group theory, is a Lie group. Indeed given two elements and the group multiplication is defined as , while the inverse of is given by: . Since is a Lie group, the velocity of the multibody system is represented by the Lie algebra . An element of the Lie algebra is a triplet where is the angular velocity of the base with respect the inertial frame whose coordinates are written in the inertial frame, i.e. .
Given a link of the floating base system its position and orientation with respect to the inertial frame is uniquely identified by an homogeneous transformation, , between the inertial frame, , and the frame attached to that link, . is the classical homogeneous transformation containing the rotation matrix, , and the vector connecting the origin of the inertial frame to origin of the frame expressed in the inertial frame .
Given a link of the floating base system the velocity of a rigid body can be represented by .
The Jacobian is the map between the robot velocity and the linear and angular velocities of the frame , i.e. where can be split into two parts, one related to the velocity of the base and the other one multiplying the joints velocity, .
For the purpose of this work, the motion of the humanoid robot is approximated by means of the well known Linear inverted pendulum model (LIPM) . By using the LIPM, in case of walking on a flat surface the CoM belongs to an horizontal plane with a constant height . The simplified CoM dynamics is given by :
where is the vector containing the projection of the CoM on the walking surface, is the position of the zero moment point (ZMP) and is the inverse of the pendulum time constant, i.e. where is the gravity constant.
Analogously, one can define the divergent component of motion (DCM) as . Clearly, the DCM time derivative is given by:
Using the DCM as state variable, the LIPM dynamics (1) can be decomposed into two parts:
Performing the state space decomposition we can easily show that the CoM tends the DCM, while the DCM dynamics has a strictly positive real part eigenvalue.
This section summarizes the components of the control architecture presented in Fig. 2. In particular, the control architecture is composed of three main layers. The first layer is represented by the trajectory generator, whose main purpose is to generate desired footstep positions and orientation and also the desired DCM trajectory. The second layer employs simplified robot models to track the desired DCM, CoM and ZMP trajectories. Finally, the third control layer is given by the whole-body QP block. It has the main purpose of ensuring the tracking of the desired feet positions and orientations and also the CoM trajectories. This, differently from the previous control layer, exploits whole-body robot kinematic models and feedback.
The main purpose of this layer is to evaluate the desired footstep positions and the desired feet and DCM trajectories.
To plan the desired footstep positions, the humanoid robot is approximated as an unicycle . The feet are represented by the unicycle wheels, and the footsteps can be obtained through sampling of the unicycle trajectories. In particular, given the finite set of unicycle positions, we can sample particular feet positions. Each position is associated with a time instant . This time instant is considered as the foot impact time
. Thus, the impact time can be used as a decision variable, which allows us to select the feet position and avoid fast/slow step duration and long/short step length. Furthermore, the footsteps are planned to avoid excessive rotation between the two feet positions that could be unfeasible because of joint limits. Once the footsteps are planned, the desired feet trajectory is obtained by cubic spline interpolation.
The DCM is chosen so as to satisfy the following time evolution:
where is the initial position of the DCM, is the position of the ZMP (placed on the center of the stance foot) and has to belong to the step domain where is the duration of the -th step. Assuming that the final position of the DCM coincides with the ZMP at last step (i.e. ), (4) can be used to find the desired DCM position at the end of each step 
where and are respectively the desired DCM initial and final positions for the -th step.
The DCM velocity can be easily evaluated via differentiation of (6): . In light of the above, the DCM trajectory along the walking pattern can be computed recursively. The presented planning method is very powerful and it allows us to generate the desired DCM trajectory in real-time. Nevertheless, it has the main limitation of taking into account single support phases only. Indeed, by considering instantaneous transitions between two consecutive single support phases, the ZMP reference is discontinuous. This leads to the discontinuity of the external forces and also of the desired joint torques. The development of a DCM trajectory generator that handles non-instantaneous transitions between two single support phases becomes pivotal .
Since the ZMP is related to the DCM position and velocity, namely , a DCM reference trajectory with continuous derivative ( continuity) guarantees a continuous ZMP trajectory. This can be easily ensured using a third order polynomial to smooth the edges of the DCM reference trajectory evaluated using the exponential technique :
where the parameters for have to be chosen in order to satisfy the velocity and position boundary conditions, namely:
Here and are the desired DCM position and velocity at the beginning of the double support phase, and are the desired DCM position and velocity at the end of the double support phase, and are the initial and final instant of the double support phase.
As discussed in Sec. II-C, the simplified model (3) shows that the CoM asymptotically converges to a constant DCM, while the DCM has an unstable first order dynamics. Then, we present in this section control laws that aim at stabilizing the DCM dynamics only assuming the as control input.
This stabilization problem has been tackled by designing and comparing instantaneous and predictive (MPC) control laws. Differently from the MPC, the instantaneous control law does not solve any optimization problem and it uses only the current desired and actual position of the DCM to evaluate its output.
Differently from , we propose an instantaneous control law that does not perform a cancellation of the unstable system dynamics, which, in practice, is never perfect and may lead to system instabilities. More precisely, we choose
It is easy to show the DCM error and its integral converge asymptotically to zero. The above law (7) is very simple to implement and guarantees the tracking of the desired DCM. The main limitation is that may not ensure the feasibility of the gait since the position of the ZMP may exit the support polygon. Furthermore, the above instantaneous control law does not take into account the future planned desired DCM trajectory.
In order to overcome these issues, a model predictive controller can be designed .
In the MPC framework, the DCM dynamics (2) is used as a prediction model and it is discretized supposing piecewise constant ZMP trajectories:
In order to ensure that the stance foot does not rotate around one of its edges, the desired ZMP must not exit the support polygon . This is verified by means of a set of linear inequality constraints:
where and are time variant and their dimension depends on the type of support.
The cost function shall then ensure the tracking of the desired trajectory. In particular, we choose the following cost:
where , and are positive symmetric matrices and is length of the preview window.
The terms (13a) (11a) guarantee the tracking of the given reference trajectory, while (12a) is added for obtaining a smoother ZMP trajectory. Even if the cost function is time-dependent, it is always positive and convex.
The MPC problem can be summarized as follows:
where satisfies , is the desired ZMP computed at the previous control iteration and
is the estimated DCM position.
Since the cost function is a quadratic positive function and the constraints are linear, the optimal control problem is quadratic and it can be converted into a strictly convex quadratic programming problem (QP) of the form:
Here the optimization variables are stacked inside the vector . The Hessian matrix and the gradient vector can be obtained from (11a). The inequality constraint matrix and vector embed the ZMP constraint (10). While the equality constraints, and are determined using the prediction model (9).
Independently from the chosen DCM controller, namely either the controller in Sec. III-B1 or in III-B2, one obtains a desired ZMP and CoM position and velocity to be stabilised. As consequence, another control loop is needed after the DCM controller. In this paper, we choose the proposed control law , i.e.:
The main control objective for the whole-body QP control layer is to stabilise some robot kinematic quantities by using the entire robot body. We here choose to track the position of the CoM, the torso orientation, and the left and right feet position and orientation. To do so, we use a stack of tasks formulation. The tracking of the feet poses and of the CoM position is considered as high priority tasks (hard constraint), while the torso orientation is considered as a low priority task (soft constraint). Furthermore, a postural condition is also added as a low-priority task. Then, the following cost function is defined:
with and where . This latter control law guarantees almost-global stability and convergence of to .
The postural task (16), with , is achieved by asking for a desired joints velocity that depends on the error between the desired and measured joints position
where is a positive definite matrix.
Concerning the hard constraints, we have:
where is the linear velocity of the CoM, and are respectively the desired left foot and right foot velocities. More specifically , where , is chosen as:
Here , while the gains , , are positive definite matrices.
Finally, the desired velocity of the CoM is chosen as:
where the gain matrices are positive definite positive is the output of the ZMP-CoM (15) controller and is the integrated signal. Finally, we add constraints on the maximum velocity
The above hierarchical control objectives can be cast into a whole-body optimization problem:
Since the decision variable is the robot velocity and the body velocity depends, through the Jacobian matrices, linearly on the optimization problem can be converted to the QP problem of the form:
The Hessian matrix and the gradient vector are evaluated from (16). The constraint matrix and vector and are obtained from (18). Using this formulation the optimization problem can be solved using a standard numerical QP solver.
It is important to notice that the outcome of (III-C) is (also) the robot joint velocity. When a robot velocity controller is available, one can set these joint velocities to the low level robot controller. In this case, the * quantities in (III-C) can be evaluated by using robot sensor feedback, and the robot is said to be velocity controlled. On the other hand, if the robot velocity control is not available, one may integrate the outcome of (III-C) to obtain desired joint position to be set to a low level robot position controller. In this case, the * quantities in (III-C) can be evaluated by using the desired integrated quantities instead of sensor feedback, and (III-C) behaves as an inverse kinematics module, and the robot is said to be position controlled.
In this section, we present experiments obtained with several implementations of the control architecture shown in Fig. 2. We use the iCub humanoid robot  to carry out the experimental activities. Let us recall that iCub is tall, with a foot length and width of and , respectively.
The control architecture runs on the iCub head’s computer, namely a 4-th generation Intel ® Core i7 @ . In any of its implementations, the architecture takes (in average) less than for evaluating its outputs. OSQP  library was used to solve the optimization problems.
Tab. I summarizes the maximum velocities achieved using the different implementations of the control architecture. In particular, the labels instantaneous and predictive mean that the associated layer generates its outputs considering inputs and references either at the single time or for a time window, respectively. The labels, velocity and position control, instead, mean that the layer outputs are either desired joint velocities or position, respectively – see Sec. III-C1.
Let us remark that all the implemented control architectures exploit the controller presented in section III-B3 to attempt the stabilization of the desired center-of-pressure and desired center-of-mass position and velocity. The performances of this controller much depend on the gains and . In particular, we observed that the gains achieving good tracking during standing and walking were not the same. For this reason, we implemented a gain-scheduling technique depending on the robot is walking or standing. The transition between the two sets of gains is smoothed with a minimum jerk trajectory .
Comparing different control architectures, however, is a far cry from being an easy task. To do so, we decided to perform two main experiments only, which are used as benchmarks for all control architecture implementations. These two experiments represent the maximum robot velocity that has been achieved with all architectures, and the maximum velocity achieved with a specific architecture only – see Tab. I. Namely,
a forward robot speed of ;
a forward robot speed of .
In this section, we compare the control laws (7) and (III-B2), which both generate a (desired) center-of-pressure that attempt at stabilizing a desired DCM. To simplify the comparison, the controller of the whole-body QP layer is kept fixed in this section, and we show and discuss only the results when the robot is position controlled. The time horizon of the predictive control is .
Figs. 3(a) and 3(d) depict the DCM tracking performances obtained with the instantaneous and predictive controllers, respectively. Both controllers seem to show good tracking performances, and the DCM error is kept below in both cases. Note that the instantaneous controller induces, however, faster variations of the measured DCM. This contributes to overall higher vibrations of the robot. One of the reasons for this variation is that the instantaneous controller (7) injects a (desired) center-of-pressure proportional to the measured DCM , which in turns contains the center-of-mass velocity. To mitigate this, we suggest to filter joint velocities appropriately. In our case, however, joint velocities were not filtered to avoid delays in the measured DCM. Our experience showed that adding a filter on joint velocities is not an easy task, and we did not find the right trade off for obtaining overall performance improvements.
Figs. 3(b) and 3(e), depict the CoM tracking performances, which are mainly dependent on the ZMP-CoM controller (15). This controller receives desired DCM values from the simplified model control layer, which are obtained with either the instantaneous or predictive controllers. In both cases, CoM tracking performances are good, and the CoM error is kept below .
Figs. 3(c) and 3(f) depict the ZMP tracking performances, which are still mainly dependent on the ZMP-CoM controller (15). It is important to observe that the desired ZMP is smoother when the simplified model control uses the predictive law (III-B2) to generate it. This is a tunable property that depends on the associated weight in the cost function of the MPC problem. Although this smoother behaviour does contribute to less robot vibrations, the overall robot performance became less reactive and, consequently, less robust to robot falls. Although the extensive hand-made tuning, we were not able to increase the robot velocity when the simplified model control used the predictive law (III-B2).
At a robot desired walking speed of , there is initially no significant difference between the DCM tracking obtained with instantaneous and predictive control laws – see Fig. 4(d) and 4(a) for . However, fast robot walking velocities require fast variations of the desired CoM and ZMP. In the case of the predictive controller, this fast variation induces a not-very-good performance of the desired ZMP around – see Fig. 4(f). Clearly, these bad performances in turn induce a bad tracking of the DCM shown in Fig. 4(d) at , and consequently a robot fall. At this point, one is tempted to increase the gain of the controller (15), which shall induce a better tracking of the ZMP. This unfortunately gives raise to higher robot oscillations, which in turn degrades the ZMP-CoM tracking, and still a robot fall.
In a nutshell, the predictive simplified control is much less robust than the instantaneous simplified control with respect to ZMP tracking errors. We suggest to filter the measurements from force sensors to obtain less noisy ZMP measurement, which should allow one to increase the gains for ZMP tracking. In our case, adding filters led to slower system response and, consequently, a robot fall.
In this section, we compare the performances of the three layer architecture when the robot is either position or velocity controlled – see Sec. III-C1 for the meaning of these modes. For comparison purposes, the simplified model control is achieved via the instantaneous control (7). Also, to emphasize the comparison, we stress the importance of the tracking of the desired feet positions.
Figs. 5(a) and 5(c) depict the tracking of desired left foot positions when the robot is either position or velocity controlled, respectively. The position controller ensures better tracking performance than the velocity one. One is then tempted to increase the gains of the * quantities (19), which should increase the tracking performances of the velocity control. However, we observed that the noise due to numerical derivative is harmful for the overall performance, and makes the robot shake and then fall. Again, filtering the taken joint measures may be helpful, but it introduced a delay that did not allow us to increase the walking speed.
The aforementioned foot position tracking problem worsens at higher walking velocity. Fig. 5(b) shows that the feet tracking error is lower than on the axis and on the one for position control. Instead, the velocity control in Fig. 5(d) keeps the error always lower than on the component and on the direction.
This paper contributes towards the benchmarking of different implementations of state-of-the-art control architectures for humanoid robots locomotion. In particular, the control architecture is composed of three layers, which all exploit the concept of the Divergent Component of Motion. The three layers are here called: trajectory optimization, simplified model control, the whole-body QP control. A key feature of this paper is that we compare walking results obtained with predictive and instantaneous controllers for the simplified model layer. Also, we compare position and velocity robot control modes for the whole-body QP control layer. We show that instantaneous controllers combined with robot position control allowed us to achieve a desired walking speed of , which is the highest walking velocity ever achieved for the iCub humanoid robot.
As future work, we plan to extend the proposed benchmarking to torque-control algorithms (e.g. ), and to propose architecture implementations allowing the robot walking on inclined planes. Another interesting future work is the implementation of a dynamic footstep planner. Indeed, currently, the footsteps are generated independently from the state of the robot. A dynamic planner can plan the footprints according to the external disturbance acting on the humanoid, i.e. an unknown contact force acting on the robot.