Online DCM Trajectory Generation for Push Recovery of Torque-Controlled Humanoid Robots

09/23/2019 ∙ by Milad Shafiee, et al. ∙ 0

We present a computationally efficient method for online planning of bipedal walking trajectories with push recovery. In particular, the proposed methodology fits control architectures where the Divergent-Component-of-Motion (DCM) is planned beforehand, and adds a step adapter to adjust the planned trajectories and achieve push recovery. Assuming that the robot is in a single support state, the step adapter generates new positions and timings for the next step. The step adapter is active in single support phases only, but the proposed torque-control architecture considers double support phases too. The key idea for the design of the step adapter is to impose both initial and final DCM step values using an exponential interpolation of the time varying ZMP trajectory.This allows us to cast the push recovery problem as a Quadratic Programming (QP) one, and to solve it online with state-of-the-art optimisers. The overall approach is validated with simulations of the torque-controlled 33 kg humanoid robot iCub. Results show that the proposed strategy prevents the humanoid robot from falling while walking at 0.28 m/s and pushed with external forces up to 150 Newton for 0.05 seconds.



There are no comments yet.


page 1

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

Bipedal locomotion of humanoid robots remains an active research domain despite decades of studies in the subject. The complexity of the robot dynamics, the unpredictability of its surrounding environment, and the low efficiency of the robot actuation system are only a few issues that complexify the achievement of robust robot locomotion. In the large variety of robot controllers implemented for bipedal locomotion, the Divergent-Component-of-Motion (DCM) is an ubiquitous concept used for generating walking patterns. In case of unforeseen disturbances acting on the robot, it is necessary to adjust the generated pre-planned patterns from preventing the robot to fall. This paper contributes towards this direction by presenting algorithms and architectures for achieving bipedal humanoid robot with real-time push recovery features.

A popular approach to humanoid robot control during the DARPA Robotics Challenge was to define a hierarchical architecture consisting of several layers [feng2015optimization]. Each layer generates references for the layer below by processing inputs from the robot, the environment, and the outputs of the previous layer. From top to bottom, these layers are here called: trajectory optimization, simplified model control, and whole-body quadratic programming (QP) control.

The trajectory optimization layer often generates desired foothold locations by means of optimization techniques. To do so, both kinematic and dynamical robot models can be used [dai2014whole, herzog2015trajectory]. When solving the optimization problem associated with the trajectory optimization layer, computational time may be a concern especially when the robot surrounding environment is not structured. There are cases, however, where simplifying assumptions on the robot environment can be made, thus reducing the associated computational time. For instance, flat terrain allows one to view the robot as a simple unicycle [PascalHandbook, flavigne2010reactive], which enables quick solutions to the optimization problem for the walking pattern generation [8594277].

Fig. 1: iCub reacts to the application of an external force (the red arrow).

The simplified model control layer is in charge of finding feasible center-of-mass (CoM) trajectories and is often based on simplified models, such as the Linear Inverted Pendulum Model (LIPM) [Kajita2001] and the Capture Point (CP) [Pratt2006, Hof2008]

. These models have become very popular after the introduction of the Zero Moment Point (ZMP) as a stability criterion 

[Vukobratovic1969]. Another model that is often exploited in the simplified model control layer is the Divergent Component of Motion (DCM) [Englsberger2015]. The DCM can be viewed as the extension of the capture point (CP) concept to the three dimensional case.

The whole-body QP control layer generates robot’s joint torques depending on the available control modes of the underlying robot. These outputs aim at stabilizing the references generated by the previous layers. It uses whole-body kinematic and/or dynamical models, and very often instantaneous optimization techniques, namely, no MPC methods are here employed. Furthermore, the associated optimisation problem is often framed as a hierarchical stack-of-tasks, with strict or weighted hierarchies [Stephens2010, nava16].

In recent years, several attempts have been made for considering robot state feedback for footprints planning. The new foothold locations can be optimized assuming constant step timing [feng2016robust, Shafiee-Ashtiani2017, joe2018balance] or adaptive step duration [khadiv2016step, griffin2017walking]. These methods have the drawback to consider a constant Zero Moment Point (ZMP) during the single support phase. Attempts at relaxing this assumption involve the control of the Centroidal Momentum Pivot (CMP) position [jeong2019robust1], but the pre-planned ZMP is still not used during push recovery. The possibility of having a non-constant pre-planned ZMP during single support is fundamental for achieving toe-off motion: it indeed contributes towards humanoid robots with energy efficient and human-like walking [kuo2002energetics, ogura2006human, Englsberger2014].

This paper extends and encompasses the control architecture [8594277, romualdi2018benchmarking] by complementing it with the push recovery feature. In particular, the proposed methodology fits control architectures where the DCM is planned beforehand, and adds a step adapter to adjust the planned trajectories and achieve push recovery. Assuming the robot in a single support state, the step adapter generates new positions and timings for the next step. The step adapter is active in single support phases only, but the proposed torque-control architecture considers double support phases too. The key idea for the design of the step adapter is to impose both initial and final DCM step values using an exponential interpolation of the time varying ZMP trajectory. This allows us to cast the push recovery problem as a QP one, and to solve it online with state-of-the-art optimisers. The approach is validated with simulations of the torque-controlled humanoid robot iCub. Results show that the proposed strategy prevents the robot from falling while walking at and pushed with external forces up to 150 Newton for 0.05 seconds applied at the robot pelvis.

The paper is organized as follows. Sec. II introduces notation, the humanoid robot model, and some simplified models used for locomotion. Sec. III describes each layer of the control architecture, namely the trajectory optimization, the simplified model control and the whole-body QP torque control layer. Sec. IV details the the step timing and position adaptation algorithm while Sec. V presents the simulation results on the iCub humanoid robot [Natale2017] in push recovery scenarios. Finally, Sec. VI concludes the paper.

Ii Background

Ii-a Notation

  • and denote the identity and zero matrices;

  • denotes an inertial frame;

  • is the vector from the origin of

    to the origin of expressed with respect to the orientation;

  • given and , , where is the homogeneous transformations and is the rotation matrix;

  • given the hat operator is , where

    is the set of skew-symmetric matrices and

    . is the cross product operator in , in this paper the hat operator is also indicated by ;

  • given the vee operator is ;

  • denotes the angular velocity between the frame and the frame expressed in the frame ;

  • the subscripts , , and indicates the frames attached to the torso, left foot, right foot and CoM;

  • For the sake of clarity, the prescript will be omitted.

Ii-B Humanoid Robot Model

A humanoid robot is modelled as a floating base multi-body system composed of links connected by

joints with one degree of freedom. Since none of the robot frames has an a priori pose with respect to (w.r.t.) the inertial frame

, the robot configuration is completely defined by considering both the joint positions and the homogeneous transformation from the inertial frame to the robot frame (i.e. called base frame ). In details, the configuration of the robot can be uniquely determined by the a triplet . The velocity of the floating system is represented by the triplet . Given a frame attached to a link of the floating base system, its position and orientation w.r.t. the inertial frame is uniquely identified by an homogeneous transformation, .

Similarly, the frame velocity w.r.t. the inertial frame is uniquely identified by the vector . The function that maps to the twist is linear and its matrix representation is the well known Jacobian matrix :


For a floating based system, the Jacobian can be split into two submatrices, one multiplies the base velocity while the other the joint velocities. From (1), one has:


The dynamics of the floating base system can be described by the Euler-Poincaré equation [Marsden2010]:


where represents the mass matrix, is the Coriolis, the centrifugal term and the gravitational term. On the right-hand side, is a selector matrix, is the vector containing the joint torques and is a vector containing the coordinates of the contact wrench. indicates the number of contact wrenches. Henceforth, we assume that at least one of the links is in contact with the environment, i.e. .

The centroidal momentum of the robot is the aggregate linear and angular momentum of each link of the robot referred to the robot center of mass (CoM). The centroidal momentum of the robot is related to the robot velocity through the centroidal momentum matrix as [orin2008centroidal]


The centroidal momentum time derivative is given by:


where is called centroidal bias acceleration. It is worth to recall that the rate of change of the centroidal momentum of the robots can be also expressed by using the external contact wrenches acting on the system as [nava16]


Ii-C Simplified models

The motion of the robot can be viewed considering the linear momentum dynamics only


where represents the position of the CoM and the linear component of the external force acting on the system and the rate of change of the linear centroidal momentum.

Now, the DCM is given by [Englsberger2015]:


where , in case of constant CoM height , is the pendulum time constant, i.e. . Using (7) and (8), the DCM time derivative is given by:


By introducing the virtual repellent point (VRP) as


Eq. (9) can be simplified as:

Fig. 2: The control architecture is composed of three layers: the trajectory optimization, the simplified model control, and the whole-body QP control.

Iii Torque-control DCM based architecture

This section summarizes a torque-control DCM based architecture for humanoid robots walking without the push recovery feature – see Fig. 2. The push recovery feature is added using the algorithm presented in Sec. IV.

The control architecture is composed of three main layers. The first is the trajectory optimization, whose main purpose is to generate desired footstep positions and orientation and also the desired DCM trajectory using the robot state. The second layer employs simplified robot models to track the desired DCM trajectory. The third layer is given by the whole-body QP block. It has the purpose of ensuring the tracking of the desired feet positions and orientations and also the CoM acceleration by generating joint torques.

Iii-a Trajectory Optimization layer

The objective of the trajectory optimization layer is to evaluate the desired feet and DCM trajectories. In this work, we divide the trajectory optimization problem into two sub-tasks. The former aims to generate the nominal footprints and DCM trajectory without taking into account the current status of the robot. Then the step adjuster module adjusts the aforementioned quantities by considering the error between the nominal and current DCM of the robot.

In the trajectory optimization layer, we assume that the humanoid robot walks with a constant height between the CoM and the stance foot. This assumption allows us to simplify the DCM planning problem by considering only the projection of the DCM on the walking surface, while the height of the DCM remains constant and equal to the CoM height. Furthermore, by assuming a constant centroidal angular momentum, the projection of the VRP on the walking surface coincides with the ZMP [Englsberger2014]. As a consequence, in this section we consider only the DCM planar dynamics.

Before generating the desired nominal trajectories, the nominal footstep positions have to be planned. For this, the humanoid robot is approximated as a unicycle, and the feet are represented by the unicycle wheels [8594277]. By sampling the continuous unicycle trajectory, it is possible to associate each unicycle pose to a time instant . This time instant can be considered as the time in which the swing foot impacts the ground. Once the impact time is defined, we decide to use it as a conditional variable to plan robot feasible footsteps, i.e. too fast/slow step duration and too long/short step length are avoided. Once the footsteps are planned, the feet trajectory is obtained by cubic spline interpolation.

The nominal footsteps position is also used to plan the nominal DCM trajectory. In details, the DCM is chosen so as to satisfy the following time evolution:


where indicates the -th steps, is the initial position of the DCM, is placed on the center of the stance foot and has to belong to the step domain . Assuming that the final position of the DCM coincides with the ZMP at last step, (12) can be used to to define the following recursive algorithm for evaluating the DCM trajectory [Englsberger2014]:


As already pointed out in [Englsberger2014], the presented DCM planner has the main limitation of taking into account single support phases only. Indeed, by considering instantaneous transitions between two consecutive single support phases, the ZMP reference is discontinuous. This leads to the discontinuity of the external contact wrenches and consequentially of the desired joint torques. The development of a DCM trajectory generator that handles non-instantaneous transitions between two single support phases becomes pivotal [Englsberger2014, Englsberger2019]. In the following, we decide to implement the solution proposed by Englsberger in [Englsberger2014]. In order to guarantee a continuous ZMP trajectory, the desired DCM trajectory must belong at least to class. This can be easily guaranteed by smoothing two consecutive single support DCM trajectory by means of a third order polynomial function whose coefficients are chosen in order to satisfy the boundaries conditions, i.e. initial and final DCM position and velocities.

Iii-B Simplified model control

The main goal of the simplified model control layer is to implement a control law based on the simplified model of the humanoid robot (see Section II-C) to stabilize the unstable DCM dynamics (11).

To guarantee the tracking of the desired DCM trajectory the following control law is chosen:


By using the control input (14) in (11), one has:


The choice of guarantees an asymptotically stable close loop error dynamics.

Iii-C Whole-body QP control

The whole-body QP control layer aims at the tracking of kinematic and force quantities. The proposed whole-body controller computes the desired joint torques using the robot dynamics (3), where the robot acceleration and the contact wrenches are set to desired starred quantities, i.e.:


What follows evaluates desired generalized robot acceleration and contact wrenches that are compatible with the constraints acting on the system. Then, the torques (16) do not break the contacts the robot makes with the environment.

Iii-C1 Desired generalized robot acceleration

the desired generalized robot acceleration are chosen to track the CoM acceleration, the torso orientation and the left and right feet position and orientation. To do so, we use a stack of task approach, which plays the role of a prioritized inverse kinematics in the acceleration space. The tracking of the feet trajectory and the CoM acceleration are considered as high priority tasks, while the torso orientation as well as a postural task are considered as low priority tasks. The control objective is achieved by designing the controller as a constrained optimization problem with a cost function as:


where indicates the weighted norm of the vector . The tracking of the desired torso orientation is attempted via the first term on the right hand side of (17), with the desired torso angular acceleration . By choosing this desired acceleration properly, it is possible to guarantee almost-global stability and convergence of to [Olfati-Saber2000], i.e.


where , and are positive numbers. The postural task is encoded in (17) thanks to the second term, which tends to penalise high joint error by imposing


where and are positive definite matrices. Concerning the tracking of the feet and the CoM, we have:


When the foot is in contact, the desired acceleration is zero. During the swing phase, the angular part of is given by (18) where the subscript is substitute with , while the linear part is equal to:


Here the gains are again positive definite matrices. The CoM task is achieved by asking for a desired acceleration


where the desired VRP is computed by the simplified model control layer – see (14). It is also worth noting that when the CoM is considered in (20), the term is the linear part of the centroidal momentum matrix scaled by the robot mass (4). The above hierarchical control objectives can be summarized into a whole-body optimization problem:


The decision variable is the robot acceleration . Since the acceleration depends linearly on , the optimization problem can be cast into a QP one and solved efficiently.

Iii-C2 Desired contact wrenches

the desired contact wrenches are computed to track the desired centroidal momentum. To do so, we developed an optimization problem where the unknown variable is a stacked vector of contact forces . More precisely, the main objective of the optimization problem is to minimize the following cost function:


The tracking of the desired centroidal momentum rate of change is attempted thanks to the first term of (24). The desired linear momentum rate of change is computed from the desired VRP – see (14) as


On the other hand, the desired rate of change of the angular momentum is computed by using the Eq. (5), where is substitute with the desired robot acceleration evaluated by solving the optimization problem (III-C1). The last term of  (24) is used to minimize the required contact wrenches. The aforementioned optimization problem can be summarized as:


The decision variable is the contact wrenches . Since the rate of change of the centroidal momentum depends linearly on , the problem (26a) can be cast into a QP one.

Once the desired robot acceleration and the desired contact wrenches are computed, the desired joint torques can be easily evaluated by using the system dynamics (16).

Iv Step Adapter

Fig. 3: The trajectory optimization block in Fig. 2 is composed of three sub-modules: the footstep planner, the DCM planner, and the Step Adapter.

This section complements the architecture presented in Sec. III – and shown in Fig. 2 – with a push recovery feature. As sketched in Fig. 3, the trajectory optimisation layer is augmented with a block called step adapter to provide the architecture with the push recovery feature.

More precisely, assuming the robot on a single support state, we propose below a step adjustment strategy that optimises the next step position and timing while attempting to follow a nominal DCM trajectory. In particular, we define an exponential interpolation for the ZMP trajectory as:


where are chosen to satisfy the ZMP boundary conditions, i.e. , i.e.:


with the duration of the single support phase, and


By substituting (28) into the DCM dynamics (11

), the following ordinary differential equation (ODE) holds:


The solution to (31) writes:


where is the vector of unknown coefficients that can be found by imposing the boundary conditions. Therefore, we can find these coefficients by solving the problem (32) either as an initial value problem, namely


or as a final value problem:


To find a DCM trajectory that satisfies both the initial and the final condition problems, the coefficient must equal . Thus, by combining (33) and (34), one has:


Now, define . Then, in view of (30), substituting (29) to (35) yields:


Let denote the ZMP position at the beginning of next step, and the DCM offset. Therefore, straightforward calculations lead to:


The step adjustment problem can be formalized as a constrained optimization problem where the search variables are , and , and the cost function be properly defined to follow nominal trajectory values. It is worth noting that the desired final DCM position and step timing depend on and , respectively. Meanwhile, is assumed to be at the center of the foot at the beginning of next step. Thus, we can assume this position to be considered as a target for the next footstep position.

The following cost function is chosen to yield the desired gait values as close as possible to the nominal ones:


where , , are positive numbers and the next ZMP position , step duration and next DCM offset are obtained from the nominal trajectory generator.

We also introduce the following inequality constraints:


where and . The inequality constraints are defined based on the maximum step length and minimum step duration that are related to the leg kinematic constraints and maximum realizable velocity, respectively. Finally, the relation described in (37) is treated as an equality constraint. It is worth to notice that the linearity of (37) is ensured by the specific choice of (28), and this choice also guarantees the boundedness of the ZMP trajectory. Since the cost function and the constraints depend quadratically and linearly on the unknown variables, all the above can be casted as a QP problem. The QP problem is solved at each control cycle by substituting with the current DCM position and by progressively shrinking the single support duration as the robot completes the step.

Once the footstep position and the step timing are evaluated as solution to the QP problem, a new DCM trajectory is generated with the algorithm presented in section III-A.

Fig. 3 sketches the overall architecture for step recovery. The foot step planner generates the nominal foot step position and timing, which are inputs of the DCM planner. The DCM planner generates the nominal DCM trajectory and foot position. The step adapter module combines nominal values (foot and DCM trajectories) and measurements (actual DCM) to evaluate the adapted feet trajectories, footstep position and timing. The former are as input of the whole-body QP control layer, while the latter are retrieved by the DCM planner to evaluate the adapted DCM trajectory.

V Results

In this section, we test the step adjustment approach alongside the architecture presented in Sec III. Although the methodology presented above encompasses time-varying ZMP trajectories, the tests presented in this section are for fixed ZMPs during the single support phase.

More precisely, we present simulation results obtained with the implementation of the control architecture shown in Fig. 2 3. We simulate the iCub humanoid robot [Metta2010] using Gazebo simulator [gazeboSim]. Let us recall that iCub is tall humanoid robot. Its weight is and it has a foot length and width of and , respectively.

The architecture takes (in average) less than for evaluating its outputs: qpOASES [Ferreau2014] was used to solve the QP problems. The control modules run on the iCub head’s computer, i.e. a 4-th generation Intel ® Core i7 @ .

To validate the performances of the proposed architecture, we present two main push recovery experiments. First, the robot walks straight and an external disturbance acts on the pelvis to the lateral direction – red arrows in Fig 3(a). Secondly, the robot follows a circular path and an external push is exerted on the pelvis as shown in Fig. 3(b). In both scenarios, the maximum straight velocity is , and the external force has a magnitude of and lasts . The performances of the control architecture are shown in the accompanying video111Video link:

V-a Straight walking scenario

Fig. 5 depicts the feet and DCM trajectory. The instant at which the disturbance acts on the robot is indicated by green vertical lines. The step adapter compensates the disturbance effect by step timing and position optimization. The controller extends the step width with an average of , as shown in the Fig. 3(a). It reduces the step timing with an average of with respect to the nominal value of and, as depicted in the Fig. 6, the foot trajectory is adapted and step timing is decreased and the foot touches the ground sooner with respect to the nominal values.

Fig. 4: Foot prints for straight (a) and circle path (b) push recovery scenario. The push is indicated by red arrows.
Fig. 5: Trajectory of feet and DCM for lateral push recovery scenario. The instance that push exerts on the robot is indicated by green vertical lines.

Fig. 3(a) shows in green and orange the position of the adapted footstep and the nominal one, respectively. If no external disturbances act on the robot, the position of the adjusted footsteps coincides with the nominal one, hand the footstep is represented by a gray rectangle.

V-B Circular path following scenario

In this scenario, the step adaptation algorithm compensates the external disturbance by extending the step length with an average of , and is applied for circular path walking.

In Fig. 3(b), the four disturbances exerted on the robot are indicated with red arrows. The nominal footprints corresponding to those instances are shown in orange. All the adjusted footprints are shown in green. When the there is no disturbance acting on the robot, the footprints generated by the step adjustment algorithm coincides with the nominal one, thus are represented by using a gray rectangle. Since the push has components in both and coordinates, it activates a step adjustment both in sagittal and lateral directions.

Vi Conclusion and Future Work

This paper contributes towards robust bipedal locomotion of humanoid robots against external disturbances. The proposed approach fits control architecture where the DCM is planned beforehand, and allows a time-varying ZMP trajectory during the single support phase. Thus, applications of the proposed approach enable the heel to toe motions. The step adapter presented in this paper optimises step position, timing, and DCM trajectory, all cast in a single QP problem. The proposed algorithm is employed in a three-layer DCM based controller architecture and validated on the simulated version of the iCub humanoid robot. During the tests, the robot undergoes perturbations in the form of external forces with magnitude of , applied continuously for . Because of such disturbances, the presented algorithm determines suitable modifications to the pre-planned trajectories allowing to maintain balance. As a future work, we plan to validate this architecture also on the real robot.

The efficiency of the presented algorithm relies on simplified DCM models. The cost of such choice is that the balancing capabilities of the robot may not be fully exploited, abd larger perturbations may require excessive modifications to the nominal trajectories. Hence, future work may consider also the robot angular momentum and its full kinematics. Another possible improvement would consist in adopting the push recovery strategy also during the double support phase.

Fig. 6: The z trajectory of the swing foot during step adjustment