Control of humanoid robots is challenging due to limiting constraints on contact forces and nonlinear switching dynamics. Furthermore, guaranteeing safety for humanoids is critical, as colliding with the environment or falling can cause severe damage to the robot and its surroundings. Linear MPC is a powerful tool for designing real-time feedback controllers subject to state and input constraints, which makes it a prime candidate for generating a wide range of feasible reference walking motions for humanoid robots [25, 9, 22]. However, the theoretical guarantees associated with MPC (e.g., constraint satisfaction guarantees) can easily be lost due to external disturbances or the discrepancy between the nonlinear dynamics of the robot and the linearized model used in control.
Recently, [3, 7] studied how to account for the bounded error in constraint satisfaction due to the approximation of the nonlinear center of mass (CoM) dynamics, and investigated nonlinear constraints due to step timing adaptation. The major drawbacks of these approaches are that they do not account for the closed-loop tracking errors due to disturbances, and that they provide no robustness guarantees of constraint satisfaction in the presence of different disturbances, which is critical for generating safe walking motions.
Linear Robust MPC (RMPC) schemes have been extensively studied in the control literature [14, 5, 16]. Recently, the well-known tube-based RMPC approach was used for generating robust walking motions for humanoid robots, taking into account the effects of additive compact polytopic uncertainties on the dynamics. Using a state feedback control policy and a pre-stabilizing choice of static dead-beat gains, it was shown that constraints are guaranteed to be satisfied for all disturbance realizations inside the disturbance set. A drawback of RMPC is that the constraints are designed to accommodate the worst-case disturbance, which is quite conservative and sacrifices performance (optimality) to guarantee hard constraint satisfaction.
Stochastic MPC (SMPC), on the other hand, exploits the underlying probability distribution of the disturbance realizations. Furthermore, SMPC offers a flexible framework by accounting for chance constraints, where constraints are expected to be satisfied within a desired probability level. Depending on how critical the task is, the user can tune the desired probability level between the two extremes of almost hard constraint satisfaction (as in RMPC) and complete negligence of disturbances (as in nominal MPC). This flexibility is very practical, since a humanoid robot needs to move in dynamic environments where some of the constraints can be more critical than others. For example, when moving through a narrow doorway or walking in a crowd, the robot needs to reduce the sway motion of its CoM to reduce the probability of collision. However, for walking on challenging terrains with partial footholds, the robot has to bring the foot center of pressure (CoP) as close as possible to the center of the contact area. Many other tasks lie somewhere between those situations. To this end, SMPC can be a powerful and systematic tool for dealing with constraint satisfaction in different environments and tasks. Moreover, small errors are typically more likely to occur in practice. It might therefore be more appropriate to explicitly consider the distribution of disturbances instead of treating all of them equally as in RMPC, which often leads to very conservative behavior.
In this letter, we revisit the problem of generating reference walking motions for humanoid robots using a linear inverted pendulum model (LIPM) subject to additive uncertainties on the model. Our contributions are the following:
We introduce linear SMPC to generate stable walking, taking into account stochastic model uncertainty subject to individual chance constraints.
We analyze the robustness of SMPC to worst-case disturbances, drawing an interesting connection between robust and stochastic MPC, and highlighting their fundamental difference.
We compare SMPC, RMPC, and nominal MPC in terms of robustness (constraint satisfaction) and performance, empirically showing that, under bounded disturbances (which is the case in practice), SMPC can achieve hard constraint satisfaction while being significantly less conservative than RMPC.
II-A Notation

$x_k$ represents a variable at time $k$, with $x_{k+i|k}$ denoting the predicted value of the variable at the future time step $k+i$. $\oplus$ refers to the Minkowski set sum, and $\ominus$ refers to the Pontryagin set difference. A random variable $w$ following a distribution $\mathcal{Q}$ is denoted as $w \sim \mathcal{Q}$, with $\mathbb{E}[w]$ being the expected value of $w$.
II-B Linear model of walking robots
The dynamics of the CoM of a walking robot, under the assumption of rigid contacts with a flat ground, can be modelled as follows:
where $c$ denotes the CoM position in the lateral directions of motion $x, y$. The total mass of the robot is denoted by $m$, the matrix $R$ is a rotation matrix, with the center of pressure (CoP) $z$ being constrained inside the convex hull of the contact points.
Under the assumption of constant CoM height and constant angular momentum, the dynamics (1) can be simplified to the well-known Linear Inverted Pendulum Model (LIPM), resulting in the following linear relationship between the CoM and the CoP:

$\ddot{c}^{x,y} = \omega^2 (c^{x,y} - z^{x,y}),$

where $\omega = \sqrt{g^z / c^z}$ represents the system's natural frequency, $c^z$ being the constant CoM height, and $g^z$ being the norm of the gravity vector along $z$. From now on, we will drop superscripts for convenience.
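To make the model concrete, the LIPM admits an exact discretization over one sampling interval, obtained from the matrix exponential of its state matrix. The sketch below is a generic illustration; the CoM height and sampling time are hypothetical placeholders, not the values used in the paper:

```python
import math

def lipm_discrete(omega, T):
    """Exact discretization of the LIPM dynamics c_ddot = omega^2 * (c - z)
    over a sampling interval T, with state x = [c, c_dot] and input u = z
    (the CoP).  Returns (A, B) such that x_{k+1} = A x_k + B u_k."""
    ch, sh = math.cosh(omega * T), math.sinh(omega * T)
    A = [[ch, sh / omega],
         [omega * sh, ch]]
    B = [1.0 - ch, -omega * sh]  # input column, stored as a flat list
    return A, B

# Hypothetical parameters: 0.8 m CoM height, 9.81 m/s^2 gravity, 100 ms sampling
omega = math.sqrt(9.81 / 0.8)
A, B = lipm_discrete(omega, 0.1)
```

A quick sanity check: placing the CoM exactly on the CoP with zero velocity is an equilibrium of the LIPM, so propagating that state through (A, B) leaves it unchanged.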
II-C Nominal linear MPC for bipedal locomotion
Consider the discrete-time LTI dynamics (3) subject to state and control constraints:

$x_{k+1} = A x_k + B u_k, \quad x_k \in \mathbb{X}, \quad u_k \in \mathbb{U},$

where the state $x = [c, \dot{c}]^\top$ collects the CoM position and velocity, and the control input $u = z$ is the CoP. The constraint sets represent the linear kinematic constraints of the robot, such as self-collision avoidance and maximum stride length. MPC deals with solving the optimal control problem (OCP) at every sampling time as follows:
The control sequence along the prediction horizon is the minimizer of (7a) given the current initial condition. The above MPC scheme applies only the first control action of the optimal open-loop control sequence. We avoided using terminal constraints (e.g., capturability) in our comparison since, to the best of our knowledge, there is no systematic way of handling terminal constraints in SMPC as in nominal MPC and RMPC. One option for generating viable reference walking trajectories using the above MPC scheme without terminal constraints is to minimize one of the CoM derivatives by adding it to the cost function. With a sufficiently long horizon, a valid choice of the cost function in (7a) can be
where the first terms represent the desired walking direction and velocity of the robot, respectively; the desired CoP tracking position is usually chosen to be at the center of the support polygon for robustness; and the remaining parameters are user-defined weights.
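The OCP (7a) is a quadratic program in the control sequence. As a generic sketch of how such problems are commonly condensed (standard MPC machinery, not the authors' specific implementation), the stacked state prediction can be written as an affine function of the initial state and the stacked controls; the tracking costs above then become a quadratic cost in the controls alone. The model values below are hypothetical:

```python
import numpy as np

def condensed_prediction(A, B, N):
    """Stack x_{k+1} = A x_k + B u_k over an N-step horizon:
    X = S_x x0 + S_u U, with X = [x_1; ...; x_N], U = [u_0; ...; u_{N-1}]."""
    n, m = B.shape
    S_x = np.vstack([np.linalg.matrix_power(A, i + 1) for i in range(N)])
    S_u = np.zeros((N * n, N * m))
    for i in range(N):
        for j in range(i + 1):
            S_u[i * n:(i + 1) * n, j * m:(j + 1) * m] = (
                np.linalg.matrix_power(A, i - j) @ B)
    return S_x, S_u

# Hypothetical double-integrator-like model for illustration
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
S_x, S_u = condensed_prediction(A, B, N=3)
```

Substituting X = S_x x0 + S_u U into a quadratic tracking cost yields a QP in U alone; the state and CoP constraints then become linear inequalities on U.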
III Tube-based Robust MPC (RMPC)
Two tube-based linear RMPC versions were first introduced in  and . We follow the approach of  as it has been more commonly used in the control community, and recently in  for bipedal locomotion. Note, however, that our qualitative results and comparison with SMPC would still hold for the alternative formulation.
III-A Robust OCP formulation and control objective
Consider the following discrete-time LTI prediction model subject to an additive disturbance $w_k$:

$x_{k+1} = A x_k + B u_k + w_k.$
(Bounded disturbance) $w_k \in \mathbb{W}$ for all $k \geq 0$ is a disturbance realization, with $\mathbb{W}$ denoting a polytopic compact (closed and bounded) disturbance set containing the origin in its interior.
Consider the nominal state $s_k$ evolving as

$s_{k+1} = A s_k + B v_k \qquad (17)$

under the control action $v_k$. The main control objective of tube-based RMPC is to bound the evolution of the closed-loop state error $e_k = x_k - s_k$ using an auxiliary state feedback control law

$u_k = v_k + K e_k, \qquad (18)$
with $A_K = A + BK$ being Schur (eigenvalues inside the unit circle). The propagation of the closed-loop error dynamics (19), $e_{k+1} = A_K e_k + w_k$, converges to the bounded set

$\mathcal{Z} = \bigoplus_{i=0}^{\infty} A_K^i \mathbb{W}.$
Hence, the limit set of all disturbed state trajectories lies within a neighborhood of the nominal trajectory, known as a tube of trajectories. It is clear that if $\mathbb{W} = \{0\}$, then $\mathcal{Z} = \{0\}$ and the tube of trajectories collapses to a single trajectory, which is the solution of (17). In set theory, $\mathcal{Z}$ is called the minimal Robust Positive Invariant (mRPI) set, or infinite reachable set. We recall some standard properties of disturbance invariant sets that will be used to design tightened sets of state and control constraints in the next subsection.
i.e., if $e_k \in \mathcal{Z}$, then $e_{k+1} = A_K e_k + w_k \in \mathcal{Z}$ for all $w_k \in \mathbb{W}$. In simple words, once the error is driven to $\mathcal{Z}$, it will remain inside $\mathcal{Z}$ for all future time steps when subject to the bounded disturbance.
An outer approximation of the mRPI set $\mathcal{Z}$ can be computed using a well-known iterative approach. The size of $\mathcal{Z}$ depends on the system's eigenvalues, the choice of $K$, and $\mathbb{W}$.
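For intuition, when the pre-stabilizing gains are dead-beat (so that the closed-loop matrix A_K is nilpotent, A_K^2 = 0 for a two-state system), the infinite Minkowski sum defining the mRPI set truncates to Z = W ⊕ A_K W. The sketch below computes an axis-aligned outer bound of Z for a box disturbance set; the matrix and half-widths are hypothetical illustration values:

```python
def mrpi_box_bound(AK, w):
    """Axis-aligned outer bound of the mRPI set Z = W (+) A_K W, valid when
    A_K is nilpotent (A_K^2 = 0) and W = prod_i [-w_i, w_i] is a box.
    Returns per-axis half-widths of the bounding box of Z."""
    n = len(w)
    return [w[i] + sum(abs(AK[i][j]) * w[j] for j in range(n))
            for i in range(n)]

# Hypothetical nilpotent closed-loop matrix and box disturbance half-widths
AK = [[0.0, 1.0], [0.0, 0.0]]
bound = mrpi_box_bound(AK, [0.1, 0.2])
```

For a general (merely Schur) A_K the sum does not truncate, which is why an iterative outer approximation is needed instead.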
III-B State and control back-off design
Using the mRPI set $\mathcal{Z}$ and the stabilizing feedback gains $K$, the state and control constraint sets are tightened as

$\bar{\mathbb{X}} = \mathbb{X} \ominus \mathcal{Z} \;\text{(22a)}, \qquad \bar{\mathbb{U}} = \mathbb{U} \ominus K\mathcal{Z} \;\text{(23a)}.$
The new tightened state and control constraint sets are often called backed-off constraints. Satisfying the backed-off constraints (22a)-(23a) using the control law (18) ensures the satisfaction of (15b)-(16b).
Following the choice of dead-beat pre-stabilizing feedback gains proposed in prior work, we get $A_K^2 = 0$, which allows us to compute $\mathcal{Z} = \mathbb{W} \oplus A_K \mathbb{W}$ exactly (whereas usually the mRPI set needs to be approximated using numerical techniques). The dead-beat gains are also a practical choice, since they lead to the smallest control back-off $K\mathcal{Z}$.
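The tightening itself only requires support functions of Z (and of KZ) in the directions of the constraint normals. A minimal sketch for half-space constraints h^T x <= b and a box disturbance set, under the dead-beat assumption Z = W ⊕ A_K W (all numbers hypothetical):

```python
def box_support(h, w):
    """Support function of the box W = prod_i [-w_i, w_i] in direction h:
    max over w' in W of h . w' = sum_i |h_i| * w_i."""
    return sum(abs(hi) * wi for hi, wi in zip(h, w))

def state_backoff(h, AK, w):
    """Back-off of a half-space state constraint h^T x <= b under the mRPI
    set Z = W (+) A_K W (dead-beat case): support_Z(h) = support_W(h)
    + support_W(A_K^T h).  The tightened bound is b - state_backoff(...)."""
    n = len(h)
    hAK = [sum(h[i] * AK[i][j] for i in range(n)) for j in range(n)]  # A_K^T h
    return box_support(h, w) + box_support(hAK, w)
```

The control back-off on a constraint g^T u <= d follows the same pattern with direction K^T g, since the control backs off by the support of KZ.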
III-C Tube-based RMPC algorithm
The tube-based RMPC scheme solves the OCP in (14a) by splitting it into two layers:
MPC layer: computes feasible feedforward reference control actions at every MPC sampling time, subject to the backed-off state and control constraints, as follows
The above tube-based RMPC algorithm is often called open-loop (OL) MPC, since the initial state of the OCP is the nominal state rather than the current state of the system. It is guaranteed to be recursively feasible (i.e., if the OCP is feasible at the initial time, it will remain feasible for all future time steps). In other work, the current state of the system is used instead, which is referred to as closed-loop (CL) MPC. CL-MPC might lead to infeasibility of the OCP. The problem of recursive feasibility can be tackled by optimizing for the initial state online such that its difference with the nominal state is constrained to lie in the mRPI set, i.e., $x_k - s_k \in \mathcal{Z}$. This approach is named robust closed-loop (RCL) MPC.
IV Stochastic MPC With State And Control Chance Constraints (SMPC)
The main objectives of SMPC are to deal with computationally tractable stochastic uncertainty propagation for cost function evaluation, and to account for chance constraints, where constraints are expected to be satisfied within a desired probability level. With an abuse of notation, we will use some of the notations defined in Section III in a stochastic setting.
IV-A Stochastic OCP formulation and control objectives
Consider the following discrete-time LTI prediction model subject to an additive stochastic disturbance $w_k$:

$x_{k+1} = A x_k + B u_k + w_k.$
(Stochastic disturbance) $w_k$ for all $k \geq 0$ is a disturbance realization of independent and identically distributed (i.i.d.), zero-mean random variables with normal distribution $w_k \sim \mathcal{N}(0, \Sigma_w)$. The disturbance covariance $\Sigma_w$ is a diagonal matrix.
Eqs. (31a)/(32a) denote individual point-wise (i.e., independent at each point in time) linear state/control chance constraints with maximum probabilities of constraint violation $\epsilon_x$/$\epsilon_u$. Since the disturbed state in (30a) is now a stochastic variable, it is common to split its dynamics into two terms: a deterministic term $s_k$ and a zero-mean stochastic error term $e_k$, which, under the feedback policy $u_k = v_k + K e_k$ (36), evolve as

$s_{k+1} = A s_k + B v_k \;\text{(33a)}, \qquad e_{k+1} = A_K e_k + w_k \;\text{(34a)}.$
Notice that, in contrast to the closed-loop error evolution in RMPC (19), the propagation of the predicted error in SMPC is independent of the measured state due to the assumption of zero initial error, which implies a closed-loop approach. The evolution of the state covariance $\Sigma_k$ is given by

$\Sigma_{k+1} = A_K \Sigma_k A_K^\top + \Sigma_w,$

where $K$ is a fixed stabilizing dead-beat feedback gain (see Remark 1) for (30a), and $v_k$ is the decision variable of the MPC program. In what follows, we present a deterministic reformulation of the individual chance constraints (31a)-(32a) that will be used in the SMPC algorithm.
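The covariance recursion above can be sketched directly; starting from a zero matrix encodes the zero-initial-error (closed-loop) assumption. Pure-Python 2x2 matrices keep the example self-contained, and the disturbance covariance values are hypothetical:

```python
def propagate_covariance(AK, Sigma_w, N):
    """Propagate the predicted error covariance
    Sigma_{k+1} = A_K Sigma_k A_K^T + Sigma_w, with Sigma_0 = 0
    (zero initial predicted error).  2x2 matrices as nested lists."""
    def mul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]
    def add(X, Y):
        return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]
    def T(X):
        return [[X[j][i] for j in range(2)] for i in range(2)]

    Sigma = [[0.0, 0.0], [0.0, 0.0]]
    out = [Sigma]
    for _ in range(N):
        Sigma = add(mul(mul(AK, Sigma), T(AK)), Sigma_w)
        out.append(Sigma)
    return out
```

With dead-beat gains A_K is nilpotent, so the predicted covariance stops growing after two steps; with a merely Schur A_K it instead converges geometrically to a stationary covariance.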
IV-B Chance constraint back-off design
We seek the least conservative deterministic upper bound (back-off) such that, by imposing the tightened constraint, we can guarantee that (37) is satisfied. The smallest bound can then be obtained by solving linear independent chance-constrained optimization problems offline:
Using the disturbance assumption (2), one can solve such programs easily, since there exists a numerical approximation of the cumulative density function (CDF) of the normal distribution. Hence, the state back-off magnitudes can be computed using the inverse of the CDF of the corresponding random variable. Contrary to RMPC, the state back-offs grow along the horizon, taking into account the predicted evolution of the error covariance. Similarly, we reformulate the individual control chance constraints in (32a) using (33a)-(34a) and the control law (36):
The individual control constraint back-off magnitudes can be computed along the horizon using the inverse CDF of the corresponding random variable.
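Each deterministic back-off then reduces to a quantile of a scalar Gaussian. A minimal sketch, using the standard-normal inverse CDF from the Python standard library (the constraint normal and covariance values in the test are hypothetical):

```python
from statistics import NormalDist

def chance_backoff(h, Sigma, eps):
    """Back-off for an individual half-space chance constraint
    Pr(h^T e > eta) <= eps, where e ~ N(0, Sigma) is the predicted error:
    eta = Phi^{-1}(1 - eps) * sqrt(h^T Sigma h)."""
    n = len(h)
    var = sum(h[i] * sum(Sigma[i][j] * h[j] for j in range(n))
              for i in range(n))
    return NormalDist().inv_cdf(1.0 - eps) * var ** 0.5
```

Note that eps = 0.5 gives a zero back-off (nominal constraints), and the back-off grows with the predicted covariance at each step of the horizon.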
IV-C SMPC with chance constraints algorithm
The SMPC scheme with individual chance constraints computes feasible reference control actions at every MPC sampling time, subject to individual state and control backed-off constraints, as follows
Contrary to the RMPC algorithm discussed in the previous section, here we do not employ the linear feedback control law (36), because this SMPC algorithm works in closed loop. The linear feedback policy (36) is only used to predict the variance of the future error.
Contrary to RMPC, the above SMPC algorithm is not guaranteed to be recursively feasible, due to the fact that the disturbance realization is unbounded. To tackle this in practice, disturbance realizations are assumed to have a bounded support. There have been recent efforts on recursive feasibility for SMPC using different ingredients of cost functions, constraint tightening, and terminal constraints. However, recursive feasibility guarantees for SMPC are out of this paper's scope.
TABLE I: Robot model and disturbance parameters

|CoM height||
|gravity acceleration||
|foot support polygon along x direction||
|foot support polygon along y direction||
|bounded disturbance on CoM position||
|bounded disturbance on CoM velocity||
|disturbance std-dev of CoM position||
|disturbance std-dev of CoM velocity||
|MPC sampling time||
|MPC receding horizon||
V Worst-case Robustness of SMPC
SMPC ensures constraint satisfaction with a certain probability, while RMPC ensures it under bounded disturbances. When comparing the two approaches, one could think that SMPC is equivalent to bounding stochastic disturbances inside a confidence set and then applying RMPC. This section clarifies that this is not the case. In particular, we answer the following question: when using SMPC, what are the bounded disturbance sets under which we can still guarantee constraint satisfaction? Considering a single inequality constraint and hyper-rectangle disturbance sets, we show how to compute the size of such sets, and that they shrink along the control horizon. Since the disturbance set is instead fixed in RMPC, we conclude that the two approaches are fundamentally different.
Consider a single chance constraint (we drop the subscripts for convenience). Given the corresponding back-off magnitude (Section IV-B), we seek the maximum hyper-rectangle disturbance set such that the constraint is satisfied for any disturbance realization inside it:
This problem has a simple solution
where $|\cdot|$ denotes the element-wise absolute value. From the SMPC derivation we know that the back-off is computed via the inverse CDF of a normal distribution, which returns a value proportional to its standard deviation. Therefore we can write
Solving for the disturbance bound has infinitely many solutions. However, we can get a unique solution by imposing a ratio between the two components of the bound as follows:
Substituting back in (49) and solving for the bound, we get:
Since the sum of the squares of positive values (the numerator) is always smaller than the square of their sum (the denominator), the ratio is smaller than one. Moreover, both series are convergent. This shows that, as the time step grows, the bound decreases, and so does the disturbance set. We conclude that, when using SMPC, the disturbance sets under which we have guaranteed constraint satisfaction shrink along the control horizon.
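A scalar special case makes the shrinking explicit. Assume, purely for illustration, an integrator error e_{k+1} = e_k + w_k with i.i.d. w_k of standard deviation sigma: the SMPC back-off at step k scales as sigma * sqrt(k), while a constant worst-case disturbance of size wbar accumulates linearly as k * wbar, so the largest wbar still covered by the back-off decays as 1/sqrt(k):

```python
from statistics import NormalDist

def max_robust_disturbance(sigma, eps, N):
    """Scalar illustration: for e_{k+1} = e_k + w_k, the back-off at step k
    is eta_k = Phi^{-1}(1 - eps) * sigma * sqrt(k), while a constant
    disturbance wbar accumulates to k * wbar.  Equating the two gives the
    largest wbar with guaranteed satisfaction at step k: wbar_k = eta_k / k."""
    q = NormalDist().inv_cdf(1.0 - eps)
    return [q * sigma / k ** 0.5 for k in range(1, N + 1)]
```

The returned bounds decrease monotonically with k, mirroring the conclusion above that the disturbance sets covered by SMPC shrink along the horizon.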
VI Simulation Results
In this section, we present simulation results of generated walking motions for a biped robot subject to additive persistent disturbances on the dynamics. We compare the motions generated using SMPC subject to state and control chance constraints against nominal MPC and tube-based RMPC. Without loss of generality, we apply the disturbances in the lateral direction of motion, and constrain the lateral CoM position with a half-space constraint, which accounts for a collision that the robot needs to avoid. However, the same machinery is applicable to any half-space linear constraint. We aim to compare robustness w.r.t. constraint violations and performance of SMPC against tube-based RMPC and nominal MPC subject to different disturbance realizations. The robot model and disturbance parameters are defined in Table I.
VI-A Nominal constraint satisfaction in nominal MPC vs. chance constraint satisfaction in SMPC
We compare the number of state constraint violations using nominal MPC (7a) against SMPC in Fig. 1. We fix $\epsilon_u = 0.5$, which is analogous to satisfying the nominal CoP constraints (zero control back-off magnitude). We simulate trajectories for nominal MPC and SMPC, and randomly apply the same sampled Gaussian disturbance realizations with bounded support along both sets of trajectories. In Fig. 1, we plot the number of constraint violations at the time step where the motion is close to the constraint. As expected, SMPC satisfies the designed $\epsilon_x$ (i.e., we can expect at most a fraction $\epsilon_x$ of the trajectories to violate the constraint at each point in time), while nominal MPC violated the constraint up to 73 times.
VI-B Hard constraint satisfaction in tube-based RMPC
First, we compute offline the state and control back-off magnitudes to tighten the constraint sets for tube-based RMPC.
The state constraint back-off magnitude is computed using an outer approximation of the mRPI set $\mathcal{Z}$. In Fig. 2, we test the positive invariance property (1) of $\mathcal{Z}$ by simulating initial conditions starting at the set vertices over multiple time steps, and applying randomly sampled disturbance realizations from the disturbance set $\mathbb{W}$. As shown, the evolution of each initial condition (red dots) is kept inside $\mathcal{Z}$ (the tube section) when subject to disturbance realizations $w_k \in \mathbb{W}$. Using the same choice of pre-stabilizing dead-beat gains as before, the robust control back-off magnitude $K\mathcal{Z}$ is computed exactly without resorting to numerical approximation. In Fig. 3, we plot the CoM position and CoP of the trajectories obtained using tube-based RMPC. In the first two steps, no disturbances were applied, showing that the CoM position trajectories back off conservatively from the CoM constraint with the magnitude of the mRPI set projected on the CoM position. In steps three and four, we randomly apply sampled Gaussian disturbance realizations with finite support, showing that both the state and control constraints are satisfied. At the final three walking steps, we apply the worst-case disturbance on the CoM position in the direction of the CoM constraint, showing that the state constraint is saturated as expected. This shows that tube-based RMPC anticipates the worst-case disturbance at all times to guarantee hard constraint satisfaction, which is quite conservative and sub-optimal when smaller disturbances occur, as shown in the first four walking steps.
VI-C Chance constraint satisfaction in SMPC vs. RMPC
This subsection presents the results of SMPC. Contrary to RMPC, the state and control back-off magnitudes vary along the horizon, and are computed based on the propagation of the predicted state covariance (35), the pre-stabilizing feedback gain $K$, the disturbance covariance $\Sigma_w$, and the desired probability levels of individual state and control constraint violation $\epsilon_x$ and $\epsilon_u$, respectively. We set $\epsilon_u = 0.5$, which corresponds to satisfying the nominal CoP constraints.
Using the same choice of stabilizing feedback gains as in RMPC, we simulate trajectories using SMPC in Fig. 4. In the first two steps, we apply no disturbances to the trajectories, showing no violations of the CoM position constraint, while the CoM trajectories back off with a smaller magnitude compared to RMPC. For the rest of the motion, we randomly apply sampled Gaussian disturbance realizations with finite support.
In Fig. 4, we zoom in around the CoM constraint to show the empirical number of CoM position constraint violations out of the simulated trajectories. As shown, constraint violations occurred in the fourth step and again in the sixth step, and their number is kept within the designed probability level of CoM constraint violations $\epsilon_x$. We note that for this walking motion, varying $\epsilon_u$ does not have an evident effect on the CoP motion. This is due to the cost on the distance between the CoP and the center of the foot.
In Fig. 5, we compare SMPC against tube-based RMPC by simulating trajectories and randomly applying disturbance realizations sampled from a Gaussian distribution with finite support along the trajectories. With a sufficiently low $\epsilon_x$, SMPC was able to obtain zero constraint violations, as RMPC does. In other words, SMPC can offer hard constraint satisfaction like RMPC, but using less conservative back-off magnitudes than RMPC, which sacrifices optimality for hard constraint guarantees. Note that, thanks to our simple choice of $\Sigma_w$ and $\mathbb{W}$, we can compute the maximum value of $\epsilon_x$ for which we get hard constraint satisfaction, where $\Phi$ is the CDF of the normal distribution. This value is coherent with our results.
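In the same scalar spirit as Section V (with hypothetical numbers, not the paper's parameters), one can invert the relation between the back-off and a bounded disturbance of size wbar to find the largest violation level eps that still yields hard constraint satisfaction over an N-step horizon:

```python
from math import sqrt
from statistics import NormalDist

def eps_max(wbar, sigma, N):
    """Scalar illustration: hard satisfaction over N steps requires
    wbar <= Phi^{-1}(1 - eps) * sigma / sqrt(N) (the binding case at the
    end of the horizon), i.e. eps <= 1 - Phi(wbar * sqrt(N) / sigma)."""
    return 1.0 - NormalDist().cdf(wbar * sqrt(N) / sigma)
```

Any eps below this threshold reproduces RMPC-like hard satisfaction for that bounded disturbance, while larger eps trades occasional violations for performance.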
To test the robustness of constraint satisfaction and the optimality of SMPC, we ran an empirical study of the same two-step walking motion, comparing SMPC with varying $\epsilon_x$ and fixed $\epsilon_u$ against tube-based RMPC and nominal MPC in Fig. 6. We plot the empirical number of CoM position constraint violations against the ratio between the average cost (over all trajectories) of each MPC scheme and that of nominal MPC, taken as the measure of optimal cost. As before, disturbance realizations are sampled from a Gaussian distribution with finite support. As expected, the higher the probability level of constraint satisfaction in SMPC, the lower the number of constraint violations (higher robustness). The highest number of constraint violations is obtained at $\epsilon_x = 0.5$, which is equivalent to nominal MPC. Zero constraint violations were obtained for sufficiently small $\epsilon_x$, as for RMPC. An advantage of SMPC over RMPC in this case is the lower average cost. This gives the user the flexibility to design the controller for different task constraints by tuning the probability level of constraint satisfaction, without sacrificing performance as in tube-based RMPC or robustness as in nominal MPC.
VII Discussion and Conclusions
This paper compared the use of SMPC with RMPC to account for uncertainties in bipedal locomotion. Many SMPC and RMPC algorithms exist. We decided to focus on two particular instances of tube-based approaches, which have the same online computational complexity as nominal MPC. Indeed, all the extra computation takes place offline, and consists of designing tightened constraints (back-offs) based on a fixed pre-stabilizing feedback gain $K$.
Our comparison focused on the trade-off between robustness and optimality. Our tests show that, if disturbances are bounded and we set a sufficiently small probability of constraint violation, SMPC can achieve 100% constraint satisfaction like RMPC, but with less conservative control, i.e., it results in better performance as measured by the cost function. This is reasonable because RMPC behaves conservatively, expecting a persistent worst-case disturbance, which in practice is extremely unlikely to happen. SMPC instead reasons about the probability of disturbances. In Section V we showed that we can compute the maximum disturbance sets for which SMPC ensures robustness. These sets shrink as time grows, highlighting the fact that persistently large disturbances become less likely with time. Loosely speaking, SMPC can be thought of as a special kind of RMPC that considers shrinking disturbance sets along the horizon.
Our empirical results are specific to the choice of dead-beat feedback gains used in both algorithms. These gains were computed in prior work by minimizing the back-off magnitude on the CoP constraints. This is sensible because the CoP is usually more constrained than the CoM in bipedal locomotion. Other feedback gains could be used, such as LQR gains, resulting in back-off magnitudes that are a trade-off between state and control constraints. While changing the gains would affect our quantitative results, it would not affect the qualitative differences between SMPC and RMPC that we highlighted in the paper.
In conclusion, SMPC offers an opportunity for the control of walking robots, affording a trade-off between robustness to uncertainty and performance. Future work will address the problem of recursive feasibility and closed-loop constraint satisfaction. Moreover, we intend to investigate nonlinear versions of RMPC and SMPC to enable the use of more complex models of locomotion.
-  (2008) Set-theoretic methods in control. Springer. Cited by: Property 1.
-  (2017) Adaptive step duration in biped walking: a robust approach to nonlinear constraints. In 2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids), pp. 724–729. Cited by: §I.
-  (2015) A robust linear MPC approach to online generation of 3D biped walking motion. In 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids), pp. 595–601. Cited by: §I.
-  (2016) Model predictive control. University of Oxford, Hilary Term. Cited by: §I.
-  (2001) Systems with persistent disturbances: predictive control with restricted constraints. Automatica 37 (7), pp. 1019 – 1028. External Links: Cited by: §I, §III.
-  (2019) Effect of planning period on MPC-based navigation for a biped robot in a crowd. Cited by: §I.
-  (2016) Planning robust walking motion on uneven terrain via convex optimization. In 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), pp. 579–586. Cited by: §I.
-  (2018) Stochastic model predictive control — how does it work?. Computers & Chemical Engineering 114, pp. 158 – 170. Note: FOCAPO/CPC 2017 External Links: Cited by: §I, §IV-A.
-  (2010-04) Online walking motion generation with automatic foot step placement. Advanced Robotics 24, pp. 719–737. External Links: Cited by: §I, §II-C.
-  (2018) Stochastic model predictive control for linear systems using probabilistic reachable sets. In 2018 IEEE Conference on Decision and Control (CDC), pp. 5182–5188. Cited by: §VII.
-  (2019) A nonlinear model predictive control framework using reference generic terminal ingredients. IEEE Transactions on Automatic Control (), pp. 1–1. Cited by: §VII.
-  (2012) Capturability-based analysis and control of legged locomotion, part 1: theory and application to three simple gait models. The international journal of robotics research 31 (9), pp. 1094–1113. Cited by: §II-C.
-  (2017-07) Constraint-tightening and stability in stochastic model predictive control. IEEE Transactions on Automatic Control 62 (7), pp. 3165–3177. External Links: Cited by: §I, §IV-A, Remark 3.
-  (2001) Robustifying model predictive control of constrained linear systems. Electronics Letters 37 (23), pp. 1422–1423. Cited by: §I, §III, Remark 2.
-  (2000) Constrained model predictive control: stability and optimality. Automatica 36 (6), pp. 789 – 814. External Links: Cited by: §I.
-  (2005) Robust model predictive control of constrained linear systems with bounded disturbances. Automatica 41 (2), pp. 219 – 224. External Links: Cited by: §I, Remark 2.
-  (2015) Robust and stochastic mpc: are we going in the right direction?. IFAC-PapersOnLine 48 (23), pp. 1 – 8. Note: 5th IFAC Conference on Nonlinear Model Predictive Control NMPC 2015 External Links: Cited by: Remark 3.
-  (2019) Mixed stochastic-deterministic tube mpc for offset-free tracking in the presence of plant-model mismatch. Journal of Process Control 83, pp. 102 – 120. External Links: Cited by: Remark 3.
-  (2005-03) Invariant approximations of the minimal robust positively invariant set. IEEE Transactions on Automatic Control 50 (3), pp. 406–410. External Links: Cited by: §III-A, §VI-B.
-  (2017-01) Model predictive control: theory, computation, and design. Cited by: §I.
-  (2019) A constraint-tightening approach to nonlinear model predictive control with chance constraints for stochastic systems. In 2019 American Control Conference (ACC), Vol. , pp. 1641–1647. Cited by: §VII.
-  (2014) Whole body motion controller with long-term balance constraints. In 2014 IEEE-RAS International Conference on Humanoid Robots, pp. 444–450. Cited by: §I.
-  (2016) Stochastic linear model predictive control with chance constraints – a review. Journal of Process Control 44, pp. 53 – 67. External Links: Cited by: §I.
-  (2017) Model predictive control of biped walking with bounded uncertainties. In 2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids), Vol. , pp. 836–841. Cited by: §I, §III, §VI-B, §VII, Remark 1, Remark 2.
-  (2006) Trajectory free linear model predictive control for stable walking in the presence of strong perturbations. (), pp. 137–142. Cited by: §I, §II-C.
-  (2016) Modeling and control of legged robots. In Springer Handbook of Robotics, B. Siciliano and O. Khatib (Eds.), pp. 1203–1234. Cited by: §II-B, §II-C.
-  (2016) Walking on partial footholds including line contacts with the humanoid robot atlas. In 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), pp. 1312–1319. Cited by: §I.