We propose a new framework for fast kinodynamic locomotion planning for bipedal robots. Our planning algorithm is constructed based on a kinodynamic Rapidly-exploring Randomized Tree (RRT)  with a newly proposed method for approximating kinematic and dynamic constraints, which results in efficient computation, a complete solution and robust feed forward tasks for a Whole Body Controller (WBC)  to control a full body bipedal robot in a cluttered environment.
The foundation of our algorithm is an analytical solution to the Linear Inverted Pendulum Model (LIPM), which is a simplified dynamic model for bipedal robots. The LIPM not only provides a significantly reduced-order dynamically consistent manifold for planning but also generalizes bipedal locomotion so that it is agnostic to the specific robot configuration. In order to plan with LIPM dynamics, we utilize a Phase Space Planner (PSP) . Different from previous walking pattern generators, PSPs uniquely require an inverted pendulum’s desired forward velocity and the sagittal-plane foot placement of the next step. The PSP then generates the dynamically consistent locomotion parameters such as lateral-plane footstep location, the timing to switch to the following step and so on (see Appendix A) to ensure that the model proceeds along the desired heading.
exploited Reinforcement Learning to generate the required sequence. We adapt path planning strategies often used for non-holonomic wheeled robots to collect the required inputs for the PSP. At the same time, we use the wheeled robot’s kinematic limitations as a rough approximation of the dynamic and kinematic constraints on bipedal motion. Additionally, this analogy transforms a high-dimensional discontinuous footstep selection problem into a low-dimensional continuous domain, which allows for the efficient use of RRT or RRT*. We compose the configuration space with the position and heading of the robot rather than using the full joint space like,  or optimizing with the robot’s full configuration like ,  which is typically highly computationally expensive. We introduce the Dubin’s path  as the steering method in our kinodynamic RRT which exactly connects any two states in an obstacle-free configuration space and contributes to the efficient performance of the RRT , .
Lastly, we propose using a new metric for the kinodynamic RRT defined as the temporal duration of the movement, which can be calculated from the LIPM dynamics. Our formulation allows us to calculate the time that the robot will take to traverse any segment of the planned locomotion path. Generally, a metric in an RRT algorithm is used to determine the nearest neighbor on the tree and in the case of kinodynamic RRT, euclidean distance is inadequate as a metric because it does not account for the robot’s dynamics. We believe that using the movement duration as the metric for our locomotion planner is effective because of the following capabilities. Choosing the nearest neighbor now corresponds to finding the neighbor from which you could start a movement and reach the sampled point the quickest. Also, this planner has capability to handle moving obstacles as long as the future obstacle locations are known (or can be well estimated) since collision checking will be evaluated for the future time calculated from the LIPM dynamics.
This capability is rare because the majority of prior work on bipedal locomotion planning divides the planning problem into two sequential phases(1) plan a finite sequence of footstep locations considering kinematic feasibility and obstacle avoidance, and (2) fit a dynamically consistent, continuous motion plan onto that footstep sequence. Because these two phases occur sequentially, there is no information about the dynamics of the robot in the first phase, and thus, no time-to-step information. By time-to-step, we mean the time it will take for the robot to reach a given location in the footstep sequence. For example,  generates an obstacle-free sequence of footstep locations using semi-definite programming while  uses an RRT formulation. But only after the footstep sequence is obtained can they generate the robot’s trajectory considering dynamics, whether that be for each joint individually or for the whole robot’s Center of Mass (CoM). Thus, there is no way to know if a future collision with a moving object will occur because footstep locations are chosen (and collision checked) prior to knowing how long the movement will take.
In our study, we combine LIPM-PSP-kinodynamic RRT with a newly proposed process borrowed from wheeled robots, a steering method and a metric for locomotion planning that yield efficient and fast convergence in computation and can be applied to any kind of bipedal robot. Also, the planning algorithm provides completeness in the solution space and near optimality in terms of time through a rewiring process performed after the primary planning process. The results of this planning algorithm can be sent to a WBC  as tasks to generate whole body locomotion and are tested in a 3D physics-based simulation of the humanoid robot Valkyrie (Fig. 1).
This paper is organized as follows. In Section II, we formulate kinodynamic RRT problem and propose new concepts. In Section III, we describe how the PSP recursively propagates the LIPM dynamic integrated with kinodynamic planner. The results are presented in Section IV. Appendix A includes concept of the PSP and derivation of analytic solution of LIPM dynamics.
Ii Kinodynamic RRT Formulation
To formulate our planning algorithm, we draw on the similarities between bipedal and wheeled robot navigation on a 2D plane because this provides several useful benefits. First, it converts a high-dimensional, discrete footstep planning problem into a low-dimensional, continuous domain with a well defined steering function that can be used for the RRT. Second, we can use the fairly simple limitation of a wheeled robot to conservatively approximate some of the complex kinematic and dynamic limits of the bipedal robot at the planning phase, especially the friction cone and balance limitations which prevent bipedal robots from instantaneously changing their direction of motion while walking.
We construct the kinodynamic RRT with the configuration space composed of the position and heading of the robot and the non-holonomic constraint, adapting the wheeled robot description.
Eq. (1) shows the definition of the configuration space represented in global coordinates () and Eq. (2)–(4) are the kinematic model of the wheeled robot. The solution trajectory should abide by the bounded input Eq. (5) and collision-free constraints Eq. (6). Note that a limitation of this formulation is that the robot is not allowed to turn in place, since we use a constant forward velocity, . If desired, such movements must be planned separately. Two of these wheeled robot parameters are used to approximate the bipedal robot’s reachability and dynamics. Specifically, the constrained input (Eq. (5)) enforces a minimum turning radius for the biped, , and sets a lower bound for the forward CoM velocity.
The classic RRT formulation plans a collision free path through the configuration space by building a tree of possible movements starting from the initial configuration until a branch of the tree can be connected to the goal configuration. A tree, , is comprised of a set of nodes with connections that link the nodes into branches, or sequences. The structure of a tree is such that a node, , can have many children, but only one parent, . Sequences of nodes, or branches, represent potential navigational paths through the configuration space.
We adapted a variant of RRT called RRT-Connect  for use with the wheeled robot described above. We summarize the procedure from a high level as follows: (0) initialize the tree with a single node at the starting configuration, (1) sample a random point in the configuration space, (2) connect each existing node on the tree to the newly sampled point using a constrained steering function, but ignoring obstacles, (3) choose the closest node to the sampled point (commonly called the ”nearest neighbor”) by measuring and comparing each of these path lengths according to some metric, (4) extend the tree by placing new nodes () along the path connecting the ”nearest neighbor” on the tree to the sampled point until a collision occurs or the sampled point is reached, (5) repeat steps (1)–(4), until the goal configuration has been connected to the starting configuration with a valid path. With some frequency, one should substitute the goal configuration for the randomly sampled point in step (1) so that occasional attempts are made to complete the tree. Once the goal configuration has been successfully connected to the tree, the solution is the sequence of nodes which connect the starting configuration to the goal configuration.
In our planner, the concept of a Dubin’s path  is introduced as the steering function to efficiently connect two points in the configuration space (e.g. in Fig. 2(a)) with the shortest navigational path while following the kinematic constraint of the wheeled robot (Eq. (5)) and ignoring obstacles. Among the six families of Dubin’s car solution paths comprised of straight lines and circular arcs with radius , we compare the total length,
in Fig. 2(a), of each of the six candidates and choose the shortest path that connects the two points. Once the minimum is computed for every node in , we define as the set of the closest nodes. For the path connecting each to , we locate the minimum number of kinematically reachable intermediate nodes spaced evenly along the path (Fig. 2(b)) so that the path length between any two sequential intermediate nodes (e.g. and ) is less than or equal to where is a conservative upper bound for the biped’s step length. Thus, for each , we have identified a potential new branch that could be added to , and we will develop a metric which we will use to select the nearest neighbor node and its branch among the 20 branches to append to the tree.
However, a sequence of nodes that trace a path from the initial configuration to the goal configuration does not contain enough information for a bipedal robot to actually walk through these waypoints. Thus, we introduce the PSP (Algorithm 3) which operates on the LIPM to solve for locomotion parameters like the dynamically consistent step location, stance-foot switching time, and the LIPM state after taking the specified step that connects the two nodes. With some information about the initial LIPM state and a sequence of nodes, the PSP can recursively operate on each pair of nodes in the sequence to propagate the LIPM along the path, and generate the information needed to produce dynamically consistent walking motion with a WBC. Additionally, since the PSP calculates the elapsed time between nodes based on the LIPM’s dynamics, we can use this information both (1) to judge alternate routes according to which path is the fastest rather than simply using the the shortest path, and (2) to perform collision checking through time. The first of these capabilities will be used as our metric to select one of the potential new branches, , to be appended to . One assumption here is that the fastest path will always appear as one of the 20 shortest length paths, but we have found that this is basically always the case. The second of these capabilities means that if there are moving objects in the environment whose motion is defined or can be estimated, then this planner can accurately detect collisions in the future. Thus, unlike other planners, which may have to avoid the entire path of moving obstacles, since they select footstep locations prior to solving the biped’s dynamics, this planner plans paths that cross that of moving obstacles, as long as they are both not in the same place at the same time.
Based on all of notation and concepts introduced so far, we can summarize the locomotion planning problem in a cluttered environment as follows: the objective is to find a solution sequence connecting and and the corresponding locomotion parameters which follow the waypoints defined by that sequence, given the parameters . We limit the exploration of the space to be within a given configuration boundary and we require the solution to avoid any collisions with obstacles in the space (defined by ).
Iii Propagating LIPM Dynamics
In this section, we describe the method for recursively propagating the LIPM dynamics using the PSP along a sequence of nodes . This will allow us to algorithmically evaluate the dynamical consequences of each of the 20 potential new branches which were isolated in the previous section. From each of these new branches, we would like to determine the total duration of a biped’s movement along that branch, or . The branch with the smallest total duration will be selected for addition to .
First, we describe how we will track the dynamics of the biped throughout the tree. The locomotion parameters including the LIPM state for a step that connects adjacent nodes and are collectively stored in a variable . Since each node has a unique parent, and we calculate an for every , we can think of the set of all ’s, or , as forming a ”mirror” tree which captures the bipedal walking dynamics along the navigational waypoints in . The vector can be split into the PSP input and output, and is defined as follows,
where is the dynamically consistent and reachable step location, defined in the local frame located and oriented with , shown as in Fig. 3(b); and are the CoM position and velocity, again in the local frame, at the apex () of the step; is the time elapsed from the previous apex () to the stance-foot switching (), and is the time elapsed between the stance-foot switching () and the next apex (). Note that the apex of a step occurs when the CoM is positioned directly above the stance foot in the sagittal plane so that . The two durations and sum to yield the total duration of the step, and are illustrated in Fig. 6. Thus, once all ’s are known for any consecutive sequence of steps, , the total duration of the sequence can be obtained merely by summing and from each of the ’s in the sequence:
With this notation introduced, we can move on to describe the procedure for propagating the locomotion parameters and LIPM dynamics through a ”mirror” branch of the configuration space branch . In this situation, has been populated with a sequence of configurations, and we will start at the beginning of the branch, where the parent node () of the new node () is an existing node on the tree. As a result, (or ) is known. However, to persist generality, we will denote the new node whose locomotion parameters are to be computed as and its parent node whose locomotion parameters are already known as . We will continue from this point by focusing on a pair of nodes and with the objective being to populate , noting that the same procedure for the first pair is to be repeated for each pair in the sequence until one reaches the last pair in the branch, ending with . We calculate as follows
where __ indicates that the value is not used, is an SE(2) transformation matrix from the global coordinate frame to the coordinate frame. Once is calculated, the PSP computes with the pair of the locomotion parameters and . The PSP algorithm is described in more detail in Appendix A. Notice that the locomotion parameters calculated used in PSP are represented with respect to the parent node’s local frame . One should transform including foot placement and LIPM state into its local frame and compute in order to compute locomotion parameters for the next node. Since is augmented position and velocity vector in Cartesian space, it can be transformed by the augmented transformation matrix.
At this point, one can use the above recursive procedure along with Algorithm 1 to determine and for each of the potential new branches, . As previously mentioned, the fastest route is selected and can now be identified as the sequence connecting the ”nearest neighbor” on the existing tree to the sampled node, or . Since the time of arrival for each node in this sequence is known, this branch is ready to be collision checked through time to determine how much of the branch will be appended to . The collision checking process will be explained in further detail in the illustrated example that follows.
The overall planning algorithm is described in Algorithm 2, Fig. 4 and we demonstrate one cycle of while loop for clarification. In Fig. 4(a), and are illustrated. The green trajectory passing through and corresponding illustrates the existing node sequences which have been added to the two trees ( and ) in previous iterations. is a randomly sampled node within the boundaries of the configuration space.
In Fig. 4(b), the steering function chooses the shortest Dubin’s path (the dotted lines) from every to , generates with intermediate nodes for 20 closest nodes among Dubin’s path, and iteratively computes the sequence of locomotion parameters and . This can be done by executing Algorithm 1 repeatedly. Then we choose the path connecting the nearest neighbor to the sampled node by comparing each of the ’s and proceed to collision checking. We stipulate that a particular node and is collision-free if there is no intersection between any obstacles’ bounding boxes at the time of arrival and a circle centered at a foot placement with a radius defining a circular safety margin around the robot. can be computed by adding up time information in locomotion parameters from to . If a node is determined to collide with an obstacle or the configuration space boundary, then that node and all future nodes in that sequence are pruned. Finally, the pruned sequences and are appended to and as shown in Fig. 4(c). Fig. 4(d) shows the actual robot executing locomotion based on a sequence of locomotion parameters. The black solid line represents CoM trajectory and shows the foot placements.
Once the planner successfully connects to and compute mirroring , the unused branches are thrown away, leaving only a single sequence of steps, defined by and . Although this motion could be used, it tends to wander slightly through the space, as is typical of RRT solutions. Thus, we use a rewiring process to smooth and optimize the planned motion. To do this, we randomly select two different nodes and apply Algorithm 1 which attempts to connect the two nodes directly with a new Dubins path solution and the corresponding sequence of new nodes, and . If the new, alternate nodes are collision free and then the original nodes between and are replaced with the new, alternate nodes and . Since the locomotion parameters and LIPM state at has changed as a result of this replacement, all future nodes downstream of must be recalculated via Algorithm 3. So, are also replaced based on the update to the locomotion parameters at node . Repeating this rewiring process successively improves the solution until it closely hugs corners and dodges obstacles.
Iv Simulation result
To validate the proposed algorithm, we test it with a full human-sized bipedal robot, Valkyrie, in a dynamic simulator, srLib111Seoul National University Robotics Library. Open-source http://robotics.snu.ac.kr/srlib/. In the simulation, Valkyrie’s starting location is in the right upper corner of a maze in an room. The maze (shown in Fig. 5(a)) is formed with red walls and has three mobile robots (shown as small gray boxes) that move through known trajectories (shown as red arrows). The mission is to walk to the door at the left bottom corner of the room while avoiding static and moving obstacles. We show the problem specification in Table I.
|( has no parent, so expressed in the global frame)|
|Obstacles||Static obs. (red boxes)|
|Movement||Dynamic obs. (red arrows)|
The proposed planner successfully finds the solution route to the goal location in 70 at most including rewiring process for 108 steps. In Fig. 5(b)), the blue trajectory illustrates the original and the red trajectory shows the rewired solution. The multicolored dots in the figure are the nodes on found during the exploration process which were not part of the solution sequence. Since the algorithm is based on random sampling, the computation time and final solution from each trial is not identical. As shown in Fig. 5(c), the solution can be visualized as navigating through the time domain as well as Cartesian space. Note that static obstacles such as the walls remain stationary as time increases, while the moving obstacles do not. When the biped passes the moving obstacle revolving around the center of the maze (see the blue box magnification), some nodes are included in which cross paths with the mobile robot while it is on the other side of the circle. To control the full body motion of Valkyrie, we generate a CoM task and foot task for the WBC proposed in  from the solution sequence. Note this planning situation is significantly more complicated than most practical planning problems, which usually plan footstep sequences over much smaller distances.
Despite the fact that the above situation is somewhat unrealistically complex for practical bipedal planning problems, we chose to test the algorithm with this situation because it demonstrates some of the strengths and capabilities of our algorithm. In the process of solving that problem, the planner typically had on the order of nodes in when it found a solution. However, this algorithm works as well for simpler planning problems as it does for the highly complex one presented above. When there are fewer tight clearance passages between the starting location and the goal location as is the case in most practical biped navigation planning problems, the planner often can find and rewire a solution with ’s of steps in fractions of a second.
V Conclusion and Discussion
In this paper, we present a novel locomotion planning framework for biped robots with the combination of LIPM-PSP-kinodynamic RRT. We adapt a non-holonomic wheeled robot description to approximate some of the kinodynamic limits of bipeds and exploit a sampling based approach to exploring the configuration space. Within our kinodynamic RRT, we exploit the Dubin’s path as a steering method and propose a new elapsed time metric to select between alternate routes. The proposed planning algorithm accounts for kinematic reachability, moving obstacle avoidance, and dynamic consistency. The output of our planner can be directly used as WBC tasks that generate robust navigation and locomotion behavior.
The planning algorithm in this paper only addresses forward walking behaviors. However, one possible avenue of extending this work could be to include other bipedal capabilities to generate various combinations of walking behaviors including side steps, turning in place, or walking backward.
As another possible extension, we may modify this framework so that it can be used for real-time replanning in a complex environment where the movement of obstacles is not known a priori. Based on current observations of mobile obstacles, the robot could generate a probabilistic map of the future location of obstacles. Then, by choosing a probabilistic intolerance for collisions, one could replan routes in real-time to avoid collisions with moving objects or humans.
Appendix A Phase Space Planner
PSP generates effective step switching information using simplified models such as the LIPM. In Fig. 6, we show phase plots across multiple walking steps of the CoM sagittal and lateral phase portraits based on LIPM dynamics. In the sagittal plane, the path consists of connected parabolas, while in the lateral plane, the walking path follows semi-periodic parabolas in a closed cycle. For convenience, we will use for the sagittal plane and for the lateral plane.
Compared to other locomotion algorithms such as , , PSP computes information regarding to footstep changing (e.g. and ) with a given forward step location and an apex velocity and . In the phase plot, we can see the given current CoM state and the apex state uniquely define switching state , and . These two timing values are used to find the next lateral step location and lateral CoM position at apex. In summary, considering a one step ahead plan with a current CoM state and foot stance ( and ) and desired future states , PSP finds locomotion parameter vector to generate walking pattern for the following step.
When we constrain LIPM dynamics to a piecewise linear height surface, , we can find and without numerical integration and bisection search because the system of equations becomes linear, resulting in the following CoM behavior:
Note that this equation is the same for the direction. Based on Eq. (11), we can find an analytical solution for PSP, summarized in Algorithm 3. , , , and are vector quantities corresponding to the variables , , , and .
Let us focus on obtaining the step switching time. We can easily manipulate Eq. (11) to analytical solve for the time variable,
By performing some algebra we get,
Given two phase trajectories associated with consecutive walking steps (e.g. and ), initial states for each (e.g. and ) and assuming the robot walks forward (i.e. ) is positive, we calculate the phase space intersection point of each step’s CoM trajectory via continuity of velocities from Eq. (17):
We can now find the step switching time by plugging the computed switching position into Eqs (17) and (13). In addition, we can obtain the timing at the apex velocity from Eq. (13). The final step is to find the directional foot placement. We first calculate by plugging into the directional state equation, which has identical form to Eq. (11). Then, by using the equality that , we can find ,
After calculating , we can easily get and by using Eq. (11).
-  D. Kim, J. Lee, and L. Sentis, “Robust Dynamic Locomotion via Reinforcement Learning and Novel Whole Body Controller,” arXiv.org, 2017.
-  Y. Zhao and L. Sentis, “A three dimensional foot placement planner for locomotion in very rough terrains,” 12th IEEE-RAS International Conference on Humanoid Robots, 2012.
-  L. E. Dubins, “On Curves of Minimal Length with a Constraint on Average Curvature, and with Prescribed Initial and Terminal Positions and Tangents,” American Journal of mathematics, vol. 79, no. 3, p. 497, Jul. 1957.
-  J. J. Kuffner and S. M. LaValle, “RRT-connect: An efficient approach to single-query path planning,” in 2000 ICRA. IEEE International Conference on Robotics and Automation. IEEE, 2000, pp. 995–1001.
-  K. Harada, M. Morisawa, S.-I. Nakaoka, K. Kaneko, and S. Kajita, “Kinodynamic planning for humanoid robots walking on uneven terrain,” Journal of Robotics and Mechatronics, vol. 21, no. 3, p. 311, 2009.
K. Hauser, “Fast interpolation and time-optimization with contact,”The International Journal of Robotics Research, vol. 33, no. 9, pp. 1231–1250, Aug. 2014.
-  M. Posa, C. Cantu, and R. Tedrake, “A direct method for trajectory optimization of rigid bodies through contact,” The International Journal of Robotics Research, vol. 33, no. 1, pp. 69–81, Jan. 2014.
-  I. Mordatch, E. Todorov, and Z. Popović, “Discovery of complex behaviors through contact-invariant optimization,” ACM Transactions on Graphics (TOG), 2012.
-  T. Kunz and M. Stilman, “Probabilistically complete kinodynamic planning for robot manipulators with acceleration limits,” in 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2014). IEEE, 2014, pp. 3713–3719.
-  S. Caron, Q.-C. Pham, and Y. Nakamura, “Completeness of randomized kinodynamic planners with state-based steering,” Robotics and Autonomous Systems, vol. 89, pp. 85–94, Mar. 2017.
-  R. Deits and R. Tedrake, “Computing Large Convex Regions of Obstacle-Free Space Through Semidefinite Programming,” in Algorithmic Foundations of Robotics XI. Springer International Publishing, 2015, pp. 109–124.
-  N. Perrin, C. Ott, J. Englsberger, O. Stasse, F. Lamiraux, and D. G. Caldwell, “Continuous Legged Locomotion Planning,” IEEE Transactions on Robotics, pp. 1–6, 2016.
-  J. Englsberger, C. Ott, and A. Albu-Schaffer, “Three-Dimensional Bipedal Walking Control Based on Divergent Component of Motion,” IEEE Transactions on Robotics, vol. 31, no. 2, pp. 355–368, Mar. 2015.
S. Kajita, F. Kanehiro, K. Kaneko, K. Fujiwara, K. Harada, K. Yokoi, and H. Hirukawa, “Biped walking pattern generation by using preview control of zero-moment point,” inICRA, vol. 3, 2003, pp. 1620–1626.