Optimal Motion Planning for Multi-Modal Hybrid Locomotion

Hybrid locomotion, which combines multiple modalities of locomotion within a single robot, can enable robots to carry out complex tasks in diverse environments. This paper presents a novel method of combining graph search and trajectory optimization for planning multi-modal locomotion trajectories. We also introduce methods that allow the method to work tractably in higher dimensional state spaces. Through the examples of a hybrid double-integrator, amphibious robot, and the flying-driving drone, we show that our planner tractably gives full-state trajectories that are probabilistically optimal and dynamically feasible.



There are no comments yet.


page 1

page 4

page 5


Anytime Hybrid Driving-Stepping Locomotion Planning

Hybrid driving-stepping locomotion is an effective approach for navigati...

Optimization-Based Quadrupedal Hybrid Wheeled-Legged Locomotion

Hybrid wheeled-legged locomotion is a navigation paradigm only recently ...

A Single-Planner Approach to Multi-Modal Humanoid Mobility

In this work, we present an approach to planning for humanoid mobility. ...

Reactive Task and Motion Planning for Robust Whole-Body Dynamic Locomotion in Constrained Environments

Contact-based decision and planning methods are becoming increasingly im...

Autonomous Locomotion Mode Transition Simulation of a Track-legged Quadruped Robot Step Negotiation

Multi-modal locomotion (e.g. terrestrial, aerial, and aquatic) is gainin...

Finding Locomanipulation Plans Quickly in the Locomotion Constrained Manifold

We present a method that finds locomanipulation plans that perform simul...

Analytic Model for Quadruped Locomotion Task-Space Planning

Despite the extensive presence of the legged locomotion in animals, it i...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

A hybrid locomotor combines multiple movement modalities into a single platform. Examples of hybrid locomotion (HL) include amphibious vehicles with the ability to swim and drive, or flying cars with the ability to drive and fly. Hybrid locomotion can allow robots to tackle more complex tasks in complicated environments, while achieving greater performance, such as improved energy efficiency. For instance, a flying-car can readily fly over obstacles or uneven terrain via aerial mobility, while driving when possible to improve energy-efficiency. Prior works on hybrid locomotion [Ambot, aquabot, picobug, picobug2, kevin, robotbee, flyingstar] have often investigated the design and feasibility of hybrid locomotion strategies, and examples of such robots are given in Fig. 1.

Fig. 1: Top Row: A ”Drivocopter” Drone (developed by the authors) which can fly and drive. Video of operation can be accessed at https://www.youtube.com/watch?v=QZyuvXfifvQ. Bottom Row: ”Ambot” Amphibious Robot [Ambot] capable of ground and marine locomotion.
Fig. 2: Different paths that a flying-driving robot could take between building rooftops. The vehicle flies along the purple trajectory, drives as much as possible along the red trajectory, and alternates between flying and driving on the green trajectory.

However, realizing the full potential of these robots not only depends on clever design, but also on autonomous planning of their complex motion strategies. For instance, Fig. 2 illustrates different paths that a flying-driving robot (such as the Drivocopter of Fig. 1) could use to traverse between two building rooftops. The different paths have entirely different energy-costs, travel times, and robustness, which ultimately dictate robot performance in this task.

Unlike conventional motion planning problems, the path cost depends not only on the optimality of the trajectory segments in each modality, but also on the switching sequence, time, and state coordinates. The problem of achieving combinatorial optimization of the switching sequences, as well as optimization of trajectories within each modality, makes this problem particularly challenging, as illustrated by the different paths in Fig.


Many existing motion planning strategies cannot directly address the difficulty of multi-modal planning. Graph-based motion planning approaches (PRM [prm], RRT [rrt], RRT* [rrtstar]

) excel at discrete optimization in sampled coordinates, but suffer from the curse of dimensionality. They are unable to produce full-state trajectories for high-dimensional systems and do not readily incorporate costs that are not functions of sampled coordinates.

Motion planning algorithms based on trajectory optimization (e.g., via direct collocation [collocation2, gpops]) either assume smooth dynamics, or use multi-phase optimization with pre-specified domain sequences [hybridcontrolsystem]. This is due to the fact that standard nonlinear programming frameworks cannot handle the combinatorial optimization of discrete locomotion mode switches. Mixed-Integer Programming methods [mixedinteger, tobia] transcribe these problems well, but do not scale well enough to handle switching sequences and coordinates of realistic high-dimensional problems.

Existing works in multi-modal planning often bypass this problem by only considering discrete graph-based planning ([hybridswarm],[amphibiousplanning],[HyFDR]). By ignoring the continuous dynamics of the robot, these planners often ignore dynamic feasibility in a single modality, and how the dynamic constraints affect the cost. In addition, although energy expenditure is often the most important cost in hybrid locomotion, direct calculations of this cost from change in position is not possible without making vast simplifications, such as using Cost of Transport (COT) to linearly map distance to energy ([hybridswarm],[HyFDR]).

To address this problem, we present a novel motion-planning method for hybrid locomotion that considers the problem as a graph-search with local trajectory optimization on the continuous dynamics between connected nodes. In low-dimensions, we propose a graph construction strategy on the modality-partitioned state-space such that no edge crosses the guard between partitions. The cost of traveling between sample points is found using optimal trajectory generation for the corresponding modality, and a graph-search determines the nearly optimal state-space path. To make the method tractable in higher dimensional state spaces, we also propose constraining some subspace of the state-space to be a function of sampled states via virtual constraints, and learning the cost function from offline trajectory optimization batches.

To the best of our knowledge, we present the first motion planner for multi-modal locomotion that considers direct representation of the energy cost as well as dynamic constraints, while retaining probabilistic optimality. The proposed method is primarily implemented in simulation: the hybrid double-integrator with viscous friction is shown as a low-dimensional case (Sec.IV). Then, example trajectories for more-realistic systems are given by considering amphibious (Sec.V) and flying-driving locomotion (Sec.VI).

Ii Problem Formulation

Ii-a The Hybrid Locomotion System

We define a hybrid locomotion system as a type of hybrid control system [hybridcontrolsystem] with additional constraints. We define the hybrid locomotion system as a tuple

In the below description of each element, denotes the index of the locomotion mode (i.e. flying or driving):

  • describes the dynamics associated with each locomotion mode. The dynamics are assumed to take a control-affine form: .

  • is the set of domains, or state-spaces, associated with the continuous dynamics of each mode.

  • is the set of admissible control inputs associated with each mode.

  • is the set of guard surfaces that describes the boundaries between domains of mode and .

  • is the set of reset maps that describe discrete transformations on the guard surface

We additionally assume that each state belongs to a single mode . I.e., the domains disjointly partition the reachable state-space.

Ii-B Optimal Trajectories in the Hybrid Locomotion System

To define an optimal hybrid trajectory, we formulate a cost for each mode’s control-affine system in Bolza form:

There also exists a constant switching cost to transition from one modality to . We formulate the problem of finding the optimal trajectory for a hybrid locomotion system as the following two-point boundary value problem. —s— u∑Ji+J(Δi→j) ˙x=fi(x)+gi(x)u  ∀x∈Di  u∈Ui  ∀i x(t0)=x0,  x(tf)=xf In words, we want to find a trajectory that is dynamically feasible within each modality, while optimizing the cost functional throughout the entire trajectory, which would also require optimizing the order of discrete modes to visit.

Iii Planning Methodology

Section III-A reviews our basic planning concept that combines sampling-based planning with local trajectory optimization. Because this approach may not scale well to practical situations, Section III-B introduces virtual constraints and off-line learning to improve real-time performance.

Iii-a Dynamic Programming with Continuous Optimization

Iii-A1 Graph Structure

First, we discretize the problem by sampling coordinates in each domain . The vertices, , of a digraph, , are constructed from these samples. The edges represent locally optimal paths between the vertices. Each edge is weighted with the optimal transport cost. To avoid the situation where the two vertices of an edge lie in different locomotion modalities, we additionally impose the following constraints on the graph:

  1. , for some mode . I.e., edges only connect states in the same mode.

  2. We explicitly sample the guard surface and only allow paths to cross a guard through a guard sample point.

Fig.3. A and B illustrate these conditions. The shortest-path search is tackled by Djikstra’s algorithm [shortestpath] once the locally optimal trajectory costs are known.

Fig. 3: A) red edges cross a guard surface between domains and , violating the constraint on edges. B) By sampling on the guard surface and allowing no edges between to , all graph edges are constrained to a single mode. C) Node augmentation to handle modality switching costs.

If there exists a switching cost to go from one modality to another, we augment the sample on the guard surface with two connected nodes and that shares the same state-space coordinates, and assign switching cost to the edge cost between the two samples, as illustrated in Fig. 3.C.

Iii-A2 Continuous Optimization of Trajectory Segments

As each edge connects states in a single mode, we estimate the edge weight by solving the optimization problem: —s— uJiw(x1→x2)= ˙x=fi(x)+gi(x)u,  x∈

Di,  u∈Ui x(t0)=x1,  x(tf)=x2 where . This standard trajectory optimization problem can be tackled using existing methods, such as direct collocation [gpops].

Iii-A3 Final Path Smoothing

The path(s) returned from graph-search are smoothed via trajectory optimization, knowing the switching sequence and the guard surface points. Given a path of samples resulting from graph search, we partition the state space samples using their modalities, such that

where denotes the mode of each partition, and denotes the sample on the guard surface where the trajectory crosses during a mode switch. The optimal trajectories between boundary points is then found by solving the following optimization problem for each partition: —s— uJji ˙x=fi(x)+gi(x)u,   x∈Dji,   u∈Uji x(tki)=xki,  x(tki+1)=xki+1 The total trajectory is reconstructed by concatenating the trajectories induced by the optimal control inputs for each partition.

Iii-B Extension to High Dimensions

Although dynamic programming with segment-wise trajectory optimization shows good promise for hybrid locomotion, it is computationally expensive, requiring instances of trajectory optimization. In high dimensions, the number of samples increases exponentially if the resolution is maintained, and trajectory optimization methods scale poorly. The two methods introduced in this section aim to make this method tractable for high-dimensional systems.

Iii-B1 Reduced-Order Coordinates via Virtual Constraints

We can reduce the dimensionality of the sample space by introducing virtual constraints that fix some coordinates as a function of the sampled-coordinates. The state-space is divided into sampled coordinates () and auxiliary coordinates (). The full state is recovered from samples in the reduced space, , by appending auxiliary coordinates:


The state partioning into and are problem-dependent, but can be understood in the context of model-order reduction and bisimilarity: if the original system and the virtually-constrained system show bounded difference in their evolution, it indicates a good choice of coordinates and constraints. Rigid body coordinates of position and velocity, or differentially flat coordinates [minimumsnap] can be good choices. Eliminating the sampling of the subspace can significantly reduce computation, making the method tractable.

Iii-B2 Learning the cost function from offline optimization

To find the weight between two sampled coordinates and in the graph, let us first define a function , which is described by the following optimization problem: —s— uJiJ(x1^s,x2^s)= ˙x=fi(x)+gi(x)u,   x∈Di,   u∈Ui x0=(x1sv(x1s)), x(tf)=(x2sv(x2s)) where . Since this optimization problem has to be solved times, we choose to learn this function off-line for faster evaluation online.


as feature vectors, and

as label, we first produce a batch

from multiple trajectory optimization runs. Then, function approximators from supervised learning algorithms such as Support Vector Regression (SVR)

[svr] or Neural Nets are used to approximate . Denoting the approximated function as , the weights on the graph are assigned by .

Since is learned from data, its evaluation does not require a full instance of nonlinear programming, greatly reducing on-line computation. Yet, as is learned from trajectory optimization, all costs in Bolza form can be utilized, and dynamic constraints can be incorporated.

Iv Case Study: Hybrid Double Integrator

This section first verifies our low-dimensional method for 1D problem of a thrust-vectored mass on a linear rail. While the rail is lubricated and frictionless at , viscous drag appears . This can be formulated as a simple hybrid locomotion system with the following dynamics:

In addition, consider that we have the input constraint for both domains. Converting this to a first-order system , the system can be described as:

where the dynamics are described by

Then, let us find a trajectory from to while minimizing the input

Using our framework, we first place a graph structure on the state-space using knowledge of the domains , then optimize each continuous trajectory using GPOPS-II [gpops] with IPOPT [ipopt] solver. The trajectory obtained using graph-search, and the final smoothened trajectory using the knowledge of the switching sequence and the boundary points on the guard surface is displayed in Fig.4.

Finally, since the switching sequence is trivial to guess for this example, we utilize multi-phase optimization in GPOPS-II with IPOPT, which puts an equality constraint from the end of the first phase in and the beginning of the second phase in and compare the results. The trajectory using multi-phase optimization is displayed in Fig.4.

In addition, to show probabilistic convergence, we run the algorithm times with different inter-sample distances (controlled by Poisson sampling [poissonsampling] on state-space), and show convergence in Fig. 5. Fig. 5 shows that our method probabilistically converges, with a lower cost compared to multi-phase optimization using GPOPS-II with IPOPT. IPOPT is only local optimal, while a PRM framework searches more globally over the domain. Our final cost shows that we can produce probabilistically optimal and dynamically feasible trajectories with inter-sample distance as large as .

Fig. 4: Optimal Trajectories from to (left), and from to (right). Red trajectories are obtained using graph search, green trajectories are results of final smoothened path, and the pink trajectory is result of Multi-phase optimization in GPOPS-II. The edges represent optimal trajectories between each sample.
Fig. 5: Probabilistic convergence with decreasing intersample distance.

V 2D Case Study: Amphibious Tank (AmBot)

This section describes optimal trajectories for the amphibious vehicle introduced in [Ambot], which uses tank treads for ground locomotion (skid-steer), and marine locomotion (paddles). After describing the vehicle dynamics in both modes, we obtain optimal trajectories for a model environment.

V-a Dynamics

V-A1 Ground and Marine Dynamics

We derive the Newtonian mechanics for planar operation, and incorporate first-order armature motor dynamics. The ground states, , and marine states, , are defined as

where is the body position with respect to (wrt) a world frame, is the velocity in the body frame, and denote the orientation and angular velocity wrt a world frame. Finally, denote the left and right motor speeds. In both locomotion modes, the control action correspond to commanded motor speeds via fraction of applied motor voltage.

We model a no slip constraint for ground operation. A -order motor model relates motor torque (which generates tractive forces on the vehicle) to command inputs. A drag force proportional to the square of vehicle speed and a similar -order motor model are used in the aquatic domain.

V-A2 Hybrid Dynamics

The governing dynamical systems for each mode are represented by the hybrid dynamics

where denotes ground and marine modes, the domains and guard surfaces are obtained from terrain, and for both inputs. We apply the identity map to and .

V-A3 Cost Function

We minimize the robot’s total energy expenditure, modeled as:


where is the battery voltage, is the motor torque-constant, the internal resistance, and the constant power drain. The first term models motor power dissipation, and the latter term models constant power drainage. We assume no switching costs associated with the discrete reset map,

V-B Cost Learning

For ground operation, we divide the state-space into sampled and auxiliary coordinates


This division of coordinates recognizes that side-slip is constrained for skid-steer vehicles, and angular velocity is small.

Fig. 6: Left: 11520 ground trajectories colored by their cost. Right: -energy contour for , . The heatmap corresponds to the energy cost to go from to

For marine operation, is divided into

where track forces equal water drag at equilibrium speed .

The cost function in Eq. (III-B2) is learned from multiple offline optimizations. Using samples, the function is evaluated using GPOPS-II [gpops], and SVR with Gaussian kernel trains the function with Sequential Minimal Optimization [SVM]. The process is repeated for both ground and marine locomotion. Fig.6 shows the generated trajectories and the contour of the learned function.

V-C Results

We sample position using the method of Sec.III.A, and grid the states to create . The edge weights are estimated from the learned function . Finally, the shortest path is found by Djikstra’s algorithm [shortestpath]. Fig. 7 illustrates this process. The final smoothed trajectory is shown in Fig.8.

The final trajectories differ noticeably from those produced by a shortest-path planner due to the differences in Costs of Transport. Since the robot expends more energy in water, it drives further on the ground until it switches to swimming. This example shows that our method can autonomously decide switching sequences and switching points.

Fig. 7: A: Model Environment. B: Graph Generation. C. Result of shortest path search.
Fig. 8: Final trajectories for example of river crossing (left), and island crossing (right). The Robot outline is displayed at equal time differences.

Cost from Fixed Sequence Heuristic Trajectories (Joules)

Our Method
Dist. (m) F DF DFD FDF DFDF DFDFD Sequence Cost (Joules)
110 12129.72 11991.18 11780.26 6612.62 6620.16 6642.41 DFDFD 6655.44
90 10167.32 9902.18 9867.10 6345.64 6558.94 6578.74 DFDFD 6600.45
70 8208.72 7941.78 7726.73 6470.98 6492.85 6511.88 DFD 8156.85
50 6248.72 6112.78 5894.58 6477.60 6449.91 6452.32 DF 6456.37
30 4244.2 4148.18 3815.68 6343.64 6365.65 6387.47 DFD 4033.92
TABLE I: Comparison of our costs from hueristic trajectories with fixed sequences. D: Driving, F: Flying
Fig. 9: Depiction of trajectory differences as the platforms are further separated. At less than separation, the robot always flies. After meters separation, driving in the chasm saves energy.

Vi 3D Case Study: Drivocopter

This section models the Drivocopter flying-driving drone of Fig. 1. It uses skid-steer driving and quadrotor flight.

Vi-a Dynamics

We use the ground model of Sec.V with different parameters, while the flight dynamics are based on [omnicopter] and [minimumenergy].

Vi-A1 Flight Dynamics

Standard rigid-body dynamics [quadrotordynamics] describe flight motions driven by four rotor forces, which use a speed-squared-dependent lift term and -order armature motor dynamics. The state vector is

where is the vehicle position wrt a world frame, is the 3D velocity in the body frame, denotes vehicle orientation wrt world frame parametrized by ZYX Euler angles, is the body angular velocity, and are the motor rotational speeds.

Vi-A2 Hybrid Dynamics

Again, the two modalities of ground and flight are represented by a hybrid dynamical system

where denotes flight and ground modes, the domains and guard surfaces are obtained from knowledge of the ground surface. The motor inputs are with . Finally, (landing) and (takeoff) are discrete transitions:

where is the motor speed needed to provide hovering lift. During takeoff, we set to be a meter higher than the ground surface of the ground sample.

Vi-A3 Cost Function

We use the same ground energy cost as Eq.(2), and formulate the same energy for flight with different motor parameters. The costs for reset maps and are constant takeoff and landing energy costs obtained via trajectory optimization.

Vi-B Cost Learning

Fig. 10: Left: trajectories produced to learn the flight energy function. Right: xz-projection of the learned function .

The ground states are divided into sampled / auxiliary coordinates via Eq. (3). Flight states are divided by:

where the is the rotor rate at which the lift provided by the propellers allows the drone to hover in stable equilibrium. The cost is learned as in Sec.V.B from paths. Fig. 10 shows the trajectories and energy map. The ground energy cost is found with Drivocopter parameters.

Vi-C Results

Fig. 11:

A. Model Terrain classified into drivable and undrivable terrains. B. Poisson sampling on ground mesh. C. Poisson sampling on air and shortest path search. D. Smoothened final path. E. Heuristic trajectories for comparison in Tab.

I. From back to front: F (red), DF (green), DFD (blue), FDF (pink), DFDF (black), DFDFD (cyan)

A CAD environment model, consisting of two raised platforms separated by a flat-bottom chasm, is meshed into drivable and undrivable regions (Fig.11.A), and the ground and free-space meshes are Poisson sampled (Fig.11.B). The result of a shortest-path (Fig.11.C) is smoothened (Fig.11.D). This process is depicted in Fig. 11. We hypothesized that when the platforms are nearby, the drone should not drive in the chasm, since gravitational losses exceed energy gains from by driving. As the platforms separate further, the drone saves energy by driving in the chasm. We tested this idea on 5 different terrains parametrized by the distance between platforms (see Fig.9). Our planning results show correct qualitative behavior.

To show quantitative competence, we also generate few heuristic trajectories per given fixed sequence (illustrated in Fig.11.E) and tabulate the final costs in Table.I. Our method produces a switching sequence that mostly agrees with lowest-cost producing sequences among heuristic trajectories, and costs are quantitatively comparable to the heuristically optimal trajectories.

Vii Conclusion

We presented a novel scheme to plan near-optimal hybrid locomotion trajectories. A double-integrator example showed that our method can generate probabilistically optimal and dynamically feasible trajectories in low-dimensional state-spaces. The Ambot and Drivocopter examples showed that virtual constraints and cost function learning renders our method practical in high-dimensional problems.

Improvements are possible by upgrading components of our framework. Better computational speed could be realized by adaptive sampling [hsu] and the use of RRT [rrt] search to achieve faster single-query tractability. An (A*) [astar] graph search would be enabled by transport energy heuristics, while other function approximations, such as Neural Nets, might improve the cost function learning module. Differential Dynamic Programming [ddp] is another promising scheme for planning trajectory segments.

While we have demonstrated probabilistic optimality in a low-dimensional example, we acknowledge the lack of provable optimality in high dimensions. A better understanding and justification of virtual constraints should result from studying the bisimilarity between the full-state and reduced-order systems. Finally, efforts are underway to demonstrate our results on the Drivocopter of Fig. 1.