I Introduction
A hybrid locomotor combines multiple movement modalities in a single platform. Examples of hybrid locomotion (HL) include amphibious vehicles that can swim and drive, or flying cars that can drive and fly. Hybrid locomotion can allow robots to tackle more complex tasks in complicated environments while achieving greater performance, such as improved energy efficiency. For instance, a flying car can readily fly over obstacles or uneven terrain via aerial mobility, while driving when possible to improve energy efficiency. Prior works on hybrid locomotion [Ambot, aquabot, picobug, picobug2, kevin, robotbee, flyingstar] have often investigated the design and feasibility of hybrid locomotion strategies, and examples of such robots are given in Fig. 1.
However, realizing the full potential of these robots depends not only on clever design, but also on autonomous planning of their complex motion strategies. For instance, Fig. 2 illustrates different paths that a flying-driving robot (such as the Drivocopter of Fig. 1) could use to traverse between two building rooftops. The different paths have entirely different energy costs, travel times, and robustness, which ultimately dictate robot performance in this task.
Unlike conventional motion planning problems, the path cost depends not only on the optimality of the trajectory segments in each modality, but also on the switching sequence, times, and state coordinates. The need to combinatorially optimize the switching sequence while also optimizing the trajectories within each modality makes this problem particularly challenging, as illustrated by the different paths in Fig. 2.
Many existing motion planning strategies cannot directly address the difficulty of multi-modal planning. Graph-based motion planning approaches (PRM [prm], RRT [rrt], RRT* [rrtstar]) excel at discrete optimization in sampled coordinates, but suffer from the curse of dimensionality. They are unable to produce full-state trajectories for high-dimensional systems and do not readily incorporate costs that are not functions of the sampled coordinates.
Motion planning algorithms based on trajectory optimization (e.g., via direct collocation [collocation2, gpops]) either assume smooth dynamics or use multiphase optimization with pre-specified domain sequences [hybridcontrolsystem]. This is because standard nonlinear programming frameworks cannot handle the combinatorial optimization of discrete locomotion mode switches. Mixed-Integer Programming methods [mixedinteger, tobia] transcribe these problems well, but do not scale well enough to handle the switching sequences and coordinates of realistic high-dimensional problems.
Existing works in multi-modal planning often bypass this problem by only considering discrete graph-based planning ([hybridswarm], [amphibiousplanning], [HyFDR]). By ignoring the continuous dynamics of the robot, these planners often disregard dynamic feasibility within a single modality and how the dynamic constraints affect the cost. In addition, although energy expenditure is often the most important cost in hybrid locomotion, direct calculation of this cost from change in position is not possible without drastic simplifications, such as using the Cost of Transport (COT) to linearly map distance to energy ([hybridswarm], [HyFDR]).
To address this problem, we present a novel motion-planning method for hybrid locomotion that treats the problem as a graph search with local trajectory optimization on the continuous dynamics between connected nodes. In low dimensions, we propose a graph construction strategy on the modality-partitioned state space such that no edge crosses the guard between partitions. The cost of traveling between sample points is found using optimal trajectory generation for the corresponding modality, and a graph search determines the nearly optimal state-space path. To make the method tractable in higher-dimensional state spaces, we also propose constraining a subspace of the state space to be a function of the sampled states via virtual constraints, and learning the cost function from offline trajectory optimization batches.
To the best of our knowledge, we present the first motion planner for multi-modal locomotion that considers a direct representation of the energy cost as well as dynamic constraints, while retaining probabilistic optimality. The proposed method is primarily implemented in simulation: the hybrid double integrator with viscous friction is shown as a low-dimensional case (Sec. IV). Then, example trajectories for more realistic systems are given by considering amphibious (Sec. V) and flying-driving locomotion (Sec. VI).
II Problem Formulation
II-A The Hybrid Locomotion System
We define a hybrid locomotion system as a type of hybrid control system [hybridcontrolsystem] with additional constraints. We define the hybrid locomotion system as a tuple

$\mathcal{HL} = (\mathcal{F}, \mathcal{D}, \mathcal{U}, \mathcal{G}, \Delta).$

In the description of each element below, $i$ denotes the index of the locomotion mode (i.e., flying or driving):

- $\mathcal{F} = \{(f_i, g_i)\}$ describes the dynamics associated with each locomotion mode. The dynamics are assumed to take a control-affine form: $\dot{x} = f_i(x) + g_i(x)u$.
- $\mathcal{D} = \{D_i\}$ is the set of domains, or state spaces, associated with the continuous dynamics of each mode.
- $\mathcal{U} = \{U_i\}$ is the set of admissible control inputs associated with each mode.
- $\mathcal{G} = \{G_{i \to j}\}$ is the set of guard surfaces that describe the boundaries between the domains of modes $i$ and $j$.
- $\Delta = \{\Delta_{i \to j}\}$ is the set of reset maps that describe discrete transformations on the guard surfaces.

We additionally assume that each state $x$ belongs to a single mode $i$; i.e., the domains disjointly partition the reachable state space.
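Concretely, such a tuple can be sketched as a small data structure. Everything below (the names `Mode` and `HybridLocomotionSystem`, the toy 1-D rail with its drag coefficient, and the domain split at the origin) is our own illustration, not the paper's implementation:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

State = List[float]
Input = List[float]

@dataclass
class Mode:
    name: str
    dynamics: Callable[[State, Input], State]   # (x, u) -> xdot = f_i(x) + g_i(x) u
    in_domain: Callable[[State], bool]          # x -> (x in D_i)?

@dataclass
class HybridLocomotionSystem:
    modes: Dict[int, Mode]
    switch_cost: Dict[Tuple[int, int], float] = field(default_factory=dict)  # J(Delta_{i->j})

    def mode_of(self, x: State) -> int:
        """Return the unique mode whose domain contains x; the paper assumes
        the domains disjointly partition the reachable state space."""
        hits = [i for i, m in self.modes.items() if m.in_domain(x)]
        assert len(hits) == 1, "domains must disjointly partition the state space"
        return hits[0]

# Toy 1-D rail foreshadowing Sec. IV: frictionless on one side, viscous drag
# (coefficient 0.5, our choice) on the other; state x = [position, velocity].
rail = HybridLocomotionSystem(
    modes={
        0: Mode("free", lambda x, u: [x[1], u[0]], lambda x: x[0] < 0.0),
        1: Mode("drag", lambda x, u: [x[1], u[0] - 0.5 * x[1]], lambda x: x[0] >= 0.0),
    },
    switch_cost={(0, 1): 0.0, (1, 0): 0.0},
)
```

The planner in Sec. III only ever needs these three queries: evaluate a mode's dynamics, test domain membership, and look up a switching cost.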
II-B Optimal Trajectories in the Hybrid Locomotion System
To define an optimal hybrid trajectory, we formulate a cost for each mode's control-affine system in Bolza form:

$J_i = \Phi_i\big(x(t_0), t_0, x(t_f), t_f\big) + \int_{t_0}^{t_f} L_i\big(x(t), u(t)\big)\, dt.$

There also exists a constant switching cost $J(\Delta_{i \to j})$ to transition from modality $i$ to modality $j$. We formulate the problem of finding the optimal trajectory for a hybrid locomotion system as the following two-point boundary value problem:

$\begin{aligned} \min_{u} \quad & \textstyle\sum_i J_i + \sum J(\Delta_{i \to j}) \\ \text{s.t.} \quad & \dot{x} = f_i(x) + g_i(x)u \quad \forall x \in D_i, \\ & u \in U_i \quad \forall i, \\ & x(t_0) = x_0, \quad x(t_f) = x_f. \end{aligned}$

In words, we want to find a trajectory that is dynamically feasible within each modality while optimizing the cost functional over the entire trajectory, which also requires optimizing the sequence of discrete modes to visit.
III Planning Methodology
Section III-A reviews our basic planning concept, which combines sampling-based planning with local trajectory optimization. Because this approach may not scale well to practical problems, Section III-B introduces virtual constraints and offline learning to improve real-time performance.
III-A Dynamic Programming with Continuous Optimization
III-A1 Graph Structure
First, we discretize the problem by sampling coordinates in each domain $D_i$. The vertices $V$ of a digraph $G = (V, E)$ are constructed from these samples. The edges represent locally optimal paths between the vertices, and each edge is weighted with the optimal transport cost. To avoid the situation where the two vertices of an edge lie in different locomotion modalities, we additionally impose the following constraints on the graph:

- Both endpoints of an edge lie in the same domain $D_i$ for some mode $i$; i.e., edges only connect states in the same mode.
- We explicitly sample the guard surfaces and only allow paths to cross a guard through a guard sample point.
Fig. 3A and 3B illustrate these conditions. The shortest-path search is tackled by Dijkstra's algorithm [shortestpath] once the locally optimal trajectory costs are known.
If there exists a switching cost to go from one modality to another, we augment each sample on the guard surface with two connected nodes that share the same state-space coordinates, and assign the switching cost $J(\Delta_{i \to j})$ to the edge between the two nodes, as illustrated in Fig. 3C.
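The guard-node splitting and the shortest-path search can be sketched as follows. The node names, edge weights, and switching cost here are invented for illustration; in the actual method, in-mode edge weights come from the local trajectory optimization of Sec. III-A2:

```python
import heapq
from collections import defaultdict

def dijkstra(adj, src, dst):
    """Shortest path on a weighted digraph, adj[u] = [(v, w), ...]."""
    dist, prev = {src: 0.0}, {}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], dist[dst]

adj = defaultdict(list)
# In-mode edges; weights stand in for locally optimal trajectory costs.
adj["a_drive"].append(("guard_drive", 2.0))
adj["guard_fly"].append(("b_fly", 3.0))
# One guard sample, split into a node per mode and joined by the switching cost.
adj["guard_drive"].append(("guard_fly", 1.5))

path, cost = dijkstra(adj, "a_drive", "b_fly")
```

Because the switch appears as an ordinary weighted edge between the two co-located guard nodes, the graph search trades off switching costs against in-mode transport costs with no special casing.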
III-A2 Continuous Optimization of Trajectory Segments
As each edge connects states in a single mode $i$, we estimate the edge weight by solving the optimization problem:

$\begin{aligned} w(x_1 \to x_2) = \min_{u} \quad & J_i \\ \text{s.t.} \quad & \dot{x} = f_i(x) + g_i(x)u, \quad x \in D_i, \quad u \in U_i, \\ & x(t_0) = x_1, \quad x(t_f) = x_2, \end{aligned}$

where $x_1, x_2 \in D_i$. This standard trajectory optimization problem can be tackled using existing methods, such as direct collocation [gpops].

III-A3 Final Path Smoothing
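For general modes this edge-weight problem is handed to a collocation solver such as GPOPS-II, but for the frictionless double-integrator mode used later in Sec. IV, the minimum-effort weight has a well-known closed form via the controllability Gramian. The sketch below is our own illustration of that special case, not the paper's code:

```python
def edge_weight_min_effort(p0, v0, pf, vf, T):
    """min of the integral of u(t)^2 over [0, T] for the frictionless double
    integrator p' = v, v' = u, with both endpoints and the horizon T fixed
    (closed form via the controllability Gramian of the LTI pair (A, B))."""
    dp = pf - p0 - v0 * T   # position residual after drifting at v0
    dv = vf - v0            # velocity residual
    return 12.0 * dp**2 / T**3 - 12.0 * dp * dv / T**2 + 4.0 * dv**2 / T

def simulate_optimal(p0, v0, pf, vf, T, n=20000):
    """Euler-integrate the associated optimal control u(t) = c0*(T - t) + c1
    as a sanity check that it actually steers (p0, v0) to (pf, vf)."""
    dp, dv = pf - p0 - v0 * T, vf - v0
    c0 = 12.0 * dp / T**3 - 6.0 * dv / T**2
    c1 = -6.0 * dp / T**2 + 4.0 * dv / T
    p, v, dt = p0, v0, T / n
    for k in range(n):
        u = c0 * (T - k * dt) + c1
        p, v = p + v * dt, v + u * dt
    return p, v

w = edge_weight_min_effort(0.0, 0.0, 1.0, 0.0, 1.0)  # rest-to-rest over unit distance
```

In the full method each graph edge triggers one such solve, which is exactly the cost that the learned surrogate of Sec. III-B2 amortizes offline.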
The path(s) returned from the graph search are smoothed via trajectory optimization, now that the switching sequence and the crossing points on the guard surfaces are known. Given a path of samples resulting from the graph search, we partition the state-space samples by modality into segments bounded by the guard-surface samples,
where $i_j$ denotes the mode of partition $j$, and $x_{k_j}$ denotes the sample on the guard surface that the trajectory crosses during a mode switch. The optimal trajectories between boundary points are then found by solving the following optimization problem for each partition:

$\begin{aligned} \min_{u} \quad & J_{i_j} \\ \text{s.t.} \quad & \dot{x} = f_{i_j}(x) + g_{i_j}(x)u, \quad x \in D_{i_j}, \quad u \in U_{i_j}, \\ & x(t_{k_j}) = x_{k_j}, \quad x(t_{k_{j+1}}) = x_{k_{j+1}}. \end{aligned}$

The total trajectory is reconstructed by concatenating the trajectories induced by the optimal control inputs for each partition.
III-B Extension to High Dimensions
Although dynamic programming with segment-wise trajectory optimization shows good promise for hybrid locomotion, it is computationally expensive, requiring one trajectory optimization instance per edge. In high dimensions, the number of samples increases exponentially if the resolution is maintained, and trajectory optimization methods scale poorly. The two methods introduced in this section aim to make the approach tractable for high-dimensional systems.
III-B1 Reduced-Order Coordinates via Virtual Constraints
We can reduce the dimensionality of the sample space by introducing virtual constraints that fix some coordinates as a function of the sampled coordinates. The state space is divided into sampled coordinates ($x^s$) and auxiliary coordinates ($x^a$). The full state is recovered from samples in the reduced space by appending auxiliary coordinates through a virtual-constraint map $v$:

$x = \begin{pmatrix} x^s \\ v(x^s) \end{pmatrix}$  (1)
The state partitioning into $x^s$ and $x^a$ is problem-dependent, but can be understood in the context of model-order reduction and bisimilarity: if the original system and the virtually-constrained system show bounded difference in their evolution, the choice of coordinates and constraints is a good one. Rigid-body coordinates of position and velocity, or differentially flat coordinates [minimumsnap], can be good choices. Eliminating the sampling of the auxiliary subspace can significantly reduce computation, making the method tractable.
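As an illustration of Eq. (1), a hypothetical skid-steer partition might sample planar pose and forward speed while pinning lateral velocity and yaw rate to zero. The coordinate choice, the constraint, and the wheel radius below are our own assumptions, not the paper's:

```python
def lift(x_s, v):
    """Recover the full state from sampled coordinates via Eq. (1): x = (x_s, v(x_s))."""
    return list(x_s) + list(v(x_s))

WHEEL_RADIUS = 0.05  # meters; an assumed value, purely illustrative

def v_ground(x_s):
    """Hypothetical virtual constraint for a skid-steer ground mode: given sampled
    coordinates (px, py, yaw, forward_speed), pin lateral velocity and yaw rate
    to zero and set both wheel speeds consistent with the forward speed."""
    px, py, yaw, speed = x_s
    wheel = speed / WHEEL_RADIUS
    return [0.0, 0.0, wheel, wheel]  # [lateral_vel, yaw_rate, omega_L, omega_R]

full_state = lift([1.0, 2.0, 0.0, 0.5], v_ground)  # 4 sampled + 4 auxiliary coords
```

The planner then samples only the 4-dimensional reduced space, and every trajectory optimization receives full 8-dimensional boundary states through the lift.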
III-B2 Learning the Cost Function from Offline Optimization
To find the weight between two sampled coordinates $x_1^s$ and $x_2^s$ in the graph, let us first define a function $J(x_1^s, x_2^s)$ described by the following optimization problem:

$\begin{aligned} J(x_1^s, x_2^s) = \min_{u} \quad & J_i \\ \text{s.t.} \quad & \dot{x} = f_i(x) + g_i(x)u, \quad x \in D_i, \quad u \in U_i, \\ & x(t_0) = \begin{pmatrix} x_1^s \\ v(x_1^s) \end{pmatrix}, \quad x(t_f) = \begin{pmatrix} x_2^s \\ v(x_2^s) \end{pmatrix}, \end{aligned}$

where $x_1^s, x_2^s$ are sampled coordinates in mode $i$. Since this optimization problem has to be solved once per edge, we choose to learn this function offline for faster evaluation online.
Using $(x_1^s, x_2^s)$ as feature vectors and $J(x_1^s, x_2^s)$ as labels, we first produce a batch from multiple trajectory optimization runs. Then, function approximators from supervised learning algorithms such as Support Vector Regression (SVR) [svr] or neural networks are used to approximate $J$. Denoting the approximated function as $\hat{J}$, the weights on the graph are assigned by $w = \hat{J}(x_1^s, x_2^s)$. Since $\hat{J}$ is learned from data, its evaluation does not require a full instance of nonlinear programming, greatly reducing online computation. Yet, as $\hat{J}$ is learned from trajectory optimization, all costs in Bolza form can be utilized, and dynamic constraints can be incorporated.
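The offline learning loop can be sketched as follows. For a dependency-free example we substitute kernel ridge regression with a Gaussian kernel for the SVR named in the text (both yield a fast surrogate $\hat{J}$), and an analytic toy label stands in for the cost an NLP solver such as GPOPS-II would return per sample pair:

```python
import math

def gauss_solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small, dense systems)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def rbf(a, b, gamma):
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def fit_cost_model(X, y, gamma=4.0, lam=1e-8):
    """Fit J_hat from an offline batch: features are sampled-coordinate pairs,
    labels are the optimal transport costs from trajectory optimization."""
    K = [[rbf(xi, xj, gamma) + (lam if i == j else 0.0)
          for j, xj in enumerate(X)] for i, xi in enumerate(X)]
    alpha = gauss_solve(K, [float(v) for v in y])
    return lambda x: sum(a * rbf(x, xi, gamma) for a, xi in zip(alpha, X))

# Offline batch on a coarse grid of (x1_s, x2_s) pairs with a toy analytic cost.
X = [[a / 4.0, b / 4.0] for a in range(5) for b in range(5)]
y = [(x2 - x1) ** 2 for x1, x2 in X]
J_hat = fit_cost_model(X, y)
```

Online, edge weights are then assigned as $w = \hat{J}(x_1^s, x_2^s)$ with no nonlinear program in the loop.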
IV Case Study: Hybrid Double Integrator
This section first verifies our low-dimensional method on a 1-D problem of a thrust-vectored mass on a linear rail. The rail is lubricated and frictionless in one region, while viscous drag appears in the other. This can be formulated as a simple hybrid locomotion system with the following dynamics:
In addition, we impose the same input constraint on both domains. Converting this to a first-order system in position and velocity, the system can be described as:
where the dynamics are described by
Then, let us find a trajectory from the initial state to the final state while minimizing the input effort:
Using our framework, we first place a graph structure on the state space using knowledge of the domains, then optimize each continuous trajectory using GPOPS-II [gpops] with the IPOPT [ipopt] solver. The trajectory obtained using graph search, and the final smoothed trajectory obtained using knowledge of the switching sequence and the boundary points on the guard surface, are displayed in Fig. 4.
Finally, since the switching sequence is trivial to guess for this example, we also use multiphase optimization in GPOPS-II with IPOPT, which enforces an equality constraint between the end of the first phase and the beginning of the second phase, and compare the results. The trajectory obtained using multiphase optimization is also displayed in Fig. 4.
In addition, to show probabilistic convergence, we run the algorithm repeatedly with different inter-sample distances (controlled by Poisson sampling [poissonsampling] on the state space). Fig. 5 shows that our method probabilistically converges, with a lower cost than multiphase optimization using GPOPS-II with IPOPT: IPOPT is only locally optimal, while a PRM-style framework searches more globally over the domain. The final cost shows that we can produce probabilistically optimal and dynamically feasible trajectories even at large inter-sample distances.
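Poisson sampling enforces a minimum inter-sample distance $r$, which is precisely the quantity swept in this convergence study. A naive dart-throwing sketch on a 2-D unit box (our own illustration; the parameters are arbitrary):

```python
import random

def poisson_disk(n_target, r, lo=0.0, hi=1.0, max_tries=20000, seed=0):
    """Naive dart-throwing Poisson-disk sampler on a 2-D box: a candidate is
    accepted only if it is at least r away from every accepted sample, so r
    directly controls the inter-sample distance of the resulting graph nodes."""
    rng = random.Random(seed)
    pts, tries = [], 0
    while len(pts) < n_target and tries < max_tries:
        tries += 1
        c = (rng.uniform(lo, hi), rng.uniform(lo, hi))
        if all((c[0] - q[0]) ** 2 + (c[1] - q[1]) ** 2 >= r * r for q in pts):
            pts.append(c)
    return pts

samples = poisson_disk(50, 0.1)
```

Shrinking `r` densifies the graph and tightens the gap to the optimum at the price of more edges, i.e., more local trajectory optimizations.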
V 2D Case Study: Amphibious Tank (AmBot)
This section describes optimal trajectories for the amphibious vehicle introduced in [Ambot], which uses tank treads for ground locomotion (skid-steer) and paddles for marine locomotion. After describing the vehicle dynamics in both modes, we obtain optimal trajectories for a model environment.
V-A Dynamics
V-A1 Ground and Marine Dynamics
We derive the Newtonian mechanics for planar operation, and incorporate first-order armature motor dynamics. The ground states, $x_g$, and marine states, $x_m$, are defined as
where the first coordinates give the body position with respect to (w.r.t.) a world frame, followed by the velocity in the body frame, the orientation and angular velocity w.r.t. the world frame, and finally the left and right motor speeds. In both locomotion modes, the control actions correspond to commanded motor speeds via the fraction of applied motor voltage.
We model a no-slip constraint for ground operation. A first-order motor model relates motor torque (which generates tractive forces on the vehicle) to the command inputs. A drag force proportional to the square of the vehicle speed, together with a similar first-order motor model, is used in the aquatic domain.
V-A2 Hybrid Dynamics
The governing dynamical systems for each mode are represented by the hybrid dynamics
where $i \in \{g, m\}$ denotes the ground and marine modes, the domains and guard surfaces are obtained from the terrain, and the same bounds apply to both inputs. We apply the identity map as the reset maps $\Delta_{g \to m}$ and $\Delta_{m \to g}$.
V-A3 Cost Function
We minimize the robot's total energy expenditure, modeled as:

$E = \int_{t_0}^{t_f} \left( \sum_{k \in \{L, R\}} \frac{\big(V_b u_k - K_t \omega_k\big)^2}{R} + P_0 \right) dt$  (2)

where $V_b$ is the battery voltage, $K_t$ is the motor torque constant, $R$ the internal resistance, and $P_0$ the constant power drain. The first term models motor power dissipation, and the latter term models constant power drainage. We assume no switching costs associated with the discrete reset maps.
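The per-motor dissipation plus constant drain of Eq. (2) can be sketched numerically as follows. The parameter values are purely illustrative assumptions, not the AmBot's:

```python
def motor_power_loss(u, omega, Vb=12.0, Kt=0.02, R=0.5):
    """Resistive dissipation in one armature: the applied voltage Vb*u minus the
    back-EMF Kt*omega drives a current I = (Vb*u - Kt*omega)/R, dissipating I^2 * R.
    Vb, Kt, R are assumed example values, not identified AmBot parameters."""
    I = (Vb * u - Kt * omega) / R
    return I * I * R

def power_integrand(u_lr, omega_lr, P0=1.5, **motor_params):
    """Instantaneous integrand of the energy cost: per-motor (left/right)
    dissipation plus a constant electronics drain P0."""
    return sum(motor_power_loss(u, w, **motor_params)
               for u, w in zip(u_lr, omega_lr)) + P0

stall = power_integrand([1.0, 1.0], [0.0, 0.0])  # full command, motors stalled
free = motor_power_loss(1.0, 12.0 / 0.02)        # back-EMF cancels applied voltage
```

Integrating this integrand along a candidate trajectory gives the label used for cost learning, which is why the planner needs full state trajectories rather than a distance-based Cost of Transport.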
V-B Cost Learning
For ground operation, we divide the state space into sampled and auxiliary coordinates
(3) 
This division of coordinates recognizes that sideslip is constrained for skid-steer vehicles, and that the angular velocity remains small.
For marine operation, the state is divided into sampled and auxiliary coordinates
where the track forces equal the water drag at the equilibrium swimming speed.
The cost function of Sec. III-B2 is learned from multiple offline optimizations. For a batch of samples, the function is evaluated using GPOPS-II [gpops], and an SVR with a Gaussian kernel is trained with Sequential Minimal Optimization [SVM]. The process is repeated for both ground and marine locomotion. Fig. 6 shows the generated trajectories and the contours of the learned function.
V-C Results
We sample positions using the method of Sec. III-A, and grid the states to create the graph. The edge weights are estimated from the learned function $\hat{J}$. Finally, the shortest path is found by Dijkstra's algorithm [shortestpath]. Fig. 7 illustrates this process. The final smoothed trajectory is shown in Fig. 8.
The final trajectories differ noticeably from those produced by a shortest-path planner due to the difference in Costs of Transport. Since the robot expends more energy in water, it drives farther on the ground before switching to swimming. This example shows that our method can autonomously decide switching sequences and switching points.
TABLE I: Cost of fixed-sequence heuristic trajectories (Joules) versus our method (D = drive, F = fly).

Dist. (m) | F        | DF       | DFD      | FDF     | DFDF    | DFDFD   | Sequence | Cost (Joules)
110       | 12129.72 | 11991.18 | 11780.26 | 6612.62 | 6620.16 | 6642.41 | DFDFD    | 6655.44
90        | 10167.32 | 9902.18  | 9867.10  | 6345.64 | 6558.94 | 6578.74 | DFDFD    | 6600.45
70        | 8208.72  | 7941.78  | 7726.73  | 6470.98 | 6492.85 | 6511.88 | DFD      | 8156.85
50        | 6248.72  | 6112.78  | 5894.58  | 6477.60 | 6449.91 | 6452.32 | DF       | 6456.37
30        | 4244.2   | 4148.18  | 3815.68  | 6343.64 | 6365.65 | 6387.47 | DFD      | 4033.92
VI 3D Case Study: Drivocopter
This section models the Drivocopter flying-driving drone of Fig. 1, which uses skid-steer driving and quadrotor flight.
VI-A Dynamics
We reuse the ground model of Sec. V with different parameters, while the flight dynamics are based on [omnicopter] and [minimumenergy].
VI-A1 Flight Dynamics
Standard rigid-body dynamics [quadrotordynamics] describe the flight motions driven by four rotor forces, using a speed-squared-dependent lift term and first-order armature motor dynamics. The state vector is
where the state comprises the vehicle position w.r.t. a world frame, the 3D velocity in the body frame, the vehicle orientation w.r.t. the world frame parametrized by ZYX Euler angles, the body angular velocity, and the motor rotational speeds.
VI-A2 Hybrid Dynamics
Again, the two modalities of ground and flight are represented by a hybrid dynamical system
where $i \in \{f, g\}$ denotes the flight and ground modes, and the domains and guard surfaces are obtained from knowledge of the ground surface. The motor inputs are again commanded via bounded fractions of the applied motor voltage. Finally, $\Delta_{f \to g}$ (landing) and $\Delta_{g \to f}$ (takeoff) are discrete transitions:
where $\omega_h$ is the motor speed needed to provide hovering lift. During takeoff, we set the target position to be a meter above the ground surface at the ground sample.
VI-A3 Cost Function
We use the same ground energy cost as Eq. (2), and formulate an analogous energy cost for flight with different motor parameters. The costs for the reset maps $\Delta_{g \to f}$ and $\Delta_{f \to g}$ are constant takeoff and landing energy costs obtained via trajectory optimization.
VI-B Cost Learning
The ground states are divided into sampled/auxiliary coordinates via Eq. (3). The flight states are divided by:
where $\omega_h$ is the rotor rate at which the lift provided by the propellers keeps the drone in a stable hover. The cost is learned as in Sec. V-B from a batch of optimized paths. Fig. 10 shows the trajectories and the energy map. The ground energy cost is learned in the same way with Drivocopter parameters.
VI-C Results
A CAD environment model, consisting of two raised platforms separated by a flat-bottomed chasm, is meshed into drivable and undrivable regions (Fig. 11A), and the ground and free-space meshes are Poisson sampled (Fig. 11B). The resulting shortest path (Fig. 11C) is then smoothed (Fig. 11D). We hypothesized that when the platforms are close together, the drone should not drive in the chasm, since the gravitational losses exceed the energy gained by driving; as the platforms separate further, the drone saves energy by driving in the chasm. We tested this hypothesis on five different terrains parametrized by the distance between the platforms (see Fig. 9). Our planning results show the correct qualitative behavior.
To assess quantitative performance, we also generate a few heuristic trajectories for each fixed switching sequence (illustrated in Fig. 11E) and tabulate the final costs in Table I. Our method produces switching sequences that mostly agree with the lowest-cost sequences among the heuristic trajectories, and its costs are quantitatively comparable to those of the heuristically optimal trajectories.
VII Conclusion
We presented a novel scheme to plan near-optimal hybrid locomotion trajectories. A double-integrator example showed that our method can generate probabilistically optimal and dynamically feasible trajectories in low-dimensional state spaces. The AmBot and Drivocopter examples showed that virtual constraints and cost-function learning render our method practical for high-dimensional problems.
Improvements are possible by upgrading components of our framework. Better computational speed could be realized by adaptive sampling [hsu] and the use of RRT [rrt] search for faster single-query planning. An A* [astar] graph search would be enabled by transport-energy heuristics, while other function approximators, such as neural networks, might improve the cost-function learning module. Differential Dynamic Programming [ddp] is another promising scheme for planning trajectory segments.
While we have demonstrated probabilistic optimality in a low-dimensional example, we acknowledge the lack of provable optimality in high dimensions. A better understanding and justification of virtual constraints should result from studying the bisimilarity between the full-state and reduced-order systems. Finally, efforts are underway to demonstrate our results on the Drivocopter of Fig. 1.