Imagine an unmanned aerial vehicle (UAV) flying at high speed through a cluttered environment in the presence of wind gusts, a legged robot traversing rough terrain, or a mobile robot grasping and manipulating previously unlocalized objects in the environment. These applications demand that the robot move through (and in certain cases interact with) its environment with a very high degree of agility while still being in close proximity to obstacles. Such systems today lack guarantees on their safety and can fail dramatically in the face of uncertainty in their environment and dynamics.
The tasks mentioned above are characterized by three main challenges. First, the dynamics of the system are nonlinear, underactuated, and subject to constraints on the input (e.g. torque limits). Second, there is a significant amount of uncertainty in the dynamics of the system due to disturbances and modeling error. Finally, the geometry of the environment that the robot is operating in is unknown until runtime, thus forcing the robot to plan in real-time.
In this paper, we address these challenges by combining approaches from motion planning, feedback control, and tools from Lyapunov theory and convex optimization in order to perform robust real-time motion planning in the face of uncertainty. In particular, in an offline computation stage, we first design a finite library of open loop trajectories. For each trajectory in this library, we use tools from convex optimization (sums-of-squares (SOS) programming in particular) to design a controller that explicitly attempts to minimize the size of the worst case reachable set of the system given a description of the uncertainty in the dynamics and bounded external disturbances. This control design procedure yields an outer approximation of the reachable set, which can be visualized as a “funnel” around the trajectory, that the closed-loop system is guaranteed to remain within. A cartoon of such a funnel is shown in Figure 1(b). Finally, we provide a way of sequentially composing these robust motion plans at runtime in order to operate in a provably safe manner in previously unseen environments.
One of the most important advantages that our approach affords us is the ability to choose between the motion primitives in our library in a way that takes into account the dynamic effects of uncertainty. Imagine a UAV flying through a forest that has to choose between two motion primitives: a highly dynamic roll maneuver that avoids the trees in front of the UAV by a large margin or a maneuver that involves flying straight while avoiding the trees only by a small distance. An approach that neglects the effects of disturbances and uncertainty may prefer the former maneuver since it avoids the trees by a large margin and is therefore “safer”. However, a more careful consideration of the two maneuvers could lead to a different conclusion: the dynamic roll maneuver is far more susceptible to wind gusts and perturbations to the initial state than the second one. Thus, it may in fact be more robust to execute the second motion primitive. Further, it may be possible that neither maneuver is guaranteed to succeed and it is safer to abort the mission and simply transition to a hover mode. Our approach allows robots to make these critical decisions, which are essential if robots are to move out of labs and operate in real-world environments.
We demonstrate and validate our approach using thorough simulation experiments of ground vehicle and quadrotor models navigating through cluttered environments, along with extensive hardware experiments on a small fixed-wing airplane avoiding obstacles at high speed (12 miles per hour) in an indoor motion capture environment. To the best of our knowledge, the resulting demonstrations constitute one of the first examples of provably safe and robust control for robotic systems with complex nonlinear dynamics that need to plan in real-time in cluttered environments.
The outline of the paper is as follows. Section 2 discusses prior work; Section 3 provides a brief background on semidefinite and sums-of-squares (SOS) programming, which are used heavily throughout the paper; Section 4 shows how to use SOS programming to compute funnels; Section 5 introduces the notion of a funnel library; Section 6 describes our algorithm for using funnel libraries for real-time robust planning in environments that have not been seen by the robot before; Section 7 presents extensive simulation results on a ground vehicle model and compares our approach with an approach based on trajectory libraries; Section 7 also considers a quadrotor model and shows how one can use our approach to guarantee collision-free flight in certain environments; Section 8 presents hardware experiments on a small fixed-wing airplane in order to demonstrate and validate our approach; Section 9 concludes the paper with a discussion of challenges and open problems.
2 Relevant Work
2.1 Motion Planning
Motion planning has been the subject of significant research in the last few decades and has enjoyed a large degree of success in recent years. Planning algorithms like the Rapidly-exploring Random Tree (RRT) [Kuffner and Lavalle, 2000] (along with variants that attempt to find optimal motion plans [Karaman and Frazzoli, 2011] [Kobilarov, 2012]) and related trajectory library approaches [Liu and Atkeson, 2009] [Frazzoli et al., 2005] [Stolle and Atkeson, 2006] can handle large state-space dimensions and complex differential constraints. These algorithms have been successfully demonstrated on a wide variety of hardware platforms [Kuwata et al., 2008] [Shkolnik, 2010] [Sermanet et al., 2008] [Satzinger et al., 2016]. However, an important limitation is their inability to explicitly reason about uncertainty and feedback. Modeling errors, state uncertainty and disturbances can lead to failure if the system deviates from the planned nominal trajectories.
The motion planning aspect of our approach draws inspiration from the vast body of work that is focused on computing motion primitives in the form of trajectory libraries. For example, trajectory libraries have been used in diverse applications such as humanoid balance control [Liu and Atkeson, 2009], autonomous ground vehicle navigation [Sermanet et al., 2008], grasping [Berenson et al., 2007] [Dey et al., 2011], and UAV navigation [Dey et al., 2014] [Barry, 2016]. The Maneuver Automaton [Frazzoli et al., 2005] attempts to capture the formal properties of trajectory libraries as a hybrid automaton, thus providing a unifying theoretical framework. Maneuver Automata have also been used for real-time motion planning with static and dynamic obstacles [Frazzoli et al., 2002]. Further theoretical investigations have focused on the offline generation of diverse but sparse trajectories that ensure the robot’s ability to perform the necessary tasks online in an efficient manner [Green and Kelly, 2007]. More recently, tools from sub-modular sequence optimization have been leveraged in the optimization of the sequence and content of trajectories evaluated online [Dey et al., 2011, Dey, 2015]. Prior work has also been aimed at learning maneuver libraries in the form of “skill trees” from expert demonstrations [Konidaris et al., 2011].
Robust motion planning has also been a very active area of research in the robotics community. Early work [Brooks, 1982] [Lozano-Perez et al., 1984] [Jacobs and Canny, 1990] [Latombe et al., 1991] focused on settings where the dynamics of the system are not dominant and one can treat the system as a kinematic one. The problem is then one of planning paths through configuration space that are robust to uncertainty in the motion of the robot and geometry of the workspace. Our work with funnels takes inspiration from the early work on fine-motion planning, where the notions of funnels [Mason, 1985] and preimage backchaining [Lozano-Perez et al., 1984] (also known as goal regression or sequential composition) were first introduced. The theme of robust kinematic motion planning has persisted in recent work [Missiuro and Roy, 2006] [Guibas et al., 2009] [Malone et al., 2013], which deals with uncertainty in the geometry of the environment and obstacles.
2.2 Planning under Uncertainty
In settings where the dynamics of the system must be taken into account (e.g., for underactuated systems), the work on planning under uncertainty attempts to address the challenges associated with uncertainty in the dynamics, state, and geometry of the environment. In particular, chance-constrained programming [Charnes and Cooper, 1959] provides an elegant mathematical framework for reasoning about stochastic uncertainty in the dynamics, initial conditions, and obstacle geometry [Blackmore et al., 2006] [Oldewurtel et al., 2008] [Toit and Burdick, 2011] and allows one to generate motion plans with bounds on the probability of collision with obstacles. In settings where the state of the system is partially observable and there is significant uncertainty in the observations, one can extend this framework to plan in the belief space of the system (i.e., the space of distributions over states) [Vitus and Tomlin, 2011] [Vitus and Tomlin, 2012]. While these approaches allow one to explicitly reason about uncertainty in the system, they are typically restricted to handling linear dynamical systems with Gaussian uncertainty. This is due to the computational burden of solving chance-constrained problems for nonlinear and non-Gaussian systems.
Other approaches to planning in belief space also approximate the belief state as a Gaussian distribution over state space for computational efficiency (see [Platt et al., 2012] for an exception), and hence the true belief state is not tracked. Thus, in general, one does not have robustness guarantees. The approach we take in this work is to assume that disturbances/uncertainty are bounded and provide explicit bounds on the reachable set to facilitate safe operation of the nonlinear system.
More generally, the rich literature on Partially Observable Markov Decision Processes (POMDPs) [Kaelbling et al., 1998] is also relevant here. POMDPs present an elegant mathematical framework for reasoning about uncertainty in state and dynamics. However, we note that our focus in this work is on dynamical systems with continuous state and action spaces, whereas the POMDP literature typically focuses on discretized state/action spaces.
2.3 Reachable Sets
Reachability analysis for linear and nonlinear control systems has a long history in the controls community. For linear systems subject to bounded disturbances, there exist a number of techniques for efficiently computing (approximations of) backwards and forwards reachable sets [Kurzhanski and Varaiya, 2000] [Girard, 2005] [Yazarel and Pappas, 2004]. One can apply techniques from linear reachability analysis to conservatively approximate reachable sets of nonlinear systems by treating nonlinearities as bounded disturbances [Althoff et al., 2008]. This idea has been used in [Althoff and Dolan, 2014] for performing online safety verification for ground vehicles. A similar idea was used in [Althoff et al., 2015] to perform online safety verification for UAVs and to decide when the UAV should switch to an emergency maneuver (a “loiter circle”). In this paper we will also compute outer approximations of reachable sets (which we refer to as “funnels”). However, the approach we present here is not based on a linearization of the system and thus has the potential to be less conservative for highly nonlinear systems. Further, the scope of our work extends beyond verification; the emphasis here is on safe real-time planning with funnels.
Approximations of reachable sets for nonlinear systems can be computed via Hamilton-Jacobi-Bellman (HJB) differential game formulations [Mitchell et al., 2005]. This method was used in [Gillula et al., 2010] for designing motion primitives for making a quadrotor perform an autonomous backflip in a 2D plane. While this approach handles unsafe sets that the system is not allowed to enter, it is assumed that these sets are specified a priori. In this paper, we are concerned with scenarios in which unsafe sets (such as obstacles) are not specified until runtime and must thus be reasoned about online. Further, techniques for computing reachable sets based on the HJB equation have historically suffered from the curse of dimensionality since they rely on discretizations of the state space of the system. Hence, unless the system under consideration has special structure (e.g., decoupled systems [Chen and Tomlin, 2015]), these methods have difficulty scaling up beyond approximately 5-6 dimensional state spaces.
An approach that is closely related to our work is the work presented in [Ny and Pappas, 2012]. The authors propose a randomized planning algorithm in the spirit of RRTs that explicitly reasons about disturbances and uncertainty. Specifications of input to output stability with respect to disturbances provide a parameterization of “tubes” (analogous to our “funnels”) that can be composed together to generate motion plans that are collision-free. The factors that distinguish the approach we present in this paper from the one proposed in [Ny and Pappas, 2012] are our focus on the real-time aspect of the problem and use of sums-of-squares programming as a way of computing reachable sets. In [Ny and Pappas, 2012], the focus is on generating safe motion plans when the obstacle positions are known a priori. Further, we provide a general technique for computing and explicitly minimizing the size of tubes.
Another approach that is closely related to ours is Model Predictive Control with Tubes [Mayne et al., 2005]. The idea is to solve the optimal control problem online with guaranteed “tubes” that the trajectories stay in. A closely related idea is that of “flow-tubes”, which have been used for high-level planning for autonomous systems [Li and Williams, 2008]. However, these methods are generally limited to linear systems.
2.4 Lyapunov Theory and SOS programming
A critical component of the work presented here is the computation of “funnels” for nonlinear systems via Lyapunov functions. The metaphor for thinking about Lyapunov functions as defining funnels was introduced to the robotics community in [Burridge et al., 1999], where funnels were sequentially composed in order to produce dynamic behaviors in a robot. However, computational tools for automatically searching for Lyapunov functions were lacking until very recently. In recent years, sums-of-squares (SOS) programming has emerged as a way of checking the Lyapunov function conditions associated with each funnel [Parrilo, 2000]. The technique relies on the ability to check nonnegativity of multivariate polynomials by expressing them as a sum of squares of polynomials. This can be written as a semidefinite optimization program and is amenable to efficient computational algorithms such as interior point methods [Parrilo, 2000]. Assuming polynomial dynamics, one can check that a polynomial Lyapunov candidate, $V(x)$, satisfies $V(x) > 0$ and $\dot{V}(x) < 0$ in some region $\mathcal{B}_r$. Importantly, the same idea can be used for computing funnels along time-indexed trajectories of a system [Tedrake et al., 2010] [Tobenkin et al., 2011]. In this paper, we will use a similar approach to synthesize feedback controllers that explicitly seek to minimize the effect of disturbances on the system by minimizing the size of the funnel computed along a trajectory. Thus, we are guaranteed that if the system starts off in the set of given initial conditions, it will remain in the computed “funnel” even if the model of the dynamics is uncertain and the system is subjected to bounded disturbances.
The ability to compute funnels using SOS programming was leveraged by the LQR-Trees algorithm [Tedrake et al., 2010] for feedback motion planning for nonlinear systems. The algorithm works by creating a tree of locally stabilizing controllers which can take any initial condition in some bounded region in state space to the desired goal. However, LQR-Trees lack the ability to handle scenarios in which the task and environment are unknown till runtime: the offline precomputation of the tree does not take into account potential runtime constraints like obstacles, and an online implementation of the algorithm is computationally infeasible.
The SOS programming approach has also been used to guarantee obstacle avoidance conditions for nonlinear systems by computing barrier certificates [Prajna, 2006] [Barry et al., 2012]. Barrier functions are similar to Lyapunov functions in spirit, but are used to guarantee that trajectories starting in some given set of initial conditions will never enter an “unsafe” region containing obstacles. This approach, however, is limited to settings where the locations and geometry of obstacles are known beforehand since the barrier certificate one computes depends on this data and computing barrier certificates in real-time using SOS programming is not computationally feasible at the present time.
3 Background
In this section we provide a brief background on the key computational tools that will be employed throughout this paper.
3.1 Semidefinite Programming (SDP)
Semidefinite programs (SDPs) form an important class of convex optimization problems. They are optimization problems over the space of symmetric positive semidefinite (psd) matrices. Recall that an $n \times n$ symmetric matrix $Q$ is positive semidefinite if $x^T Q x \geq 0$ for all $x \in \mathbb{R}^n$. Denoting the set of $n \times n$ symmetric matrices as $\mathcal{S}^n$, a SDP in standard form is written as:
$$\begin{aligned} \min_{X \in \mathcal{S}^n} \quad & \langle C, X \rangle \\ \text{s.t.} \quad & \langle A_i, X \rangle = b_i, \quad \forall i \in \{1, \dots, m\}, \\ & X \succeq 0, \end{aligned} \tag{1}$$
where $C, A_i \in \mathcal{S}^n$ and $\langle X, Y \rangle := \mathrm{Tr}(X^T Y) = \sum_{i,j} X_{ij} Y_{ij}$. In other words, a SDP involves minimizing a cost function that is linear in the elements of the decision matrix $X$ subject to linear and positive semidefiniteness constraints on $X$.
Semidefinite programming includes Linear Programming (LP), Quadratic Programming (QP) and Second-Order Cone Programming (SOCP) as special cases. As in these other cases, SDPs are amenable to efficient numerical solution via interior point methods. The interested reader may wish to consult [Vandenberghe and Boyd, 1996] and [Blekherman et al., 2013, Section 2] for a more thorough introduction to SDPs.
3.2 Sums-of-Squares (SOS) Programming
An important application of SDPs is to check nonnegativity of polynomials. The decision problem associated with checking polynomial nonnegativity is NP-hard in general [Parrilo, 2000]. However, the problem of determining whether a polynomial is a sum-of-squares (SOS), which is a sufficient condition for nonnegativity, is amenable to efficient computation. A polynomial $p$ in indeterminates $x_1, x_2, \dots, x_n$ is SOS if it can be written as $p(x) = \sum_{i=1}^{m} q_i^2(x)$ for a set of polynomials $\{q_i\}_{i=1}^{m}$. (Footnote 1: Throughout this paper, the variables that a polynomial depends on will be referred to as “indeterminates”. This is to distinguish these variables from decision variables in our optimization problems, which will typically be the coefficients of the polynomial.) This condition is equivalent to the existence of a positive semidefinite (psd) matrix $Q$ that satisfies:
$$p(x) = v(x)^T Q \, v(x), \quad \forall x \in \mathbb{R}^n, \tag{2}$$
where $v(x)$ is the vector of all monomials with degree less than or equal to half the degree of $p$ [Parrilo, 2000]. Note that the equality constraint (2) imposes linear constraints on the elements of the matrix $Q$ that come from matching coefficients of the polynomials on the left and right hand sides. Thus, semidefinite programming can be used to certify that a polynomial is a sum of squares. Indeed, by allowing the coefficients of the polynomial $p$ to be decision variables, we can solve optimization problems over the space of SOS polynomials of some fixed degree. Such optimization problems are referred to as sums-of-squares (SOS) programs. The interested reader is referred to [Parrilo, 2000] and [Blekherman et al., 2013, Sections 3,4] for a more thorough introduction to SOS programming.
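As a small illustration of the Gram-matrix characterization (2), the following sketch checks a candidate certificate for one specific polynomial. The polynomial, the monomial vector, the matrix `Q`, and the function names are illustrative choices and do not come from the paper; an actual SOS program would search for `Q` with an SDP solver rather than verify a given one.

```python
from fractions import Fraction

# Illustrative polynomial (an assumption, not from the paper):
# p(x, y) = 2x^4 + 2x^3 y - x^2 y^2 + 5y^4.
def p(x, y):
    return 2 * x**4 + 2 * x**3 * y - x**2 * y**2 + 5 * y**4

# Monomial vector v = [x^2, y^2, x*y]; degree-2 monomials suffice here
# because p is homogeneous of degree 4.
def v(x, y):
    return [x * x, y * y, x * y]

# Candidate Gram matrix. It is psd because p equals half the sum of the
# two squares evaluated in sos_form below.
Q = [[2, -3, 1],
     [-3, 5, 0],
     [1, 0, 5]]

def gram_form(x, y):
    m = v(x, y)
    return sum(m[i] * Q[i][j] * m[j] for i in range(3) for j in range(3))

def sos_form(x, y):
    q1 = 2 * x * x - 3 * y * y + x * y
    q2 = y * y + 3 * x * y
    return Fraction(q1 * q1 + q2 * q2, 2)

# Exact integer arithmetic on a 7 x 7 grid; agreement on such a grid is
# enough to identify bivariate polynomials of degree at most 4.
grid = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
gram_matches = all(gram_form(x, y) == p(x, y) for x, y in grid)
sos_matches = all(sos_form(x, y) == p(x, y) for x, y in grid)
print(gram_matches, sos_matches)
```

Matching `gram_form` against `p` corresponds to the linear coefficient-matching constraints in (2), while the explicit squares in `sos_form` play the role of the polynomials $q_i$.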
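In addition to being able to prove global nonnegativity of polynomials, the SOS programming approach can also be used to demonstrate nonnegativity of polynomials on basic semialgebraic sets (i.e., sets described by a finite number of polynomial inequalities and equalities). Suppose we are given a set $\mathcal{B}$:
$$\mathcal{B} = \{x \in \mathbb{R}^n \mid g_{eq,i}(x) = 0, \; g_{ineq,j}(x) \geq 0\}, \tag{3}$$
where $g_{eq,i}$ and $g_{ineq,j}$ are polynomials for $i \in \{1, \dots, N_{eq}\}$, $j \in \{1, \dots, N_{ineq}\}$. We are interested in showing that a polynomial $p$ is nonnegative on the set $\mathcal{B}$:
$$x \in \mathcal{B} \implies p(x) \geq 0. \tag{4}$$
We can write the following SOS constraints in order to impose (4):
$$q(x) = p(x) - \sum_{i=1}^{N_{eq}} L_{eq,i}(x) \, g_{eq,i}(x) - \sum_{j=1}^{N_{ineq}} L_{ineq,j}(x) \, g_{ineq,j}(x) \ \text{ is SOS}, \tag{5}$$
$$L_{ineq,j}(x) \ \text{ is SOS}, \quad \forall j \in \{1, \dots, N_{ineq}\}. \tag{6}$$
Here, the polynomials $L_{eq,i}$ and $L_{ineq,j}$ are “multiplier” polynomials analogous to Lagrange multipliers in constrained optimization. In order to see that (5) and (6) are sufficient conditions for (4), note that when a point $x$ satisfies $g_{eq,i}(x) = 0$ and $g_{ineq,j}(x) \geq 0$ for all $i, j$ (i.e., when $x \in \mathcal{B}$) then the term $-\sum_i L_{eq,i}(x) g_{eq,i}(x) - \sum_j L_{ineq,j}(x) g_{ineq,j}(x)$ is non-positive. Hence, for $q(x)$ to be nonnegative (which must be the case since $q$ is SOS), $p(x)$ must be nonnegative. Thus, we have the desired implication in (4). This process for using multipliers to impose nonnegativity constraints on sets is known as the generalized S-procedure [Parrilo, 2000] and will be used extensively in Section 4 for computing funnels.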
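To make the role of the multipliers concrete, consider the following one-dimensional worked example. The set, the polynomial, and the multiplier are illustrative choices that do not appear in the paper; in practice the multiplier would be found by the SOS program rather than by hand.

```latex
% Illustrative set and polynomial (assumptions, not from the paper):
%   B = { x \in R : g(x) = 1 - x^2 >= 0 } = [-1, 1],   p(x) = x + 1.
% Choose the constant multiplier L(x) = 1/2, which is trivially SOS.
\begin{align*}
q(x) &= p(x) - L(x)\, g(x) \\
     &= (x + 1) - \tfrac{1}{2}\,(1 - x^2) \\
     &= \tfrac{1}{2}\,x^2 + x + \tfrac{1}{2} \\
     &= \tfrac{1}{2}\,(x + 1)^2 .
\end{align*}
% q is a sum of squares and L is SOS, so for every x with g(x) >= 0:
%   p(x) = q(x) + L(x) g(x) >= 0,
% which certifies that p is nonnegative on [-1, 1].
```

Note that $p$ itself is not globally nonnegative (it is negative for $x < -1$), so the certificate genuinely depends on the description of the set through the multiplier term.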
4 Computing Funnels
In this section we describe how the tools from Section 3 can be used to compute outer approximations of reachable sets (“funnels”) around trajectories of a nonlinear system. The approach in Section 4.1 builds on the work presented in [Tobenkin et al., 2011, Tedrake et al., 2010] while Sections 4.3.1 and 4.3.2 are based on [Majumdar and Tedrake, 2012] and [Majumdar et al., 2013] respectively. In contrast to this prior work however, we consider the problem of computing outer approximations of forwards reachable sets as opposed to inner approximations of backwards reachable sets. This leads to a few subtle differences in the cost functions of our optimization problems.
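Consider the following dynamical system:
$$\dot{x} = f(x(t), u(t)), \tag{7}$$
where $x(t) \in \mathbb{R}^n$ is the state of the system at time $t$ and $u(t) \in \mathbb{R}^m$ is the control input. Let $x_0 : [0,T] \to \mathbb{R}^n$ be the nominal trajectory that we would like the system to follow and $u_0 : [0,T] \to \mathbb{R}^m$ be the nominal open-loop control input. Defining new coordinates $\bar{x} = x - x_0(t)$ and $\bar{u} = u - u_0(t)$, we can rewrite the dynamics (7) in these variables as:
$$\dot{\bar{x}} = \dot{x} - \dot{x}_0 = f(x_0(t) + \bar{x}(t), \, u_0(t) + \bar{u}(t)) - \dot{x}_0(t). \tag{8}$$
We will first consider the problem of computing funnels for a closed-loop system subject to no uncertainty. To this end, we assume that we are given a feedback controller $\bar{u}_f(t, \bar{x})$ that corrects for deviations around the nominal trajectory (we will consider the problem of designing feedback controllers later in this section). We can then write the closed-loop dynamics of the system as:
$$\dot{\bar{x}} = f_{cl}(t, \bar{x}(t)). \tag{9}$$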
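Given a set of initial conditions $\mathcal{X}_0 \subset \mathbb{R}^n$ with $\bar{x}(0) \in \mathcal{X}_0$, our goal is to find a tight outer approximation of the set of states the system may evolve to at time $t \in [0,T]$. In particular, we are concerned with finding sets $F(t) \subset \mathbb{R}^n$ such that:
$$\bar{x}(0) \in \mathcal{X}_0 \implies \bar{x}(t) \in F(t), \quad \forall t \in [0,T]. \tag{10}$$
A funnel associated with a closed-loop dynamical system $\dot{\bar{x}} = f_{cl}(t, \bar{x}(t))$ is a map $F : [0,T] \to \mathcal{P}(\mathbb{R}^n)$ from the time-interval $[0,T]$ to the power set (i.e., the set of subsets) of $\mathbb{R}^n$ such that the sets $F(t)$ satisfy the condition (10) above.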
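Thus, with each time $t \in [0,T]$, the funnel associates a set $F(t) \subset \mathbb{R}^n$. We will parameterize the sets $F(t)$ as sub-level sets of nonnegative time-varying functions $V : [0,T] \times \mathbb{R}^n \to \mathbb{R}^+$:
$$F(t) = \{\bar{x}(t) \mid V(t, \bar{x}(t)) \leq \rho(t)\}. \tag{11}$$
Letting $\mathcal{X}_0 \subset F(0)$, the following constraint is a sufficient condition for (10):
$$V(t, \bar{x}) = \rho(t) \implies \dot{V}(t, \bar{x}) < \dot{\rho}(t), \quad \forall t \in [0,T]. \tag{12}$$
Here, $\dot{V}$ is computed as:
$$\dot{V}(t, \bar{x}) = \frac{\partial V(t, \bar{x})}{\partial \bar{x}} f_{cl}(t, \bar{x}) + \frac{\partial V(t, \bar{x})}{\partial t}. \tag{13}$$
Intuitively, the constraint (12) says that on the boundary of the funnel (i.e., when $V(t, \bar{x}) = \rho(t)$), the function $V$ grows slower than $\rho$. Hence, states on the boundary of the funnel remain within the funnel. This intuition is formalized in [Tedrake et al., 2010, Tobenkin et al., 2011].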
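The boundary condition can be checked numerically on a toy example. The sketch below assumes a scalar closed-loop system $\dot{\bar{x}} = -\bar{x}$ with the candidate funnel $V(t,\bar{x}) = \bar{x}^2$, $\rho(t) = e^{-t}$; these choices, and all function names, are illustrative and are not taken from the paper. It samples the funnel boundary to test the condition $\dot{V} < \dot{\rho}$ and then confirms with forward-Euler rollouts that trajectories starting in $F(0)$ stay inside the funnel.

```python
import math

# Toy closed-loop dynamics (an assumption for illustration): xbar_dot = -xbar.
def f_cl(t, xbar):
    return -xbar

def V(t, xbar):        # candidate funnel function V(t, xbar) = xbar^2
    return xbar * xbar

def rho(t):            # candidate level rho(t) = exp(-t)
    return math.exp(-t)

def V_dot(t, xbar):    # dV/dxbar * f_cl + dV/dt, where dV/dt = 0 for this V
    return 2.0 * xbar * f_cl(t, xbar)

def rho_dot(t):
    return -math.exp(-t)

T, dt = 2.0, 1e-3
steps = int(round(T / dt))

# Boundary condition: wherever V = rho (xbar = +/- sqrt(rho)), require V_dot < rho_dot.
boundary_ok = all(
    V_dot(k * dt, s * math.sqrt(rho(k * dt))) < rho_dot(k * dt)
    for k in range(steps + 1)
    for s in (-1.0, 1.0)
)

# Forward-Euler rollouts from initial conditions in X_0 = [-1, 1] = F(0).
stays_inside = True
for x0 in (-1.0, -0.5, 0.0, 0.3, 1.0):
    xbar = x0
    for k in range(steps + 1):
        t = k * dt
        if V(t, xbar) > rho(t) + 1e-12:
            stays_inside = False
        xbar += dt * f_cl(t, xbar)

print(boundary_ok, stays_inside)
```

Sampling the boundary is only a sanity check, not a certificate; the SOS conditions developed in Section 4.1 replace this sampling with an algebraic proof that holds for all boundary points.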
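While any function $V$ that satisfies (12) provides us with a valid funnel, we are interested in finding tight outer approximations of the reachable set. A natural cost function for encouraging tightness is the volume of the sets $F(t)$. Combining this cost function with our constraints, we obtain the following optimization problem:
$$\begin{aligned} \inf_{V, \rho} \quad & \int_0^T \mathrm{vol}(F(t)) \, dt \\ \text{s.t.} \quad & V(t, \bar{x}) = \rho(t) \implies \dot{V}(t, \bar{x}) < \dot{\rho}(t), \quad \forall t \in [0,T], \\ & \mathcal{X}_0 \subset F(0). \end{aligned} \tag{14}$$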
Note that our formulation for computing funnels differs in an important respect from a number of prior approaches on feedback motion planning and sequential composition using funnels [Tedrake et al., 2010, Burridge et al., 1999, Conner et al., 2011]. In particular, the formulation presented here considers a given set of initial conditions and attempts to bound the effects of uncertainty going forwards in time. This leads to the volume of funnels being minimized rather than maximized. This is in contrast to [Tedrake et al., 2010, Burridge et al., 1999, Conner et al., 2011], which consider a fixed goal set and seek to maximize the set of initial conditions that are stabilized to this set (which leads to a formulation where the volume of funnels is maximized). The minimization strategy here is better suited to the use of funnels for real-time planning in previously unseen environments since one is interested in bounding the effect of uncertainties on the system in order to prevent collisions with obstacles.
4.1 Numerical implementation using SOS programming
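Since the optimization problem (14) involves searching over spaces of functions, it is infinite dimensional and hence not directly amenable to numerical computation. However, we can use the SOS programming approach described in Section 3 to obtain finite-dimensional optimization problems in the form of semidefinite programs (SDPs). We first concentrate on implementing the constraints in (14). We will assume that the initial condition set $\mathcal{X}_0$ is a semi-algebraic set (i.e., described in terms of polynomial inequalities):
$$\mathcal{X}_0 = \{\bar{x} \in \mathbb{R}^n \mid g_{0,i}(\bar{x}) \leq 0, \ \forall i = 1, \dots, N_0\}. \tag{15}$$
Then the constraints in (14) can be written as:
$$V(t, \bar{x}) = \rho(t) \implies \dot{V}(t, \bar{x}) < \dot{\rho}(t), \quad \forall t \in [0,T], \tag{16}$$
$$g_{0,i}(\bar{x}) \leq 0, \ \forall i \in \{1, \dots, N_0\} \implies V(0, \bar{x}) \leq \rho(0). \tag{17}$$
If we restrict ourselves to polynomial dynamics and polynomial functions $V$ and $\rho$, these constraints are precisely in the form of (4) in Section 3.2. We can thus apply the procedure described in Section 3.2 and arrive at the following sufficient conditions for (16) and (17):
$$\dot{\rho}(t) - \dot{V}(t, \bar{x}) - L(t, \bar{x}) \left[ V(t, \bar{x}) - \rho(t) \right] - L_t(t, \bar{x}) \left[ t(T - t) \right] \ \text{ is SOS}, \tag{18}$$
$$\rho(0) - V(0, \bar{x}) + \sum_{i=1}^{N_0} L_{0,i}(\bar{x}) \, g_{0,i}(\bar{x}) \ \text{ is SOS}, \tag{19}$$
$$L_t(t, \bar{x}), \ L_{0,i}(\bar{x}) \ \text{ are SOS}, \quad \forall i \in \{1, \dots, N_0\}. \tag{20}$$
As in Section 3.2, the polynomials $L$, $L_t$ and $L_{0,i}$ are “multiplier” polynomials whose coefficients are decision variables.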
Next, we focus on approximating the cost function in (14) using semidefinite programming. This can be achieved by sampling in time and replacing the integral with the finite sum $\sum_{k=1}^{N} \mathrm{vol}(F(t_k))$. In the special case where the function $V$ is quadratic in $\bar{x}$:
$$V(t_k, \bar{x}) = \bar{x}^T S_k \bar{x}, \quad S_k \succeq 0, \tag{21}$$
the set $F(t_k)$ is an ellipsoid and we can use semidefinite programming (SDP) to directly minimize the volume by maximizing the determinant of $S_k$ (recall that the volume of the ellipsoid $F(t_k)$ is a monotonically decreasing function of the determinant of $S_k$). Note that while the problem of maximizing the determinant of a psd matrix is not directly a problem of the form (1), it can be transformed into such a form [Ben-Tal and Nemirovski, 2001, Section 3]. Further note that the fact that our cost function can be handled directly in the SDP framework is in distinction to the approaches for computing inner approximations of backwards reachable sets [Tedrake et al., 2010] [Tobenkin et al., 2011] [Majumdar et al., 2013]. This is because the logarithm of the determinant is a concave function of a positive definite matrix: maximizing the determinant can be posed as a convex problem, whereas minimizing it (as required when maximizing volume) cannot. Hence, in the previous work, the authors used heuristics for maximizing volume.
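The relationship between the determinant of $S_k$ and the size of the ellipsoid can be made concrete in two dimensions, where the area of $\{\bar{x} \mid \bar{x}^T S \bar{x} \leq 1\}$ is $\pi / \sqrt{\det S}$. The sketch below is only an illustration of this standard formula; the matrices and helper names are hypothetical and the level $\rho(t_k)$ is taken to be 1.

```python
import math

def det2(S):
    """Determinant of a 2x2 matrix given as nested lists."""
    return S[0][0] * S[1][1] - S[0][1] * S[1][0]

def ellipse_area(S):
    """Area of {x : x^T S x <= 1} for a 2x2 symmetric positive definite S."""
    d = det2(S)
    if S[0][0] <= 0 or d <= 0:
        raise ValueError("S must be positive definite")
    return math.pi / math.sqrt(d)

S_small_det = [[1.0, 0.0], [0.0, 1.0]]   # det = 1 (unit disc)
S_large_det = [[4.0, 1.0], [1.0, 2.0]]   # det = 7
areas = [ellipse_area(S_small_det), ellipse_area(S_large_det)]
print(areas)
```

The matrix with the larger determinant yields the smaller ellipse, which is why maximizing the determinant shrinks the funnel cross-section.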
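In the more general case, we can minimize an upper bound on the cost function $\sum_{k=1}^{N} \mathrm{vol}(F(t_k))$ by introducing ellipsoids $\mathcal{E}(t_k)$:
$$\mathcal{E}(t_k) = \{\bar{x} \in \mathbb{R}^n \mid \bar{x}^T S_k \bar{x} \leq 1, \ S_k \succeq 0\} \tag{22}$$
such that $F(t_k) \subset \mathcal{E}(t_k)$ and minimizing $\sum_{k=1}^{N} \mathrm{vol}(\mathcal{E}(t_k))$. The containment constraint can be equivalently expressed as the constraint:
$$V(t_k, \bar{x}) \leq \rho(t_k) \implies \bar{x}^T S_k \bar{x} \leq 1, \tag{23}$$
and can thus be imposed using SOS constraints:
$$1 - \bar{x}^T S_k \bar{x} - L_{\mathcal{E},k}(\bar{x}) \left[ \rho(t_k) - V(t_k, \bar{x}) \right] \ \text{ is SOS}, \tag{24}$$
$$L_{\mathcal{E},k}(\bar{x}) \ \text{ is SOS}. \tag{25}$$
Combining our cost function with the constraints, we obtain the following optimization problem:
$$\begin{aligned} \inf_{V, \rho, L, L_t, L_{0,i}, S_k, L_{\mathcal{E},k}} \quad & \sum_{k=1}^{N} \mathrm{vol}(\mathcal{E}(t_k)) \\ \text{s.t.} \quad & \dot{\rho}(t) - \dot{V}(t, \bar{x}) - L(t, \bar{x}) \left[ V(t, \bar{x}) - \rho(t) \right] - L_t(t, \bar{x}) \left[ t(T - t) \right] \ \text{ is SOS}, \qquad (27) \\ & \rho(0) - V(0, \bar{x}) + \sum_{i=1}^{N_0} L_{0,i}(\bar{x}) \, g_{0,i}(\bar{x}) \ \text{ is SOS}, \\ & 1 - \bar{x}^T S_k \bar{x} - L_{\mathcal{E},k}(\bar{x}) \left[ \rho(t_k) - V(t_k, \bar{x}) \right] \ \text{ is SOS}, \quad \forall k \in \{1, \dots, N\}, \\ & S_k \succeq 0, \quad \forall k \in \{1, \dots, N\}, \\ & L_t(t, \bar{x}), \ L_{0,i}(\bar{x}), \ L_{\mathcal{E},k}(\bar{x}) \ \text{ are SOS}, \quad \forall i \in \{1, \dots, N_0\}, \ \forall k \in \{1, \dots, N\}. \end{aligned} \tag{26}$$
While this optimization problem is finite dimensional, it is non-convex in general since several of the constraints are bilinear in the decision variables (e.g., the coefficients of the polynomials $L$ and $V$ are multiplied together in the first constraint). To apply SOS programming, we require the constraints to be linear in the coefficients of the polynomials we are optimizing. However, note that when $V$ and $\rho$ are fixed, the constraints are linear in the other decision variables. Similarly, when the multipliers $L$ and $L_{\mathcal{E},k}$ are fixed, the constraints are linear in the remaining decision variables. Thus, we can efficiently perform this optimization by alternating between the two sets of decision variables $(L, L_t, L_{0,i}, L_{\mathcal{E},k}, S_k)$ and $(V, \rho, L_t, L_{0,i}, S_k)$. In each step of the alternation, we can optimize our cost function $\sum_{k=1}^{N} \mathrm{vol}(\mathcal{E}(t_k))$. These alternations are summarized in Algorithm 1. Note that Algorithm 1 requires an initialization for $V$ and $\rho$. We will discuss how to obtain these in Section 4.4.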
Remark 1. It is easy to see that the sequence of optimal values produced by Algorithm 1 converges (though not necessarily to an optimal solution). Each iteration of the alternations is guaranteed to achieve a cost that is at least as good as that of the previous iteration (since the solution from the previous iteration remains feasible). Hence, the optimal values across iterations form a monotonically non-increasing sequence. Combined with the fact that the cost function is bounded below by 0, we conclude that this sequence converges and hence that Algorithm 1 terminates.
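The monotonicity argument above can be illustrated on a toy problem with the same structure: a coupling that is bilinear in two blocks of variables, each of which is easy to optimize when the other is held fixed. The sketch below is not the SOS alternation itself; the objective, the closed-form block updates, and the stopping tolerance are all illustrative assumptions standing in for the two SDP steps.

```python
def cost(a, b):
    # Toy objective with a bilinear coupling a*b (illustrative, not from the paper).
    return (a * b - 2.0) ** 2 + 0.1 * (a * a + b * b)

def best_given(other):
    # Exact minimizer over one variable with the other held fixed:
    # d/da [(a*b - 2)^2 + 0.1*a^2] = 0  =>  a = 2*b / (b^2 + 0.1).
    return 2.0 * other / (other * other + 0.1)

def alternate(a, b, max_iters=100, tol=1e-10):
    history = [cost(a, b)]
    for _ in range(max_iters):
        a = best_given(b)   # Step 1: fix b, optimize a
        b = best_given(a)   # Step 2: fix a, optimize b
        history.append(cost(a, b))
        if history[-2] - history[-1] < tol:   # stop once improvement stalls
            break
    return a, b, history

a_opt, b_opt, history = alternate(a=0.5, b=3.0)
print(len(history), history[-1])
```

Because each step solves its subproblem exactly starting from a feasible point, the recorded costs never increase, mirroring the reasoning in the remark; as in the remark, nothing here guarantees that the limit is a global optimum.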
4.2 Approximation via time-sampling
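As observed in [Tobenkin et al., 2011], in practice it is often the case that the nominal trajectory $x_0 : [0,T] \to \mathbb{R}^n$ is difficult to approximate with a low degree polynomial in time. This can lead to the constraint (27) in the problem (26) having a high degree polynomial dependence on $t$. Thus it is often useful to implement an approximation of the optimization problem (26) where the condition (16) is checked only at a finite number of sample points $t_k \in [0,T]$, $k \in \{1, \dots, N\}$, with $t_1 = 0$. We can use a piecewise linear parameterization of $\rho$ and can thus compute:
$$\dot{\rho}(t_k) = \frac{\rho(t_{k+1}) - \rho(t_k)}{t_{k+1} - t_k}. \tag{28}$$
Similarly we can parameterize the function $V$ by polynomials $V_k(\bar{x})$ at each time sample and compute:
$$\dot{V}_k(\bar{x}) = \frac{\partial V_k(\bar{x})}{\partial \bar{x}} f_{cl}(t_k, \bar{x}) + \frac{V_{k+1}(\bar{x}) - V_k(\bar{x})}{t_{k+1} - t_k}. \tag{29}$$
We can then write the following modified version of the problem (26):
$$\begin{aligned} \inf_{V_k, \rho, L_k, L_{0,i}, S_k, L_{\mathcal{E},k}} \quad & \sum_{k=1}^{N} \mathrm{vol}(\mathcal{E}(t_k)) \\ \text{s.t.} \quad & \dot{\rho}(t_k) - \dot{V}_k(\bar{x}) - L_k(\bar{x}) \left[ V_k(\bar{x}) - \rho(t_k) \right] \ \text{ is SOS}, \quad \forall k, \\ & \rho(t_1) - V_1(\bar{x}) + \sum_{i=1}^{N_0} L_{0,i}(\bar{x}) \, g_{0,i}(\bar{x}) \ \text{ is SOS}, \\ & 1 - \bar{x}^T S_k \bar{x} - L_{\mathcal{E},k}(\bar{x}) \left[ \rho(t_k) - V_k(\bar{x}) \right] \ \text{ is SOS}, \quad \forall k, \\ & S_k \succeq 0, \quad \forall k, \\ & L_{0,i}(\bar{x}), \ L_{\mathcal{E},k}(\bar{x}) \ \text{ are SOS}, \quad \forall i, \ \forall k. \end{aligned} \tag{30}$$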
This program does not have any algebraic dependence on the variable and can thus provide significant computational gains over (26). However, it does not provide an exact funnel certificate. One would hope that with a sufficiently fine sampling in time, one would recover exactness. Partial results in this direction are provided in [Tobenkin et al., 2011] along with numerical examples showing that the loss of accuracy from the sampling approximation can be quite small in practice.
The problem (30) is again bilinear in the decision variables. It is linear in the two sets of decision variables $(L_k, L_{0,i}, L_{\mathcal{E},k}, S_k)$ and $(V_k, \rho, L_{0,i}, S_k)$. Thus, Algorithm 1 can be applied directly to (30) with the minor modification that $V$ and $\rho$ are replaced by their time-sampled counterparts and the multipliers $L$ are replaced by the multipliers $L_k$.
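The finite-difference quantities used in the time-sampled formulation are straightforward to compute. The following sketch assumes a scalar state, quadratic samples $V_k(\bar{x}) = S_k \bar{x}^2$, and toy dynamics $\dot{\bar{x}} = -\bar{x}$; the sample times, levels, and helper names are hypothetical and chosen only to show how the slope of a piecewise-linear $\rho$ and the sampled $\dot{V}$ are formed and compared on the funnel boundary.

```python
def rho_dot_samples(ts, rhos):
    """Slope of the piecewise-linear rho on each interval [t_k, t_{k+1}]."""
    return [(rhos[k + 1] - rhos[k]) / (ts[k + 1] - ts[k])
            for k in range(len(ts) - 1)]

def V_dot_sample(k, xbar, ts, S, f_cl):
    """Sampled V_dot at time index k for scalar V_k(xbar) = S[k] * xbar^2."""
    dV_dx = 2.0 * S[k] * xbar
    dV_dt = (S[k + 1] - S[k]) * xbar * xbar / (ts[k + 1] - ts[k])
    return dV_dx * f_cl(ts[k], xbar) + dV_dt

def f_cl(t, xbar):          # toy closed-loop dynamics (assumption)
    return -xbar

ts = [0.0, 0.5, 1.0]        # sample times t_k
rhos = [1.0, 0.8, 0.7]      # rho(t_k)
S = [1.0, 1.0, 1.0]         # V_k(xbar) = S[k] * xbar^2

slopes = rho_dot_samples(ts, rhos)

# Check V_dot < rho_dot at both boundary points of each sampled funnel slice.
checks = []
for k in range(len(ts) - 1):
    xb = (rhos[k] / S[k]) ** 0.5
    checks.append(all(V_dot_sample(k, s * xb, ts, S, f_cl) < slopes[k]
                      for s in (-1.0, 1.0)))
print(slopes, checks)
```

As in the text, a check at finitely many sample times does not by itself certify the funnel between samples.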
4.3 Extensions to the basic algorithm
Next we describe several extensions to the basic framework for computing funnels described in Section 4.1. Section 4.3.1 discusses the scenario in which the dynamics of the system are subject to bounded disturbances/uncertainty, Section 4.3.2 considers the problem of synthesizing feedback controllers that explicitly attempt to minimize the size of the funnel, Section 4.3.3 demonstrates how to handle input saturations, and Section 4.3.4 considers a generalization of the cost function.
4.3.1 Uncertainty in the dynamics
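Suppose that the dynamics of the system are subject to an uncertainty term $w(t) \in \mathcal{W} \subset \mathbb{R}^d$ that models external disturbances or parametric model uncertainties. The closed-loop dynamics (9) can then be modified to capture this uncertainty:
$$\dot{\bar{x}} = f_{cl}(t, \bar{x}(t), w(t)). \tag{31}$$
We will assume that the dynamics depend polynomially on $w$. Given an initial condition set $\mathcal{X}_0 \subset \mathbb{R}^n$ as before, our goal is to find sets $F(t)$ such that $\bar{x}(t)$ is guaranteed to be in $F(t)$ for any valid disturbance profile:
$$\bar{x}(0) \in \mathcal{X}_0 \implies \bar{x}(t) \in F(t), \quad \forall t \in [0,T], \ \forall w : [0,T] \to \mathcal{W}. \tag{32}$$
Parameterizing the sets $F(t)$ as sub-level sets of nonnegative time-varying functions $V : [0,T] \times \mathbb{R}^n \to \mathbb{R}^+$ as before, the following condition is sufficient to ensure (32):
$$V(t, \bar{x}) = \rho(t) \implies \dot{V}(t, \bar{x}, w) < \dot{\rho}(t), \quad \forall t \in [0,T], \ \forall w(t) \in \mathcal{W}, \tag{33}$$
where $\dot{V}$ is computed as:
$$\dot{V}(t, \bar{x}, w) = \frac{\partial V(t, \bar{x})}{\partial \bar{x}} f_{cl}(t, \bar{x}, w) + \frac{\partial V(t, \bar{x})}{\partial t}. \tag{34}$$
This is almost identical to the condition (12), with the exception that the function $V$ is required to decrease on the boundary of the funnel for every choice of disturbance. Assuming that the set $\mathcal{W}$ is a semi-algebraic set $\mathcal{W} = \{w \in \mathbb{R}^d \mid g_{w,j}(w) \geq 0, \ \forall j = 1, \dots, N_w\}$, the optimization problem (26) is then easily modified by replacing condition (27) with the following constraints:
$$\begin{aligned} & \dot{\rho}(t) - \dot{V}(t, \bar{x}, w) - L(t, \bar{x}, w) \left[ V(t, \bar{x}) - \rho(t) \right] - L_t(t, \bar{x}, w) \left[ t(T - t) \right] - \sum_{j=1}^{N_w} L_{w,j}(t, \bar{x}, w) \, g_{w,j}(w) \ \text{ is SOS}, \\ & L_{w,j}(t, \bar{x}, w) \ \text{ is SOS}, \quad \forall j \in \{1, \dots, N_w\}. \end{aligned} \tag{35}$$
These SOS constraints now involve polynomials in the indeterminates $t$, $\bar{x}$ and $w$. Since these constraints are linear in the coefficients of the newly introduced multipliers $L_{w,j}$, Algorithm 1 can be directly applied to the modified optimization problem by adding $L_{w,j}$ to the list of polynomials to be searched for in both Step 1 and Step 2 of the iterations. Similarly, the time-sampled approximation described in Section 4.2 can also be applied to the constraints (35).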
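A scalar example shows how a bounded disturbance enters the funnel condition. The sketch below assumes toy dynamics $\dot{\bar{x}} = -\bar{x} + w$ with $|w| \leq 0.5$ and a constant-level funnel $V(\bar{x}) = \bar{x}^2 \leq \rho = 0.36$; the dynamics, bounds, and names are illustrative assumptions rather than anything taken from the paper. Since $\dot{V}$ is linear in $w$ here, the worst case over the disturbance interval is attained at an endpoint, so checking the two endpoints suffices for this example.

```python
import random

W_MAX = 0.5    # assumed disturbance bound: |w| <= W_MAX
RHO = 0.36     # constant funnel level; sqrt(RHO) = 0.6 > W_MAX

def f_cl(xbar, w):
    # Toy uncertain closed-loop dynamics: xbar_dot = -xbar + w.
    return -xbar + w

def V(xbar):
    return xbar * xbar

def worst_case_V_dot(xbar):
    # V_dot = 2*xbar*(-xbar + w) is linear in w, so its maximum over the
    # interval [-W_MAX, W_MAX] is attained at one of the endpoints.
    return max(2.0 * xbar * f_cl(xbar, w) for w in (-W_MAX, W_MAX))

boundary = (-RHO ** 0.5, RHO ** 0.5)
# rho is constant, so rho_dot = 0 and the condition reads V_dot < 0 on the boundary.
condition_holds = all(worst_case_V_dot(xb) < 0.0 for xb in boundary)

# Forward-Euler rollouts under randomly chosen admissible disturbances.
random.seed(0)
dt, steps = 1e-3, 5000
stays_inside = True
for x0 in boundary + (0.0,):
    xbar = x0
    for _ in range(steps):
        w = random.choice((-W_MAX, W_MAX, random.uniform(-W_MAX, W_MAX)))
        xbar += dt * f_cl(xbar, w)
        if V(xbar) > RHO + 1e-12:
            stays_inside = False

print(condition_holds, stays_inside)
```

Choosing a level with $\sqrt{\rho} \leq 0.5$ would violate the boundary condition for this system, which illustrates how the disturbance bound limits how small a guaranteed funnel can be.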
4.3.2 Feedback control synthesis
So far we have assumed that we have been provided with a feedback controller that corrects for deviations around the nominal trajectory. We now consider the problem of optimizing the feedback controller in order to minimize the size of the funnel. We will assume that the system is control affine:
$$\dot{x} = f(x) + g(x) u,$$
and parameterize the control policy as a polynomial $\bar{u}_f(t, \bar{x})$. We can thus write the dynamics in the $\bar{x}$ coordinates as:
$$\dot{\bar{x}} = f(x_0(t) + \bar{x}) + g(x_0(t) + \bar{x}) \left( u_0(t) + \bar{u}_f(t, \bar{x}) \right) - \dot{x}_0(t).$$
The feedback controller can then be optimized by adding the coefficients of the polynomial $\bar{u}_f(t, \bar{x})$ to the set of decision variables in the optimization problem (26) while keeping all the constraints unchanged. Note that $\bar{u}_f$ appears in the constraints only through $\dot{V}$, which is now bilinear in the coefficients of $V$ and $\bar{u}_f$ since:
$$\dot{V}(t, \bar{x}) = \frac{\partial V(t, \bar{x})}{\partial \bar{x}} \dot{\bar{x}} + \frac{\partial V(t, \bar{x})}{\partial t}.$$
With the coefficients of the feedback controller as part of the optimization problem, note that the constraints of the problem (26) are now bilinear in two sets of decision variables: one containing the multipliers together with the coefficients of $\bar{u}_f$, and one containing $V$ and $\rho$. Thus, in principle we could use a bilinear alternation scheme similar to the one in Algorithm 1 and alternately optimize these two sets of decision variables. However, in this case we would not be searching for a controller that explicitly seeks to minimize the size of the funnel (since the controller would not be searched for at the same time as $V$ or $\rho$, which define the funnel). To get around this issue, we add another step in each iteration where we optimize our cost function by searching for $\bar{u}_f$ and $\rho$ while keeping $V$ and the multipliers $L$ and $L_{\mathcal{E},k}$ fixed. This allows us to search for $\bar{u}_f$ and $\rho$ at the same time, which can significantly improve the quality of the controllers and funnels we obtain. These steps are summarized in Algorithm 2. By a reasoning identical to the one in Remark 1, it is easy to see that the sequence of optimal values produced by Algorithm 2 converges.
4.3.3 Actuator saturations
A. Approach 1
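Our approach also allows us to incorporate actuator limits into the verification procedure. Although we examine the single-input case in this section, this framework is easily extended to handle multiple inputs. Let the control input $u(t,\bar{x}) = u_0(t) + \bar{u}_f(t,\bar{x})$ at time $t$ be mapped through the following control saturation function:
$$s(u) = \begin{cases} u_{max} & \text{if } u \geq u_{max}, \\ u_{min} & \text{if } u \leq u_{min}, \\ u & \text{otherwise.} \end{cases}$$
Then, in a manner similar to [Tedrake et al., 2010], a piecewise analysis of $\dot{V}(t,\bar{x})$ can be used to check that the Lyapunov conditions are satisfied even when the control input saturates. Defining:
$$\begin{aligned} \dot{V}_{min}(t,\bar{x}) &:= \frac{\partial V}{\partial \bar{x}}\left( f(x_0 + \bar{x}) + g(x_0 + \bar{x})\,u_{min} - \dot{x}_0 \right) + \frac{\partial V}{\partial t}, \\ \dot{V}_{max}(t,\bar{x}) &:= \frac{\partial V}{\partial \bar{x}}\left( f(x_0 + \bar{x}) + g(x_0 + \bar{x})\,u_{max} - \dot{x}_0 \right) + \frac{\partial V}{\partial t}, \end{aligned}$$
we must check the following conditions on the boundary of the funnel, $V(t,\bar{x}) = \rho(t)$:
$$\begin{aligned} u_0(t) + \bar{u}_f(t,\bar{x}) \leq u_{min} &\implies \dot{V}_{min}(t,\bar{x}) < \dot{\rho}(t), \\ u_0(t) + \bar{u}_f(t,\bar{x}) \geq u_{max} &\implies \dot{V}_{max}(t,\bar{x}) < \dot{\rho}(t), \\ u_{min} \leq u_0(t) + \bar{u}_f(t,\bar{x}) \leq u_{max} &\implies \dot{V}(t,\bar{x}) < \dot{\rho}(t), \end{aligned}$$
where $u_0$ is the open-loop control input and $\bar{u}_f$ is the feedback controller as before. These conditions can be enforced by adding additional multipliers to the optimization program (26) or its time-sampled counterpart (30).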
B. Approach 2
Although one can handle multiple inputs via the above method, the number of SOS conditions grows exponentially with the number of inputs ($3^m$ conditions for $\dot{V}$ are needed in general to handle all possible combinations of input saturations for $m$ inputs). Thus, for systems with a large number of inputs, an alternative formulation was proposed in [Majumdar et al., 2013] that avoids this exponential growth in the size of the SOS program at the cost of adding conservativeness to the size of the funnel. Given limits on the control vector $u \in \mathbb{R}^m$ of the form $u_{min} < u < u_{max}$, we can require the funnel to satisfy:
$$\bar{x} \in F(t) \implies u_{min} < u_0(t) + \bar{u}_f(t,\bar{x}) < u_{max}, \quad \forall t \in [0,T].$$
This constraint implies that the applied control input remains within the specified bounds inside the verified funnel (a conservative condition). The number of extra constraints grows linearly with the number of inputs (since we have one new condition for every input), thus leading to smaller optimization problems.
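The difference in scaling between the two approaches can be made concrete with a short sketch. It assumes that each input is in one of three regimes (saturated low, unsaturated, saturated high), so that the piecewise analysis needs one condition per combination of regimes, while the bound-based formulation adds one condition per input as stated above. The function names are hypothetical and are introduced only for illustration.

```python
from itertools import product


def saturate(u, u_min, u_max):
    """Scalar saturation: clamp the commanded input u to [u_min, u_max]."""
    return max(u_min, min(u, u_max))


def saturation_regimes(num_inputs):
    """All combinations of per-input regimes used by the piecewise analysis."""
    return list(product(("low", "free", "high"), repeat=num_inputs))


def condition_counts(num_inputs):
    """Return (piecewise conditions, bound-based conditions) for num_inputs inputs."""
    piecewise = len(saturation_regimes(num_inputs))  # grows as 3**num_inputs
    bound_based = num_inputs                         # one new condition per input
    return piecewise, bound_based


for m in (1, 2, 4, 8):
    print(m, condition_counts(m))
```

Already for eight inputs the piecewise analysis requires several thousand conditions, which motivates trading some conservativeness for the linear growth of the second approach.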
4.3.4 A more general cost function
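We end our discussion of extensions to the basic algorithm for computing funnels presented in Section 4.1 by considering a generalization of the cost function (volume of the funnel) we have used so far. In particular, it is sometimes useful to minimize the volume of the funnel projected onto a subspace of the state space. Suppose this projection map is given by $\pi : \mathbb{R}^n \to \mathbb{R}^{n_p}$ with a corresponding $n_p \times n$ projection matrix $P$. For an ellipsoid $\mathcal{E} = \{\bar{x} \in \mathbb{R}^n \mid \bar{x}^T S_k \bar{x} \leq 1\}$, the projected set $\pi(\mathcal{E})$ is also an ellipsoid $\mathcal{E}_p = \{\bar{x}_p \in \mathbb{R}^{n_p} \mid \bar{x}_p^T S_k^{(p)} \bar{x}_p \leq 1\}$ with:
$$S_k^{(p)} = (P S_k^{-1} P^T)^{-1}.$$
Recall that the ability to minimize the volume of the ellipsoid $\mathcal{E}$ using SDP relied on being able to maximize the determinant of $S_k$. In order to minimize the volume of $\mathcal{E}_p$, we would have to maximize $\det(S_k^{(p)})$, which is a complicated (i.e. nonlinear) function of $S_k$. Hence, in each iteration of Algorithm 1 we linearize the function $\det(S_k^{(p)})$ with respect to $S_k$ at the solution of $S_k$ from the previous iteration and maximize this linearization instead. The linearization of $\det(S_k^{(p)})$ with respect to $S_k$ at a nominal value $S_{k,0}$ can be explicitly computed, up to an additive constant and a positive scale factor (neither of which changes the maximizer), as:
$$\mathrm{lin}(\det(S_k^{(p)})) = \mathrm{Tr}\left( P^T (P S_{k,0}^{-1} P^T)^{-1} P S_{k,0}^{-1} S_k S_{k,0}^{-1} \right),$$
where Tr refers to the trace of the matrix.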
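The linearization can be sanity-checked numerically. The sketch below assumes that the projected ellipsoid matrix is $(P S P^T)$-based as $S^{(p)} = (P S^{-1} P^T)^{-1}$ and that the linearized objective is $\mathrm{Tr}(P^T (P S_0^{-1} P^T)^{-1} P S_0^{-1} S S_0^{-1})$. It compares a central finite difference of $\det(S^{(p)})$ along a symmetric direction $D$ with $\det(S_0^{(p)})$ times the trace expression evaluated at $D$. The matrices $S_0$, $P$, and $D$ are arbitrary illustrative values, and the small dense-matrix helpers are written out only to keep the example self-contained.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(A)
    aug = [[float(v) for v in A[i]] + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def projected_shape(S, P):
    """Shape matrix of the projected ellipsoid: (P S^{-1} P^T)^{-1}."""
    return inverse(matmul(matmul(P, inverse(S)), transpose(P)))


def linearized_objective(S, S0, P):
    """Tr(P^T (P S0^{-1} P^T)^{-1} P S0^{-1} S S0^{-1})."""
    S0_inv = inverse(S0)
    left = matmul(matmul(transpose(P), projected_shape(S0, P)), P)
    return trace(matmul(matmul(matmul(left, S0_inv), S), S0_inv))


S0 = [[2.0, 0.3, 0.1], [0.3, 1.5, 0.2], [0.1, 0.2, 1.0]]   # nominal shape matrix
P = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]                     # project onto first two states
D = [[0.4, 0.1, 0.0], [0.1, -0.2, 0.3], [0.0, 0.3, 0.5]]   # symmetric perturbation direction


def perturbed(scale):
    return [[S0[i][j] + scale * D[i][j] for j in range(3)] for i in range(3)]


eps = 1e-6
finite_difference = (det(projected_shape(perturbed(eps), P))
                     - det(projected_shape(perturbed(-eps), P))) / (2.0 * eps)
analytic = det(projected_shape(S0, P)) * linearized_objective(D, S0, P)
print(finite_difference, analytic)
```

The two printed values should agree to several digits, which is consistent with the trace expression being the directional derivative of $\det(S^{(p)})$ up to the positive factor $\det(S_0^{(p)})$.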
4.4 Implementation details
We end this section on computing funnels by discussing a few important implementation details.
4.4.1 Trajectory generation
An important step that is necessary for the success of our approach to computing funnels is the generation of a dynamically feasible open-loop control input $u_0$ and corresponding nominal trajectory $x_0$. A method that has been shown to work well in practice and scale to high dimensions is the direct collocation trajectory optimization method [Betts, 2001]. While this is the approach we use for the results in Section 7, other methods like the Rapidly-exploring Random Tree (RRT) or its asymptotically optimal version, RRT*, can be used too [Kuffner and Lavalle, 2000, Karaman and Frazzoli, 2011].
4.4.2 Initializing $V$ and $\rho$
Algorithms 1 and 2 require an initial guess for the functions $V$ and $\rho$. In [Tedrake et al., 2010], the authors use the Lyapunov function candidate associated with a time-varying LQR controller. The control law is obtained by solving a Riccati differential equation:
$$-\dot{S}(t) = Q + S(t)A(t) + A(t)^T S(t) - S(t)B(t)R^{-1}B(t)^T S(t),$$
with final value conditions $S(T) = S_f$. Here $A(t)$ and $B(t)$ describe the time-varying linearization of the dynamics about the nominal trajectory $x_0$. The matrices $Q$ and $R$ are positive-definite cost-matrices. The function:
$$V_{guess}(t,\bar{x}) = (x - x_0(t))^T S(t)\,(x - x_0(t)) = \bar{x}^T S(t)\,\bar{x}$$
is our initial Lyapunov candidate. Setting $\rho$ to a quickly increasing function such as an exponential is typically sufficient to obtain a feasible initialization. However, note that since Algorithms 1 and 2 are not guaranteed to converge to globally optimal solutions, their initialization can impact the quality of the resulting solutions. Thus, standard strategies such as multiple (possibly random) initializations can help address issues related to local solutions. Another strategy that can be used when computing libraries of funnels (Section 5) is to initialize a funnel computation with the solution computed from a neighboring maneuver in the library.
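A minimal sketch of this initialization for a scalar system is given below. It assumes the standard LQR Riccati differential equation, which for scalar dynamics with constant linearization $a$, $b$ and costs $q$, $r$ reads $-\dot{s} = q + 2as - (b^2/r)s^2$, and integrates it backward from the final value with a fixed-step RK4 scheme. The constant linearization, the numerical values, and the function names are illustrative assumptions; for a real trajectory $a$ and $b$ would vary with time.

```python
import math


def riccati_rhs(s, a, b, q, r):
    """ds/dt for the scalar Riccati ODE  -ds/dt = q + 2*a*s - (b**2 / r) * s**2."""
    return -(q + 2.0 * a * s - (b * b / r) * s * s)


def solve_riccati_backward(a, b, q, r, s_final, horizon, steps=2000):
    """Integrate from t = horizon down to t = 0 with classical RK4.

    Returns a list of (t, s(t)) pairs ordered by increasing t.
    """
    h = -horizon / steps  # negative step: integrate backward in time
    t, s = horizon, s_final
    samples = [(t, s)]
    for _ in range(steps):
        k1 = riccati_rhs(s, a, b, q, r)
        k2 = riccati_rhs(s + 0.5 * h * k1, a, b, q, r)
        k3 = riccati_rhs(s + 0.5 * h * k2, a, b, q, r)
        k4 = riccati_rhs(s + h * k3, a, b, q, r)
        s += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
        samples.append((t, s))
    samples.reverse()
    return samples


def v_guess(s_of_t, x_bar):
    """Initial Lyapunov candidate V(t, x_bar) = s(t) * x_bar**2."""
    return s_of_t * x_bar * x_bar


def rho_guess(t, rho_0=0.1, growth=3.0):
    """Quickly increasing (exponential) initial guess for rho(t)."""
    return rho_0 * math.exp(growth * t)


samples = solve_riccati_backward(a=1.0, b=1.0, q=1.0, r=1.0, s_final=1.0, horizon=10.0)
s_at_start = samples[0][1]
s_steady = 1.0 + math.sqrt(2.0)  # positive root of q + 2*a*s - (b**2 / r) * s**2 = 0
print(s_at_start, s_steady)
```

Over a long horizon the backward solution settles to the positive root of the algebraic Riccati equation, which gives a quick consistency check on the integration before the resulting $V$ and $\rho$ are passed to the funnel computation.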
5 Funnel Libraries
5.1 Sequential composition
One can think of funnels computed using the machinery described in Section 4 as robust motion primitives (the robustness is to initial conditions and uncertainty in the dynamics). While we could define a funnel library $\mathcal{F}$ simply as a collection of funnels and associated feedback controllers, it will be fruitful to associate some additional structure with $\mathcal{F}$. In particular, it is useful to know how funnels can be sequenced together to form composite robust motion plans. In order to consider this more formally, we will first introduce the notion of sequential composition of funnels [Burridge et al., 1999]. We note that our definition of sequential composition below differs slightly from the one proposed in [Burridge et al., 1999], which considers infinite-horizon control policies in contrast to the finite-horizon funnels we consider here. In this regard, our notion of a funnel (and corresponding notion of sequential composition) is akin to that of conditional invariance considered in [Kantor and Rizzi, 2003, Conner et al., 2011].
An ordered pair $(F_1, F_2)$ of funnels $F_1$, defined on $[0,T_1]$, and $F_2$, defined on $[0,T_2]$, is sequentially composable if $F_1(T_1) \subset F_2(0)$.
In other words, two funnels are sequentially composable if the “outlet” of one is contained within the “inlet” of the other. A pictorial depiction of this is provided in Figure 2. We note that the sequential composition of two such funnels is itself a funnel.
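For ellipsoidal funnels this containment test is simple to state. The sketch below assumes that the outlet of the first funnel and the inlet of the second are ellipsoids $\{\bar{x} \mid \bar{x}^T A \bar{x} \leq 1\}$ and $\{\bar{x} \mid \bar{x}^T B \bar{x} \leq 1\}$ with a common center (i.e. the first nominal trajectory ends where the second begins), in which case the outlet is contained in the inlet exactly when $A - B$ is positive semidefinite. It is restricted to two-dimensional states so that the semidefiniteness test can be written in closed form; the names and numbers are illustrative.

```python
def normalized_shape(S, rho):
    """Shape matrix A = S / rho of the sub-level set {x : x^T S x <= rho}."""
    return [[S[i][j] / rho for j in range(2)] for i in range(2)]


def is_psd_2x2(M, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff its trace and determinant are nonnegative."""
    tr = M[0][0] + M[1][1]
    determinant = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return tr >= -tol and determinant >= -tol


def sequentially_composable(outlet_shape, inlet_shape):
    """True if {x : x^T A x <= 1} is contained in {x : x^T B x <= 1} (same center)."""
    difference = [[outlet_shape[i][j] - inlet_shape[i][j] for j in range(2)]
                  for i in range(2)]
    return is_psd_2x2(difference)


outlet = normalized_shape([[1.0, 0.0], [0.0, 1.0]], rho=0.25)  # disc of radius 0.5
inlet = [[1.0, 0.2], [0.2, 2.0]]                               # larger ellipse
print(sequentially_composable(outlet, inlet))
print(sequentially_composable(inlet, outlet))
```

The check is not symmetric: a small outlet fits inside a large inlet, but not the other way around, which is why the definition is stated for ordered pairs.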
5.2 Exploiting invariances in the dynamics
For our purposes here, it is useful to introduce a slightly generalized notion of sequential composability that will allow us to exploit invariances (continuous symmetries) in the dynamics. In particular, the dynamics of large classes of mechanical systems such as mobile robots are often invariant under certain transformations of the state space. For Lagrangian systems, the notion of “cyclic coordinates” captures such invariances. A cyclic coordinate is a (generalized) coordinate of the system that the Lagrangian does not depend on. We can then write the dynamics of the system as a function of a state vector $x = [x_c, x_{nc}]$ which is partitioned into cyclic coordinates $x_c$ and non-cyclic coordinates $x_{nc}$ in such a way that the dynamics only depend on the non-cyclic coordinates:
$$\dot{x} = f(x_{nc}(t), u(t)).$$
For example, the dynamics of a quadrotor or fixed-wing airplane (expressed in an appropriate coordinate system) do not depend on the position of the system or the yaw angle.
Invariance of the dynamics of the system also implies that if a curve $t \mapsto (x(t), u(t))$ is a valid solution to the dynamics $\dot{x} = f(x_{nc}(t), u(t))$, then so is the transformed solution:
$$t \mapsto (\Psi_c(x(t)), u(t)),$$
where $\Psi_c$ is a mapping from the state space to itself that shifts (i.e., translates) the state vector along the cyclic coordinates (and does not transform the non-cyclic coordinates). This allows us to make the following important observation.
Suppose we are given a system whose dynamics are invariant to shifts $\Psi_c(\cdot)$ along cyclic coordinates $x_c$. Let $F$ given by $t \mapsto F(t)$ be a funnel associated with this system. Then the transformed funnel $\Psi_c(F)$ given by $t \mapsto \Psi_c(F(t))$ is also a valid funnel. Hence, one can in fact think of invariances in the dynamics giving rise to an infinite family of funnels parameterized by shifts of a funnel along cyclic coordinates of the system.
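The invariance underlying this observation can be illustrated with a simple simulation. The sketch below assumes a hypothetical planar unicycle model, $\dot{p}_x = v\cos\theta$, $\dot{p}_y = v\sin\theta$, $\dot{\theta} = u$, in which the position $(p_x, p_y)$ plays the role of the cyclic coordinates and the heading $\theta$ is treated as non-cyclic. It checks that simulating from a shifted initial state yields the shifted trajectory. The model, the forward-Euler integrator, and all names are illustrative assumptions rather than the systems considered in the paper.

```python
import math


def dynamics(state, u, speed=1.0):
    """Unicycle dynamics; the right-hand side depends only on the heading theta."""
    _, _, theta = state
    return (speed * math.cos(theta), speed * math.sin(theta), u)


def shift(state, offset):
    """Translate the cyclic coordinates (p_x, p_y) by offset; leave theta unchanged."""
    p_x, p_y, theta = state
    return (p_x + offset[0], p_y + offset[1], theta)


def simulate(initial_state, controls, dt=0.01):
    """Forward-Euler rollout of the open-loop control sequence."""
    states = [initial_state]
    for u in controls:
        derivative = dynamics(states[-1], u)
        states.append(tuple(s + dt * d for s, d in zip(states[-1], derivative)))
    return states


controls = [math.sin(0.05 * k) for k in range(300)]
start = (0.0, 0.0, 0.3)
offset = (2.0, -1.5)

nominal = simulate(start, controls)
from_shifted_start = simulate(shift(start, offset), controls)
shifted_nominal = [shift(state, offset) for state in nominal]

max_gap = max(abs(a - b)
              for sa, sb in zip(from_shifted_start, shifted_nominal)
              for a, b in zip(sa, sb))
print(max_gap)
```

The gap between the two trajectories is at the level of floating-point round-off, reflecting that a shifted solution is again a solution; the same argument applied to every trajectory starting in the inlet is what makes the shifted funnel valid.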
Note that here we have implicitly assumed that the feedback controller:
$$\bar{u}_f(t,\bar{x}) = \bar{u}_f(t, x - x_0(t))$$
associated with the funnel has also been transformed to:
$$\bar{u}_f(t, x - \Psi_c(x_0(t))).$$