GuSTO: Guaranteed Sequential Trajectory Optimization via Sequential Convex Programming

03/01/2019 ∙ by Riccardo Bonalli, et al. ∙ 0

Sequential Convex Programming (SCP) has recently seen a surge of interest as a tool for trajectory optimization. However, most available methods lack rigorous performance guarantees and they are often tailored to specific optimal control setups. In this paper, we present GuSTO (Guaranteed Sequential Trajectory Optimization), an algorithmic framework to solve trajectory optimization problems for control-affine systems with drift. GuSTO generalizes earlier SCP-based methods for trajectory optimization (by addressing, for example, goal-set constraints and problems with either fixed or free final time) and enjoys theoretical convergence guarantees in terms of convergence to, at least, a stationary point. The theoretical analysis is further leveraged to devise an accelerated implementation of GuSTO, which originally infuses ideas from indirect optimal control into an SCP context. Numerical experiments on a variety of trajectory optimization setups show that GuSTO generally outperforms current state-of-the-art approaches in terms of success rates, solution quality, and computation times.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 5

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

Trajectory optimization algorithms play a key role in robot motion planning, either being applied directly to solve planning problems or being used to refine coarse trajectories generated by other methods. A wide variety of algorithmic frameworks have been proposed [1, 2, 3, 4, 5, 6, 7, 8], and though they have had success on a broad class of robotic systems, a large gap remains in establishing practical guidelines for applying trajectory optimization to new systems and problem setups, placing guarantees on their behavior, and fully exploiting optimal control theory to improve performance.

In particular, additional work is required to achieve more general, well-analyzed frameworks for trajectory optimization algorithms which meet the following key desiderata:

  1. High computational speed: Even on high-dimensional systems having complex dynamics and constraints, trajectory optimization algorithms should converge rapidly, allowing quick responses to commands and rapid replanning in uncertain or changing environments.

  2. Theoretical guarantees: A reliable framework hinges on strong theoretical guarantees. Specifically, trajectory optimization algorithms should (i) guarantee initialization-independent convergence to, at least, a stationary point, (ii) ensure hard enforcement of dynamical constraints, especially as many robotic systems are nonholonomic, and (iii) provide that these guarantees are discretization-independent, since some robotic systems may call for specific numerical schemes.

  3. Generality: Trajectory optimization frameworks should be broadly applicable to different robot motion planning problems, including involving complex robotic systems (e.g., nonconvex, nonholonomic dynamics, drift systems, etc.), flexible problem setups (e.g., free final time, goal sets, etc.), and diverse initialization strategies.

Fig. 1: GuSTO used to generate a dynamically-feasible, collision-free trajectory for the Astrobee free-flying spacecraft robot using a simple straight-line initialization [9].

Related work: The trajectory optimization spectrum can be divided into global search methods and local methods. Global search methods include motion planning techniques, such as asymptotically optimal sampling-based motion planning (SBP) algorithms (e.g., RRT, PRM, and FMT) [10, 11, 12]. Though these require no initialization, they scale poorly to high-dimensional systems with kinodynamic constraints. For such systems, SBP techniques require enormous computational time and are thus instead used in practice to initialize other trajectory optimization algorithms.

Local methods include indirect methods, in particular including shooting methods [8]. Built on an efficient coupling of necessary conditions of optimality, such as the Pontryagin Maximum Principle [13], and Newton’s methods, these have the fastest convergence rate, but they are highly sensitive to initialization and are thus difficult to apply to different tasks. Another class of efficient local procedures is direct methods. One of these, which is the focus of this paper, is sequential convex programming (SCP), a framework which has been quite successful in the robotics community [14, 1, 2, 3, 15, 16]. SCP successively convexifies the costs and constraints of a nonconvex optimal control problem, seeking a solution to the original problem through a series of convex problems [17, 18]. Examples include TrajOpt [1], Liu, et al. [2], and Mao, et al. [3]. However, these suffer a number of deficiencies, as summarized in Table I. For example, TrajOpt provides high speed and broad applicability to robotic systems, but the penalization of dynamical constraints and the missing development of convergence guarantees preclude exact feasibility and numerical robustness, respectively. Similarly, [2] only holds for a particular time-discretization, and though the approach in [3] is discretization-indepedent, it cannot ensure hard enforcement of dynamics. Further, its convergence analysis relies on complex Lagrange multipliers from which it is difficult to extract numerically useful information, a key capability exploited in part in our work. Finally, in most of these works, extensions to free final time and goal-set constraints are not addressed.

One last family of widespread procedures in trajectory optimization is variational methods. These include deterministic covariant approaches such as CHOMP [6] and probabilistic gradient descent approaches such as STOMP [7]. Similar to SCP, these do not necessarily require high-quality initializations, but theoretical guarantees are not easy to provide. Indeed, CHOMP does not incorporate the dynamical evolution of a system, while STOMP can only account for constraints through direct penalization, preventing hard enforcement of dynamics. Thus, convergence guarantees are not provided in both approaches.

S1pt

[3pt]Hard enforcement of dynamics constraints [3pt]Continuous-time convergence guarantees [3pt]Independent of time discretization [3pt]Free final time [3pt]Goal-set constraint [3pt]Provides dual solution that can warm-start shooting method
[3pt]Optimal SBP
[3pt]TrajOpt [1]
[3pt]Liu, et al. [2]
[3pt]Mao, et al. [3]
[3pt]CHOMP [6]
[3pt]STOMP [7]
[3pt]This Work
TABLE I: Comparison with existing trajectory optimization schemes.

Statement of Contributions: To begin to fill these gaps, our main contributions in this paper are as follows: First, we introduce Guaranteed Sequential Trajectory Optimization (GuSTO), an SCP-based algorithmic framework for trajectory optimization. More precisely, we provide a generalized continuous-time SCP scheme applied to drift control-affine nonlinear dynamical systems subject to control and state constraints (including collision-avoidance) and goal-set constraints, guaranteeing dynamic feasibility with either fixed or free final time. Second, we provide a theoretical analysis for this framework, proving that that the limiting solution of our continuous-time scheme is a stationary point in the sense of the Pontryagin Maximum Principle [13]. This generalizes the work in [3] for control-affine systems and introduces stronger theoretical guarantees than the current state-of-the-art. Moreover, the generality of our framework enables these guarantees to be independent of the chosen time discretization scheme and the method used to find a solution at each SCP iteration. Indeed, the framework is broadly applicable to many different robot motion planning and trajectory optimization problems, which then enjoy the same guarantees. This analysis is further leveraged to accelerate convergence by initializing shooting methods with the dual solutions of SCP iterations. Third, we provide practical guidelines based on our analysis, including proper handling of constraints and initialization strategies. Moreover, we demonstrate the framework through numerical and hardware experiments, provide comparison to other approaches, and provide a Julia library for our trajectory optimization framework.

To the best of our knowledge, our framework uniquely meets all three aforementioned desiderata, rapidly providing theoretically desirable trajectories for a broad range of robotic systems and problem setups (see Table I for a comparison to some existing approaches).

Ii Problem Formulation and Overview of SCP

We begin by reviewing the optimal control problem of interest in Section II-A and then provide an overview of an SCP framework for trajectory optimization in II-B.

Ii-a Trajectory Optimization as an Optimal Control Problem

Given a fixed initial point and a final goal set , for every final time , we model our dynamics as a drift control-affine system in of the form

(1)

where , are vector fields, and .

In this context, we design trajectory optimization as an optimal control problem with penalized state constraints. More specifically, we consider the Optimal Control Problem (OCP) consisting of minimizing the integral cost

(2)

under dynamics (1), among all the controls satisfying almost everywhere in . Here, , are , represents the norm that is given by a constant positive-definite matrix , and provides control constraints. The final time may be free or fixed, and hard enforcement of dynamical and goal-set constraints are naturally imposed by (1). Function sums up the contributions of the state-depending terms of the cost unrelated to constraints and of the sum of state constraint violations (e.g., collision-avoidance violation), where is a penalization weight. We stress that penalizing state constraints is fundamental to obtaining theoretical guarantees in the sense of the classical Pontryagin Maximum Principle [13] (see Theorem III.1), stronger than standard Lagrange multiplier rules. However, in Sec. III, we provide an algorithm under this formulation which can still enforce hard state constraints up to some chosen tolerance .

Ii-B Sequential Convex Programming

We proceed by applying sequential convex programming to solve our optimal control problem. Under the assumption that is convex, SCP consists of iteratively linearizing the nonlinear contributions of (OCP) around local solutions, thus recursively defining a sequence of simplified problems. More specifically, for a given , assume we have some continuous curve and some control law , continuously extended in the interval . Defined inductively, at iteration , the Linearized Optimal Control Problem (LOCP) consists of minimizing the new integral cost

(3)

where and is any smooth approximation of [19, Ch. 10], under the new dynamics

(4)

coming from the linearization of nonlinear vector fields, among all controls satisfying almost everywhere in , where is a solution for the linearized problem at the previous iteration, i.e. (LOCP), continuously extended in the interval . As a result of the functions , we can provide trust-region-type constraints on state trajectories using uniformly bounded scalars and weights (no such bounds are considered on controls because appears linearly), which at the same time penalize state constraints violations . Here, the user may make vary and at each iteration — these are used merely to ease the search for a solution of (LOCP). Problem (LOCP) is linearized around an initializing couple , and this initialization curve should be as close as possible to a feasible or even optimal curve for (LOCP), although we do not require that is feasible for (OCP).

The sequence of problems (LOCP) is correctly defined if, for each iteration , an optimal solution for (LOCP) exists. For this, we consider the following assumptions:

  • The set is compact and convex, while the set is a compact submanifold (either with or without boundary).

  • Mappings , , vector fields , and their differentials have compact supports.

  • At every iteration , problem (LOCP) is feasible. Moreover, for free final time problems, there exists a constant such that, every feasible tuple for (LOCP) satisfies , for every iteration .

Under these assumptions, classical existence Filippov-type arguments [20, 21] show that at each iteration , the problem (LOCP) has at least one optimal solution. Here, some comments are in order. Assumption is not limiting and can be easily satisfied by multiplying all noncompliant maps by smooth cut-off functions having supports contained in the working space. Moreover, it is standard in control theory to assume time-bounded strategies, and we can satisfy by simply considering the notion of virtual control [3]. Indeed, we stress the fact that most trajectory optimization applications effortlessly satisfy Assumptions -.

Iii GuSTO: Algorithm Overview
and Theoretical Analysis

In this section, we present the algorithmic details for GuSTO in III-A and discuss its theoretical convergence guarantees to a stationary point in III-B.

Iii-a Generalized SCP Algorithm

SCP aims to solve (OCP) by iteratively seeking solutions to (LOCP). For this process, the progression through iterations must be designed carefully in order to achieve numerical efficiency and fast computation. Supported by classical approaches [22] and more recent results in SCP for robot trajectory optimization [1, 3], we propose a new general SCP scheme, named GuSTO (Guaranteed Sequential Trajectory Optimization), to solve (OCP), as reported in Algorithm 1. Our main novelties are: 1) a time-continuous, broadly applicable setup ensuring convergence to a stationary point in the sense of the Pontryagin Maximum Principle [13] (see Corollary III.1), 2) hard enforcement of dynamical constraints and ease in considering free final time and goal-set problems, 3) a refined trust-region radius adaptation step based on a new model accuracy ratio which provides a definition of relative error between iterations and prevents the algorithm from becoming stuck in a cycle within its loops, 4) a theoretically justified stopping criterion based on closeness between iterated solutions.

Input : Trajectory and control defined in .
Output : Solution for (LOCP) at iteration .
Data : State constraints data , , ;     Trust region scaling parameters ,     , , .
1 begin
2      
3       while  and  do
4             Solve (LOCP) for
5             if  then
6                   Calculate model accuracy ratio in (5)
7                   if  then
8                         Reject solution
9                          ,  
10                        
11                  else
12                         Accept solution
13                            
14                            
15                        
16                  
17            else
18                   Reject solution
19                    ,  
20                  
21            
22            
23      return
24
Algorithm 1 GuSTO

Once (LOCP) is solved at some iteration (line 4), we first check whether hard trust-region constraints are satisfied. In the positive case, we evaluate the ratio

(5)

which represents the relative error between the original cost/dynamics and their convexified versions. If this error is greater than some given tolerance, the linear approximation is too coarse and we reject the new solution, shrinking the trust region (lines 7-9). Otherwise, we accept and update the trust region radius (lines 11-12 [3]). Moreover, in the case that hard-penalized state constraints are not satisfied, we increase the value of the weight (line 13), pushing the solver to seek constraint satisfaction (up to some threshold ) at the next iteration. On the other hand, when only soft trust-region constraints are satisfied, we increase the weight while maintaining the same radius (lines 15-16), pushing the solver to look for solutions that satisfy the trust-region constraints. The algorithm ends when successive iterations reach an identical solution or when the state constraint weight is greater than the maximum value .

Remark III.1.

As a result of Assumptions -, we have , where is some constant depending only on quantities defining (OCP). Moreover, since every solution satisfies the initial and final conditions for (1), it holds that . Therefore, since , it is easily seen that Algorithm 1 never becomes stuck in the rejection step provided by lines 7-9.

No assumption on the initializing strategy is taken. From a practical point of view, this allows us to initialize GuSTO with simple, even infeasible, guesses for solutions of (OCP), such as a straight line in the state space. In this case, a suitable choice of the maximal value of the trust region radius may be crucial to allow the method to correctly explore the space if the provided initialization is far from any optimal strategy. Finally, increasing the value of weights at line 3 of GuSTO eases the search for solutions satisfying state constraints up to the tolerance.

Iii-B Theoretical Convergence Guarantees

In this section, we prove that GuSTO has the property of guaranteed convergence to an extremal solution. This result is achieved by leveraging techniques from indirect methods in a direct method context, which is a contribution of independent interest discussed further in Section III-C.

The convergence of GuSTO can be inferred by adopting one further regularity assumption concerning (LOCP):

  • At every iteration of SCP, every optimal control of (LOCP) is continuous.

We stress that although Assumption seems limiting, many control systems in trajectory optimization applications naturally satisfy it [23]. Moreover, the normality of Pontryagin extremals is sufficient (under minor assumptions) to ensure that holds [24].

Thus, in view of the Pontryagin Maximum Principle [13], our main theoretical result for SCP is the following:

Theorem III.1.

Suppose that Assumptions - hold. Given any sequence of trust region radii and weights , let be any sequence such that for every , is optimal for (LOCP) in . Up to some subsequence:

  • , for the strong topology of

  • , for the strong topology of

  • , for the weak topology of

as tends to infinity, such that, is feasible for (OCP) in . Moreover, there exists a nontrivial couple such that the tuple represents a Pontryagin extremal for (OCP) in . In particular, as tends to infinity, up to some subsequence:

  • for the strong topology of

where is a Pontryagin extremal of (LOCP). Finally, for fixed final time problems, we have .

For sake of conciseness and continuity in the exposition, we report the proof of Theorem III.1 in the Appendix. The convergence of GuSTO to a stationary point, in the sense of the Pontryagin Maximum Principle, for (OCP) is quickly obtained as a corollary.

Corollary III.1.

Under -, in solving (OCP) by Algorithm 1, only three mutually exclusive situations arise:

  1. There exists an iteration for which . Then, Algorithm 1 terminates, providing a solution for (LOCP) that does not satisfy state constraints.

  2. There exists an iteration for which . Then, Algorithm 1 terminates, returning a stationary point, in the sense of the Pontryagin Maximum Principle, for (OCP).

  3. We have , for every iteration . Then, Algorithm 1 builds a sequence of optimal solutions for (LOCP) that has a subsequence converging (with respect to appropriate topologies) to a stationary point, in the sense of the Pontryagin Maximum Principle, for the original problem (OCP).

Proof.

Thanks to Remark III.1, it is clear that only these three cases may happen and that they are mutually exclusive. Then, we only need to consider cases 2) and 3). The latter follows from Theorem III.1. If Algorithm 1 falls into case 2), then by applying the Pontryagin Maximum Principle [13] to (LOCP), we have that is the desired (Pontryagin) stationary point for (OCP). ∎

Case 1) of Corollary III.1 represents a failure and means that we are not able to compute a feasible optimized strategy, either due to an infeasible problem or an infeasible initialization that could not be refined to feasibility. The same occurs when considering TrajOpt [1].

On the other hand, both cases 2) and 3) represent success. However, it is important to remark that from a practical point of view, because of numerical errors, when we begin satisfying some convergence criterion on (which is up to the user) while solving (OCP) by GuSTO, we usually fall into case 3) and rarely fall into case 2) of Corollary III.1. At this point, Theorem III.1 becomes crucial to ensuring that we are actually converging to a stationary point, in the sense of the Pontryagin Maximum Principle, for (OCP). This holds for the whole sequence of solutions , since it is itself a converging subsequence. In addition, these theoretical guarantees achieved in a continuous-time setting remain valid independent of the chosen discretization scheme, as sufficiently small time steps allow the discrete SCP solution to remain close to the solution of the original continuous-time problem (OCP).

Notice that a similar framework is considered in [3], in which, for an infinite number of iterations, one can only provide weak convergence up to some subsequence if the control constraint set is convex and compact. Indeed, this last assumption does not imply that is compact (e.g., take to be the closed unit ball [25]). In any case, the result provided by Theorem III.1 remains stronger because, unlike [3], we obtain strong convergence of both trajectories and Pontryagin extremals. This feature can be exploited to provide convergence acceleration, as demonstrated in the next section.

(a) Using controller-tracked straight-line initialization
(b) Using SOS planning initialization
Fig. 2: Comparing initialization strategies on an 8D airplane model for three different SCP algorithms.

Iii-C Accelerating Convergence using Shooting Methods

An important result provided by Theorem III.1 is the convergence of Pontryagin extremals related to the sequence of solutions to problems (LOCP) towards a Pontryagin extremal related to the solution of (OCP) found by GuSTO. In particular, we can use this result to accelerate convergence by warm-starting shooting methods [8] using the dual solution from each SCP iteration.

This can be shown as follows. Assuming that GuSTO is converging, the Lagrange multipliers related to the initial condition for the finite dimensional discretization of problems (LOCP) approximate the initial values of the adjoint vectors related to each (LOCP) [26]. Then, up to some subsequence, for every small , there exists an iteration for which, for every iteration , one has , where is an adjoint vector related to the solution of (OCP) found by SCP. This means that, starting from some iteration , we can run a shooting method to solve (OCP), initializing it by . Thus, at each iteration of GuSTO, we use the provided by the solver to initialize the shooting method until convergence. In practice, this method provides a principled approach to facilitate fast convergence of sequential convex programming towards a more precise solution.

Iv Numerical Experiments and Discussion

In this section, we provide implementation details and examples to demonstrate various facets of our approach.

Iv-a Implementation Details

We implemented the examples in this section in a trajectory optimization library written in Julia [27] and available at https://github.com/StanfordASL/GuSTO.jl. Computation times reported are from a Linux system equipped with a 4.3GHz processor and 32GB RAM. For each system and compared algorithm, we discretized the cost and dynamics of the continuous-time optimal control problem using a trapezoidal rule, assuming a zero-order hold for the control. The number of discretization points was set to 30-40 in the presented results. We used the Bullet Physics engine to calculate signed distance fields used for obstacle avoidance constraints [28, 29], and an SCP trial was marked as successful if the algorithm converged and the resulting solution was collision-free. In comparisons with Mao, et al. [3], since their approach does not address nonconvex state inequality constraints (e.g. collision avoidance), we chose to enforce linearized collision-avoidance constraints in their algorithm as hard inequality constraints.

Iv-B Batch Comparison using a Simple Initialization Scheme

In this section, we compared the GuSTO algorithm to previous SCP algorithms, TrajOpt [1] and Mao, et al. [3], for a 12D free-flying spacecraft robot model, within a cluttered mock-up of the International Space Station. For these dynamics, the state consists of position , velocity , the Modified Rodrigues parameters representation of attitude , and angular velocity [30]. Constraints for this system included norm bounds on speed, angular velocity, and control. We modeled the free-flyer robot parameters after the Astrobee robot [9], details of which can be found at [31]. The dynamics discretization error reported for an SCP solution was defined as . We ran 100 experiments with different start and goal states in the environment shown in Figure 1. Each trajectory was initialized with a simple straight line in position space and a geodesic path in rotation space and no control initialization. The results of our simulations are presented in Figure 3. For the set of simulations, both GuSTO and TrajOpt successfully returned solutions for 97% of the trials, though GuSTO on average performed faster and returned higher-quality solutions. Due to the simple initialization often being deep in collision, Mao, et al. had a high failure rate, and failure cases sometimes led to high computation times.

(a) Success percentage
(b) Optimal cost (on success)
(c) Dynamics discretization error
(d) Success computation times
(e) Failure computation times
Fig. 3: Normalized simulation results for the 3D free-flying robot simulation.

Iv-C Initialization Strategies

In practice, when a high-quality initialization trajectory (e.g. dynamically feasible, collision-free, close to a global optimal, etc.) is readily available, it should be used. However, an initial planner (even a coarse one) is often not available and may be expensive to design or time-consuming to run. Thus, we investigate the sensitivity of our approach to initialization, including very simple initialization schemes, again comparing with TrajOpt [1] and Mao, et al. [3].

In particular, in this section we ran simulations on an 8D airplane model, having dynamics as in [32]. To explore different initialization strategies, we leveraged a recent approach from [33]. This approach uses a lower-dimensional 4D planning model to generate a path to the goal, which is tracked by a controller to generate a dynamically-feasible trajectory for the full-dimensional system. By planning with tubes that account for model mismatch, the full 8D trajectory is guaranteed to be collision-free.

Using this work, we tested three initialization strategies of increasing quality for the 8D airplane: (1) a simple straight line in the 8D state space, (2) an 8D dynamically-feasible (but possibly in collision) trajectory generated using a controller to track a straight-line initialization in 4D, and (3) an 8D dynamically-feasible and collision-free trajectory recovered from running the full motion planning with model-mismatch tubes in 4D. As illustrated in Figure 2, the problem scenario consists of guiding the airplane to a terminal goal set across a cluttered environment.

For this system, due to the complex coupling in its dynamics, initialization (1) resulted in failure for all three SCP algorithms. The results of initialization (2) can be seen in Fig. 1(a), where GuSTO found a feasible solution, whereas Mao, et. al. returned a trajectory without satisfying convergence criteria, and TrajOpt resulted in collision. For the highest-quality initialization (3), Mao, et al. did not return a solution, whereas GuSTO and TrajOpt returned feasible trajectories, with run times of 0.55s and 2.67s and cost improvements over the initialization of 55% and 49%, respectively.

Iv-D Shooting Method Acceleration

To investigate the use of dual solutions from our SCP iterations to warm-start shooting methods and accelerate convergence, we ran simulations on a simple 3D Dubin’s car and the 12D free-flyer robot, where the free-flyer was placed in an obstacle-filled environment.

The acceleration technique gave very promising results, as shown in Table II. In practice, running a shooting method to completion, whether to convergence or to a maximum number of iterations, required negligible computation time compared to an SCP iteration ( ms). Thus, we attempted a shooting method at every iteration of SCP. As shown, using SCP to complete refinement was very time-consuming, whereas the shooting method would converge after just a few SCP iterations, thus reducing computation time for the car and free-flyer models by 52% and 74%, respectively.

[2pt]Dubin’s Car [2pt]Free-flyer Spacecraft
[2pt]SCP Only [2pt]SCP + Shooting [2pt]SCP Only [2pt]SCP + Shooting
[2pt]SCP Iterations 12 6 11 3
[2pt]Reported Cost 19.8 19.8 6.2 6.2
[2pt]Running Time 94 ms 45 ms 570 ms 146 ms
TABLE II: Results accelerating convergence by using SCP dual solution to warm-start a shooting method. SCP iterations report are the number required for convergence.

Iv-E Hardware Experiments

We also implemented GuSTO in the Stanford Space Robotics Facility on a free-flyer robot. The robot is a three-DoF, fully holonomic system and is equipped with eight thrusters and a reaction wheel, maneuvering on a frictionless surface. Experiments consisted of a spacecraft navigating a cluttered environment to berth with a capturing spacecraft. Video of the experiments can be found at https://youtu.be/GHehE-If5nY.

V Conclusions

In this paper, we provided a new generalized approach to solve trajectory optimization problems, based on sequential convex programming. We showed strong theoretical guarantees to ensure broad applicability to many different frameworks in motion planning and trajectory optimization. GuSTO was tested with numerical simulations and experiments showing that more accurate solutions are achieved faster than using recent state-of-the-art SCP solvers.

Future contributions will focus on additional theoretical guarantees. More precisely, we will study higher-order conditions that GuSTO should naturally provide, showing its convergence to more informative points than stationary points, as well as natural extensions of these guarantees to more general manifolds and systems evolving in spaces having a Lie group structure. Finally, GuSTO will be tested on high-DOF systems, such as robot arms and humanoid robots, and we will use the algorithm for hardware experiments on a free-flyer robot in a full microgravity environment.

Acknowledgements

We would like to thank Brian Coltin, Andrew Symington, and Trey Smith of the Intelligent Robotics Group at NASA Ames Research Center for their discussions during this work, as well as Sumeet Singh, Thomas Lew, and Tariq Zahroof for their assistance on experiment implementation.

References

  • [1] J. Schulman, Y. Duan, J. Ho, A. Lee, I. Awwal, H. Bradlow, J. Pan, S. Patil, K. Goldberg, and P. Abbeel, “Motion planning with sequential convex optimization and convex collision checking,” Int. Journal of Robotics Research, vol. 33, no. 9, pp. 1251–1270, 2014.
  • [2] X. Liu and P. Lu, “Solving nonconvex optimal control problems by convex optimization,” AIAA Journal of Guidance, Control, and Dynamics, vol. 37, no. 3, pp. 750 – 765, 2014.
  • [3] Y. Mao, M. Szmuk, and B. Açikmeşe, “Successive convexification of non-convex optimal control problems and its convergence properties,” in Proc. IEEE Conf. on Decision and Control, 2016.
  • [4] Q. T. Dinh and M. Diehl, “Local convergence of sequential convex programming for nonconvex optimization,” in Recent Advances in Optimization and its Applications in Engineering.   Springer, 2010.
  • [5] D. Verscheure, B. Demeulenaere, J. Swevers, J. De Schutter, and M. Diehl, “Time-optimal path tracking for robots: A convex optimization approach,” IEEE Transactions on Automatic Control, vol. 54, no. 10, pp. 2318–2327, 2009.
  • [6] N. Ratliff, M. Zucker, J. A. Bagnell, and S. Srinivasa, “CHOMP: Gradient optimization techniques for efficient motion planning,” in Proc. IEEE Conf. on Robotics and Automation, 2009.
  • [7] M. Kalakrishnan, S. Chitta, E. Theodorou, P. Pastor, and S. Schaal, “STOMP: Stochastic trajectory optimization for motion planning,” in Proc. IEEE Conf. on Robotics and Automation, 2011.
  • [8] J. T. Betts, “Survey of numerical methods for trajectory optimization,” AIAA Journal of Guidance, Control, and Dynamics, vol. 21, no. 2, pp. 193–207, 1998.
  • [9] T. Smith, J. Barlow, M. Bualat, T. Fong, C. Provencher, H. Sanchez, and E. Smith, “Astrobee: a new platform for free-flying robotics on the International Space Station,” in

    Int. Symp. on Artificial Intelligence, Robotics and Automation in Space

    , 2016.
  • [10] S. M. LaValle and J. J. Kuffner, “Rapidly-exploring random trees: Progress and prospects,” in Workshop on Algorithmic Foundations of Robotics, 2000.
  • [11] L. E. Kavraki, P. Švestka, J.-C. Latombe, and M. H. Overmars, “Probabilistic roadmaps for path planning in high-dimensional spaces,” IEEE Transactions on Robotics and Automation, vol. 12, no. 4, pp. 566–580, 1996.
  • [12] L. Janson, E. Schmerling, A. Clark, and M. Pavone, “Fast Marching Tree: a fast marching sampling-based method for optimal motion planning in many dimensions,” Int. Journal of Robotics Research, vol. 34, no. 7, pp. 883–921, 2015.
  • [13] L. S. Pontryagin, Mathematical Theory of Optimal Processes.   Taylor & Francis, 1987.
  • [14] F. Augugliaro, A. P. Schoellig, and R. D’Andrea, “Generation of collision-free trajectories for a quadrocopter fleet: A sequential convex programming approach,” in IEEE/RSJ Int. Conf. on Intelligent Robots & Systems, 2012.
  • [15] D. Morgan, S.-J. Chung, and F. Y. Hadaegh, “Model predictive control of swarms of spacecraft using sequential convex programming,” AIAA Journal of Guidance, Control, and Dynamics, vol. 37, no. 6, pp. 1725 – 1740, 2014.
  • [16] J. Virgili-llop, C. Zagaris, R. Zappulla II, A. Bradstreet, and M. Romano, “Convex optimization for proximity maneuvering of a spacecraft with a robotic manipulator,” in AIAA/AAS Space Flight Mechanics Meeting, 2017.
  • [17] C. Fleury and V. Braibant, “Structural optimization: a new dual method using mixed variables,” Int. Journal for Numerical Methods in Engineering, vol. 23, no. 3, pp. 409–428, 1986.
  • [18] S. Boyd and L. Vandenberghe, Convex optimization.   Cambridge Univ. Press, 2004.
  • [19] J. M. Lee, Introduction to Smooth Manifolds, 1st ed.   Springer, 2003.
  • [20] A. F. Filippov, “On certain questions in the theory of optimal control,” SIAM Journal on Control, vol. 1, no. 1, pp. 76–84, 1962.
  • [21] E. B. Lee and L. Markus, Foundations of Optimal Control Theory.   John Wiley & Sons, 1967.
  • [22] J. Nocedal and S. J. Wright, Numerical Optimization, 2nd ed.   Springer, 2006.
  • [23] Y. Chitour, F. Jean, and E. Trélat, “Singular trajectories of control-affine systems,” SIAM Journal on Control and Optimization, vol. 47, no. 2, pp. 1078–1095, 2008.
  • [24] I. A. Shvartsman and R. B. Vinter, “Regularity properties of optimal controls for problems with time-varying state and control constraints,” Nonlinear Analysis: Theory, Methods & Applications, vol. 65, no. 2, pp. 448–474, 2006.
  • [25] H. Brezis,

    Functional Analysis, Sobolev Spaces and Partial Differential Equations

    , 2011.
  • [26] L. Göllmann, D. Kern, and H. Maurer, “Optimal control problems with delays in state and control variables subject to mixed control-state constraints,” Optimal Control Applications and Methods, vol. 30, no. 4, pp. 341–365, 2008.
  • [27] J. Bezanson, S. Karpinski, V. B. Shah, and A. Edelman. (2012) Julia: A fast dynamic language for technical computing. Available at http://arxiv.org/abs/1209.5145.
  • [28] E. Coumans. Bullet physics. Available at http://bulletphysics.org/.
  • [29] C. Ericson, Real-Time Collision Detection.   CRC Press, 2004.
  • [30] G. S. Aoudé, “Two-stage path planning approach for designing multiple spacecraft reconfiguration maneuvers and applications to SPHERES onboard ISS,” Master’s thesis, Massachusetts Inst. of Technology, 2007.
  • [31] Astrobee robot software. NASA. Available at https://github.com/nasa/astrobee.
  • [32] R. W. Beard and T. W. McLain, Small Unmanned Aircraft: Theory and Practice.   Princeton Univ. Press, 2012.
  • [33] S. Singh, M. Chen, S. L. Herbert, C. J. Tomlin, and M. Pavone, “Robust tracking with model mismatch for fast and safe planning: an SOS optimization approach,” in Workshop on Algorithmic Foundations of Robotics, 2018, submitted.
  • [34] A. A. Agrachev and Y. Sachkov, Control Theory from the Geometric Viewpoint.   Springer, 2004.
  • [35] T. Haberkorn and E. Trélat, “Convergence results for smooth regularizations of hybrid nonlinear optimal control problems,” SIAM Journal on Control and Optimization, vol. 49, no. 4, pp. 1498–1522, 2011.
  • [36] R. Bonalli, “Optimal control of aerospace systems with control-state constraints and delays,” Ph.D. dissertation, Sorbonne Université & ONERA - The French Aerospace Lab, 2018.
  • [37] M. D. Shuster, “A survey of attitude representations,” Journal of the Astronautical Sciences, vol. 41, no. 4, pp. 439 – 517, 1993.
  • [38] S. M. LaValle, Planning Algorithms.   Cambridge Univ. Press, 2006.

Appendix A: Proof of Theorem iii.1

V-a Pontryagin Maximum Principle

Our theoretical result provides convergence of SCP procedures towards a quantity satisfying first-order necessary optimality conditions under the Pontryagin Maximum Principle [13]. Below, we report the Pontryagin Maximum Principle for time-varying problems, which is useful hereafter.

Theorem V.1 (Pontryagin Maximum Principle).

Let be an optimal trajectory for (OCP), associated with the control in . There exist a nonpositive scalar and an absolutely continuous function , called ßadjoint vector, with , and such that, almost everywhere in , the following relations hold:

  • Adjoint Equations

    (6)
  • Maximality Condition

    (7)
  • Transversality Conditions

    If is a submanifold of , locally around , then the adjoint vector can be built in order to satisfy

    (8)

    and, in addition, if the final time is free, one has

    (9)

Here, denotes the Hamiltonian related to (OCP) and the quantity is called (Pontryagin) extremal. We say that an extremal is normal if and is abnormal otherwise.

It is important to remark that Theorem V.1 provides more informative multipliers than those given by the Lagrange multiplier rule, because control constraints do not need to be penalized within the cost, the related Hamiltonian is globally maximized, and are merely continuous functions.

V-B Pontryagin Cone Analysis

We provide the proof of Theorem III.1 for the case of free final time problems, since, for fixed final time problems, the proof is similar but simpler (and quite straightforward, see below). The proof is based on the properties related to Pontryagin cones [13]. Therefore, we start by providing useful definitions and statements concerning these quantities.

Let be a feasible trajectory for (OCP), with associated control in . Throughout the proof, we assume that is a Lebesgue point of . Otherwise, one proceeds using limiting cones as done in [21]. For every Lebesgue point of and every , we define local variations as

(10)

The variation vector for (OCP) is the solution of the following variational system

(11)

At this step, for every , we define the Pontryagin cone at for related to (OCP) to be the smallest closed convex cone containing for every Lebesgue point of and every . Arguing by contradiction [34, 35], the Pontryagin Maximum Principle states that, if is optimal for (OCP), then there exists a nontrivial couple satisfying

(12)

Relations (6)-(9) derive from (12). In particular, a tuple is a Pontryagin extremal for (OCP) iff the nontrivial couple satisfies (12). However, a Pontryagin extremal is not necessarily a solution for (OCP).

Now, consider controls and , solutions of (LOCP) and of (LOCP) with final times and , respectively. If necessary and without loss of generality, thanks to Assumption we can continuously extend these controls to be constant in and , respectively. We apply the same procedure to trajectories and . Therefore, for every iteration , , are continuous functions in and for every and every , we are able to define local variations for (LOCP) as

(13)

and related variation vectors as the solutions of the following variational system:

(14)

Thus, from the above and due to the optimality of , the Pontryagin Maximum Principle states that, for every iteration of SCP, there exists a nontrivial couple satisfying

(15)

where the Pontryagin cone at for (LOCP) is defined as above, by substituting (10)-(11) with (13)-(14), and is the Hamiltonian related to problem (LOCP).

We now prove Theorem III.1 in two main steps:

V-B1 Convergence of Trajectories and Controls

First, consider the sequence of final times . Thanks to Assumption , there exists such that, up to some subsequence, converges to . As discussed previously, from now on, we consider every couple to be continuously defined in the time interval .

Next, consider the sequence . Thanks to Assumption , is bounded in . Moreover, the subset is closed and convex in for the strong topology, and then also for the weak topology [25]. Thanks to Assumption and reflexive properties for , there exists such that, up to some subsequence, converges to for the weak topology of [25].

Finally, we focus on the sequence . It is clear that Assumptions and provide that both and are bounded in . Therefore, is bounded in the Sobolev space . From reflexive properties, it follows that there exists such that, up to some subsequence, converges to for the weak topology of . Furthermore, since the inclusion is compact, converges to for the strong topology of [25]

For every integer , is feasible for (LOCP), and therefore (after the obvious extensions),

From this, by exploiting Assumptions , , and the previous convergences, it follows that is feasible for problem (OCP) (note that , since, up to some subsequence, the limit exists thanks to the compactness of , see Assumption ).

V-B2 Convergence of Multipliers

We now discuss the convergence to a Pontryagin extremal for (OCP). Assumption proves crucial to establishing the following Lemma:

Lemma V.1.

Suppose that Assumption holds. For every Lebesgue point of , there exists a sequence , for which is a Lebesgue point of and of , such that

as tends to infinity.

Proof.

We denote

Let us prove that, for every Lebesgue point of and for every , (such that ), there exists such that, for every satisfying , there exists a Lebesgue point of for which .

By contradiction, suppose that there exists , a Lebesgue point of , and , (with ) such that, for every , there exists with and for which, for Lebesgue point of , it holds that .

From the previous convergence results, the family converges to in for the weak topology. Therefore, for every , there exists an integer such that, for every , it holds that

for every . We exploit this fact to bound by . First, since is a Lebesgue point of , there exists such that

for every . On the other hand, from what was said previously, there exists an integer such that

for every and every . Finally, by Assumption , we have that is continuous for , and then, for every and every , there exists such that