A Modular Framework for Motion Planning using Safe-by-Design Motion Primitives

05/01/2019 ∙ by Marijan Vukosavljev, et al. ∙ 0

We present a modular framework for solving a motion planning problem among a group of robots. The proposed framework utilizes a finite set of low level motion primitives to generate motions in a gridded workspace. The constraints on allowable sequences of motion primitives are formalized through a maneuver automaton. At the high level, a control policy determines which motion primitive is executed in each box of the gridded workspace. We state general conditions on motion primitives to obtain provably correct behavior so that a library of safe-by-design motion primitives can be designed. The overall framework yields a highly robust design by utilizing feedback strategies at both the low and high levels. We provide specific designs for motion primitives and control policies suitable for multi-robot motion planning; the modularity of our approach enables one to independently customize the designs of each of these components. Our approach is experimentally validated on a group of quadrocopters.



There are no comments yet.


page 1

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

This paper presents a modular, hierarchical framework for motion planning and control of robotic systems. While motion planning has received a great deal of attention by many researchers, because the problem is highly complex especially when there are several robotic agents working together in a cluttered environment, significant challenges remain. Hierarchy, in which the control design has several layers, is an architectural strategy to overcome this complexity. Almost all hierarchical frameworks for motion planning aim to balance flexibility in the control specification at the high level, guarantees on correctness and safety at the low level, and computational feasibility overall.

Historically motion planning was focused on high level planning algorithms, while suppressing details on the dynamic capabilities of the robots at the low level [19]. Taking full account of low level dynamics in combination with solving the high level planning problem can lead to a computationally intractable problem. Despite the wealth of available research [9, 28, 3, 17], computationally efficient solutions to the motion planning problem with tight integration of high and low levels are highly sought after.

Fig. 1: Crazyflie quadrocopters navigate in a cluttered environment. Video results are available at http://tiny.cc/modular-3alg.

We propose a modular hierarchical framework so that one can independently plug and play both low level controllers and high level planning algorithms in order to realize a balance between flexibility at the high level, safety at the low level, and computational feasibility. To make a customizable approach feasible, we introduce three assumptions. First, the output space of the underlying dynamical system has translational symmetry, namely position invariance, a property satisfied by many robotic models [12]. Second, the output space is gridded uniformly into rectangular boxes. Finally, the control capabilities are discretized into a finite set of motion primitives, where the low level describes the implementation of the motion primitives while the high level selects the motion primitives. Together, these assumptions imply that motion primitives can be designed over a single box, so that they can then be reapplied to any other box.

Now we give an overview of the features and techniques we employ, and we highlight other frameworks that share those features. We provide general formulation of motion primitives for nonlinear systems so that they can be applied to multi-robot systems. We focus on reach-avoid specifications in a priori known environments, in which the system must reach a desired configuration in a safe manner [3, 9, 15, 20]. Reach-avoid offers a fairly rich behavior set so that, for instance, a fragment of linear temporal logic (LTL) can be encoded as a sequence of reach-avoid problems [33], as we also show in our applications.

As we have mentioned, we abstract the output space into rectangular regions [28] rather than more general polytopic regions [9, 11, 17, 20] in order to exploit symmetry. Motion primitives have been employed in various ways [28, 15] and we encode feasible sequences of motion primitives by a maneuver automaton [12]. In contrast to the motion primitive methods above, our implementation of the low-level control design of motion primitives is based on reach control theory[27, 4], which provides a highly flexible and intuitive set of design tools that have two notable advantages over tracking: first, it is not necessary to find feasible open-loop trajectories to track; second, safety constraints on the system states during the execution and concatenation of motion primitives can be guaranteed by design. Finally, planning at the high level is based on standard shortest path algorithms [19, 5] applied to the graph arising from the synchronous product of the discrete part of the maneuver automaton and the graph arising from the output space partition. The high-level plan generates a control policy, which selects the motion primitives over the gridded output space. The modularity of our approach enables one to employ other closed-loop methods such as potential methods [9]

or vector-field shaping

[20] for low-level control design, and standard or customized graph search algorithms to generate a high-level plan.

There are three main contributions of this work. First, we provide the complete theoretical details on the requirements for the low-level control design and high-level plan, and show that these two levels operate consistently to solve the reach-avoid problem. Second, we formulate the parallel composition of maneuver automata in order to obtain correct-by-design motion primitives for a system composed of individual subsystems, such as in the case of multiple vehicles. Finally, the modularity and effectiveness of our framework is experimentally validated on a group of quadrocopters in several illustrative scenarios. In particular, we feature a novel and versatile design of motion primitives based on double integrators and we show how the customizability of the high level plan generation can be used to easily trade-off solution quality with computational efficiency. This paper is an extension of our previous work [31], which now supplies all the theoretical details along with proofs on correctness, the parallel composition construction, additional approaches to generate control policies, and more elaborate experimental results.

The paper is organized as follows. In the next section we highlight our contributions relative to the literature. In Section III we present a formal problem statement. The modular framework is introduced in Section IV. We define the output transition system, the maneuver automaton, the product automaton, and the high level plan, each of which contribute to realizing a solution of the motion planning problem. In Section V, we prove that our overall methodology solves the motion planning problem. In Section VI we give the procedure for composing motion primitives. In Section VII we present specific motion primitives for a double integrator system. In Section VIII we consider several methods to generate high level plans, which are experimentally demonstrated on quadrocopters. We conclude the paper in Section IX.

Notation.  Let denote the integers and denote the real numbers. Let denote the cardinality of a set. If is a set, we denote its power set as . If and are sets, let denote the usual set difference. If there are sets , let denote the usual cartesian product. Given a function , the image of under and the preimage of under are defined the usual way, and are denoted as and , respectively. Let denote the convex hull of the vectors . Given two vectors , we denote the component-wise multiplication (or Hadamard product) as . Let denote the set of globally Lipschitz vector fields on .

Ii Related Literature

The literature on motion planning is vast and encompasses many research communities. As such, we have categorized some common approaches and discussed how they relate to our method.

Ii-a Graph Search and Trajectory Planning

Motion planning has often been addressed as a discrete planning problem, for which many standard graph search algorithms exist [19]. Recent work on the multi-agent reach-avoid problem has developed novel algorithms in the context of applications such as manufacturing and warehouse automation, aiming to balance computational efficiency with solution quality. For example, a centralized approach is given in [35]

, discretizing the workspace into a lattice and using integer linear programming to minimize the total time for robots to traverse in high densities. In

[10], a sampling-based roadmap is constructed in the joint robot space using individual robot roadmaps, which is shown to be asymptotically optimal. Prioritized planning enables to safely coordinate many vehicles and is considered in a centralized and decentralized fashion in [6]. Subdimensional expansion computes mainly decentrally, but coordinates in the joint search space when agents are neighboring [32]. While such approaches typically provide various theoretical guarantees on the proposed algorithms, dynamical models and application on real robotic systems is often not considered.

The modularity of our framework is complementary, as it potentially enables existing multi-agent literature on gridded workspaces to be used directly or adapted for the generation of a high-level plan when used in conjunction with our proposed formulation of motion primitives. However, the consideration of continuous time dynamics may complicate the application of discrete planning methods in two ways. First, we must contend with constraints on successive motion primitives so that the continuous time behavior is acceptable - for example, avoiding abrupt changes in velocity. Second, we must contend with non-deterministic transitions to neighboring boxes, because motion primitives may allow more than one next box to be reached [18] - for example, modeling the joint asynchronous motion capabilities of a multi-robot system.

Trajectory tracking methods have also been applied to the formation change problem on real vehicles with complex dynamics. A sequential convex programming approach is given in [2], which computes discretized, non-colliding positional trajectories for a modest number of quadrocopters. More recently, an impressive number of quadrocopters were coordinated in [25], by first computing a sequence of grid-based waypoints and then refining it into smoother piecewise polynomials. However, since these open-loop trajectories are computed offline, deviations from the computed trajectories could result in crashes. On the other hand, our approach is more robust as it is completely untimed, carefully monitoring the progress of vehicles over the grid in a reactive way based on the measured box transitions.

Ii-B Formal Methods

A growing body of research has explored the use of formal methods in motion planning. This paper has been particularly inspired by [17], which provides a general framework for solving control problems for affine systems with LTL specifications. Their approach involves constructing a transition system over a polyhedral partition of the state space that arises from linear inequality constraints that constitute the atomic propositions of the LTL specification. Transitions between states of the transition system can occur if there exists an affine or piecewise affine feedback steering all continuous time trajectories from one polyhedral region to a contiguous one. Similar works to [17] include [9, 14, 20], which consider the simpler reach-avoid problem. Single and multi-robot applications followed shortly after in [11] and [3] respectively.

The appeal of these approaches is derived from their generality and faithful account of the low level system capabilities. On the downside, these methods generally do not scale well to larger state space dimensions, and so they would have limited applicability to large multi-robot systems. Our approach specializes these ideas by exploiting symmetry in the system dynamics and grid partition in order to strike a better balance between generality and computational efficiency. In particular, our feedback controllers are given as motion primitives, which can be designed independently of the obstacle and goal locations.

More recent works have also built on these formal method approaches, investigating more complex and realistic multi-robot problems. For example, service requests by multiple car robots in a city-like environment with communication constraints was considered in [7]. A cooperative task for ground vehicles was addressed in a distributed manner, enabling knowledge sharing amongst neighbors and reconfiguration of the motion plan in real time [13]. Tasks such as picking up objects are considered in conjunction with motion requirements in [26]. Since these works consider only fairly simple vehicle dynamics, they place greater emphasis on the synthesis of discrete plans satisfying the task specification. On the other hand, this paper considers the simpler reach-avoid problem in order to develop a formulation of motion primitives for nonlinear systems with symmetries.

Ii-C Motion Primitives

The usage of motion primitives has become popular recently in robotics, as they serve to simplify the motion planning problem by using predefined executable motion segments. Many variations exist, which have designed motion primitives using timed reference trajectories to control a formation of quadrocopters [28], paths on a state space lattice for a mobile robots [24, 8], and funnels in the state space centered about a reference trajectory for a car [15] and a small airplane [22].

We have been inspired by ideas in [12], from which we borrowed the term “maneuver automaton”. They define a motion primitive either as an equivalence class of trajectories or a timed maneuver between two classes, whereas we define a motion primitive as a feedback controller over a polyhedral region in the state space. In our formulation, concatenations between motion primitives are possible only across contiguous boxes in the output space, which provides a strict safety guarantee during concatenation. Moreover, this enables our approach to simplify obstacle avoidance to a discrete planning problem over safe boxes as in [8], bypassing the need to concatenate motion primitive trajectories using numerical optimization techniques as in [12].

Our presentation of the maneuver automaton gives explicit constraints on the design of motion primitives so that they can used reliably for high level planning. We have also introduced the notion of parallel composition of maneuver automata to build motion primitives for multi-robot systems. While our construction resembles existing methods of parallel composition [34, 30], we additionally prove that our construction preserves desired properties that enable consistency between the low and high levels. To the authors’ best knowledge, this paper is the first rigorous treatment of feedback-based motion primitives defined on a uniformly gridded output space.

Iii Problem Statement

Consider the general nonlinear control system


where is the state, is the input, and is the output. Let and denote the state and output trajectories of (1) starting at initial condition and under some open-loop or feedback control.

Let be a feasible set in the output space and let be a goal set. In multi-vehicle motion planning contexts, represents the feasible joint output configurations of the system, which can arise from specifications involving obstacle avoidance, collision avoidance, communication constraints, and others. We consider the following problem.

Problem III.1 (Reach-Avoid).

We are given the system (1), a non-empty feasible set and a non-empty goal set . Find a feedback control and a set of initial conditions such that for each we have

  • Avoid: for all ,

  • Reach: there exists such that for all , .

We make an assumption regarding the outputs of the system (1) in order to exploit symmetry; see [12] for an exposition on nonlinear control systems with symmetries.

Assumption III.1.

First, we assume that there is an injective map associating each output to a distinct state, so that . We define the (injective) insertion map as , which satisfies and for all . Second, we assume that the system has a translational invariance with respect to its outputs. That is, for all , and , we have .  

The assumption that the outputs of the system are a subset of the states is used in our framework to be able to design feedback controllers in the full state space that achieve desirable behavior in the output space. The second statement says that the vector field is invariant to the value of the output. In the literature this condition is called a symmetry of the system or translational invariance. This assumption is satisfied for many robotic systems, for example, when the outputs are positions. Also, we will see in Section VII that it significantly simplifies our control design.

Iv Modular Framework

Fig. 2: Our modular framework consists of five modules.

In this section we present our methodology to solve the motion planning problem in the form of an architecture that breaks down Problem III.1. This architecture consists of five main modules, as depicted in Figure 2.

  • The Problem Data include the system (1) with outputs satisfying Assumption III.1 and a reach-avoid task to be executed in the output space.

  • The Output Transition System (OTS) is a directed graph whose nodes (called locations) represent -dimensional boxes on a gridded output space and whose edges describe which boxes in the output space are contiguous.

  • The Maneuver Automaton (MA) is a hybrid system whose modes correspond to so-called motion primitives. Each motion primitive is associated with a closed-loop vector field by applying a feedback law to (1). The edges of the MA represent feasible successive motion primitives. Each motion primitive generates some desired behavior of the output trajectories of the closed-loop system over a box in the output space. Because of the uniform gridding of the output space into boxes and because of the symmetry in the outputs described in Assumption III.1, motion primitives can be designed over only one canonical box .

  • The Product Automaton (PA) is a graph which is the synchronous product of the OTS and the discrete part of the MA. It represents the combined constraints on feasible motions in the output space and feasible successive motion primitives.

  • The Hybrid Control Strategy is a combination of low level controllers obtained from the design of motion primitives, and a high level plan on the product automaton.

Next we describe in greater detail the OTS, MA, and PA.

Iv-a Output Transition System

Fig. 3: A two output () example of a reach-avoid task. Shown on the left is the feasible space consisting of 15 non-obstacle boxes (not red) and the goal region (green). The output transition system (OTS), which abstracts the box regions and their neighbour connectivity, is shown on the right. Shown below, the possible offsets towards a neighbouring box are labelled using .

The OTS provides an abstract description of the workspace or output space associated with the system (1). It serves to capture the feasible motions of output trajectories of the system (1) in a gridded output space, as in Figure 3. Specifically, we partition the output space into -dimensional boxes with lengths , where is the length of -th edge. We use a finite number of boxes to under-approximate the feasible set . Enumerating the boxes as , the -th box can be expressed in the form , where , are constants. We require that . Among these boxes, we assume there is a non-empty set of indices , so that we may similarly under-approximate the goal region as . We define a canonical -dimensional box with edge lengths given by . Each box , is a translation of by an amount along the -th axis.

Definition IV.1.

Given the lengths and a non-empty goal index set , an output transition system (OTS) is a tuple with the following components:

State Space

is a finite set of nodes called locations. Each location is associated with a safe box in the output space and hence we write .


is a finite set of labels. A label is used to identify the offset between neighbouring boxes.


is a set of directed edges where if , , and . Thus, for each , the neighbouring box is either one box to the left (), the same box (), or one box to the right (). In this manner records the offset between contiguous boxes.

Final Condition

denotes the set of locations associated with goal boxes.

Remark IV.1.

We observe that the OTS is deterministic. That is, for a given and , there is at most one such that . This follows immediately from the fact that records the offset between the neighbouring boxes.

Figure 3 shows a sample OTS for a simple scenario. The OTS locations are associated with 15 feasible boxes, including a goal box for the reach-avoid task. The OTS edges are shown as bidirectional arrows; for example, interpreting and on the grid, then .

Iv-B Maneuver Automaton

The maneuver automaton (MA) is a hybrid system consisting of a finite automaton and continuous time dynamics in each discrete state. The discrete states of the finite automaton correspond to motion primitives, while transitions between discrete states correspond to the allowable transitions between motion primitives. The continuous time dynamics are given by closed-loop vector fields (1) with a control law designed based on reach control theory (any other feedback control design method can be used).

Before presenting the MA, we first explain how this module is used in the overall framework. To solve Problem III.1, we assign motion primitives to the boxes of the partitioned output space such that obstacle regions are avoided and the goal region is eventually reached. The discrete part of the MA encodes the constraints on successive motion primitives. Such constraints might arise from a non-chattering requirement, continuity requirement, or requirement on correct switching between regions of the state space. A dynamic programming algorithm for assignment of motion primitives on boxes is addressed in Section IV-D; the salient point about this algorithm at this stage is that it only uses the discrete part of the MA.

In contrast, the continuous time part of the MA is used both for simulation of the closed-loop dynamics to verify that the motion primitives are well designed, as well as for the implementation of the low level feedback in real-time. The motion primitives are defined only on the canonical box to simplify the design. This simplification is possible because of the translational symmetry provided by Assumption III.1 and the fact that each box is a translation of . In simulation, a given motion primitive can cause output trajectories to reach certain faces of . If a face is reached, the output trajectory is interpreted as being reset to the opposite face and another motion primitive is selected to be implemented over (of course, the real experimental output trajectories do not undergo resets but move continuously from box to box in the output space). The selection of the next motion primitive is constrained by a combination of the previous motion primitive and the face of that is reached. The discrete transitions in the MA encode these constraints.

Definition IV.2.

Consider the system (1) satisfying Assumption III.1 and the box with lengths . The maneuver automaton (MA) is a tuple , where

State Space

is the hybrid state space, where is a finite set of nodes, each corresponding to a motion primitive.


, the same labels used in the OTS.


is a finite set of edges.

Vector Fields

is a function assigning a globally Lipschitz closed-loop vector field to each motion primitive . That is, for each , we have where is a feedback controller associated with .


assigns a bounded invariant set to each . We impose that . The set defines the region on which the vector field is defined. Note that there is no requirement that the invariant is a closed set.

Enabling Conditions

assigns to each edge , a non-empty enabling or guard condition . We require that . We make an additional requirement that lies on a certain face of determined by the label . Defining the face associated with as

we require that also .

Reset Conditions

assigns to each edge a reset map . We require that , where is the Hadamard product. This definition says that the -th output component is reset to the right face of , , if , reset to the left face if , and unchanged otherwise. Overall, resets of states are determined by the event and only affect the output coordinates in order to maintain output trajectories inside the canonical box .

Initial Conditions

is the set of initial conditions given by .

Example IV.1.

Suppose the system is a double integrator and the first state is the translationally invariant output . The box is simply a segment. Let , where Hold () stabilizes , Forward () increases , and Backward () decreases . Referring to Figure 4, if is the current motion primitive and reaches the right face of , then the event occurs and we may select or as the next motion primitive. To correctly implement the discrete evolution of the MA in the continuous state space, an invariant and feedback control must be associated with each motion primitive, while an enabling and reset condition must be associated with each edge; see Figure 5. Formal details are given in Section VII.

Fig. 4: The maneuver automaton edges for the double integrator dynamics with . There are three motion primitives: Hold (), Forward (), and Backward ().

Fig. 5: The closed-loop vector fields in the position-velocity state space for the Hold, Forward, and Backward motion primitives.

We now formulate assumptions on the motion primitives so that correct continuous time behavior is ensured at the low level for consistency with the high level. For each , define the set of possible events as

Assumption IV.1.
  • For all , .

  • For all such that and , .

  • For all such that and , if , then .

  • For all such that and , .

  • For all , .

  • For all , if then for all and , .

  • For all , if , then for all there exist (a unique) and (a unique) such that for all and for all , and .

Condition (i) disallows tautological chattering behavior that arises by erroneously interpreting continuous evolution of trajectories in the interior of as “discrete transitions” of the MA (see Section V for definitions). Condition (ii) imposes that guard sets are independent of the next motion primitive. Since guard sets arise as the set of exit points of closed-loop trajectories from under a given motion primitive, it is reasonable that these exit points should depend only on the current motion primitive , and not on the choice of next motion primitive. Condition (iii) imposes that all guard sets corresponding to different labels are non-overlapping. This ensures that when the continuous trajectory reaches a guard , then it is unambiguous which edge of the MA is taken next; namely . Conditions (v), (vi), and (vii) are placed to guarantee that the MA is non-blocking. These conditions are based on known results in the literature [21]; see Lemma V.1. In order for condition (vii) to make sense, there must exist a unique label and a unique time for an MA trajectory to reach a guard set. First, we have uniqueness of solutions since the vector fields are globally Lipschitz. Second, the unique MA trajectory can only reach one guard set by condition (iii); this in turn means there is a unique . Obviously there exists a unique time to reach the guard set. Conditions (vi) and (vii) work together to state that either all trajectories do not leave, or all trajectories do eventually leave. Referring to Figure 5, all closed-loop state trajectories within the invariant of reach the guard set shown in green on the right. For either choice of next feasible motion primitive, or , trajectories enter the next invariant on the left due to the reset. Finally, condition (iv) eliminates potential chattering Zeno behavior, see Remark V.1.

Remark IV.2.

We make several further observations about the MA.

(i) The MA is non-deterministic in the sense that given and , there may be multiple such that . The discrete part of the MA is non-deterministic in a second sense: for each , the cardinality of the set may be greater than one. The latter situation corresponds to the fact that for different initial conditions of the continuous part, the associated output trajectories can reach different guard sets. In essence, which guard is enabled is interpreted, at the high level, as an uncontrollable event [34]. Remark IV.3 further illustrates these two types of non-determinism in the case of the PA.

(ii) The set of events in the MA correspond to the same events in the OTS. This correspondence is used in the product automaton PA, described in the next section, to synchronize transitions in the MA with transitions in the OTS. The interpretation is that when a continuous trajectory of the MA (over the box ) undergoes a reset with the label , the associated continuous trajectory of (1) in the box enters a neighboring box with the offset . Obviously, this interpretation assumes that the vector of box lengths is the same in both OTS and MA.  

Iv-C Product Automaton

In this section we introduce the product automaton (PA). It is constructed as the synchronous product of the OTS and the discrete part of the MA, namely . The purpose of the PA is to merge the constraints on successive motion primitives with the constraints on transitions in the OTS in order to enforce feasible and safe motions. As such, it captures the overall feasible motions of the system – any high level plan must adhere to these feasible motions.

Definition IV.3.

We are given an OTS and an MA satisfying Assumption IV.1. We define the product automaton (PA) to be the tuple , where

State Space

is a finite set of PA states. A PA state satisfies the following: if there exists and such that , then there exists such that . That is, if all faces that can be reached by motion primitive lead to a neighboring box of the box associated with location of the OTS.


is the same set of labels used by the OTS and the MA.


is a set of directed edges defined according to the following rule. Let , , and . If and , then .

Final Condition

is the set of final PA states.

Remark IV.3.

Formally an automaton is said to be non-deterministic if there exists a state with more than one outgoing edge with the same label. The PA is non-deterministic. First, consider a PA state . Because the MA allows for more than one feasible next motion primitive such that , the PA will also have multiple next PA states such that . Second, there can be multiple possible labels such that for some . Thus, the PA inherits the two types of non-determinism of the MA that we discussed in Remark IV.2. For example, consider the PA fragment in Figure 6. For the first type of non-determinism, observe that there are two PA edges and with the same label. For the second type, observe that there are two possible events from , each with its own set of PA edges. Note also some additional structure: since the OTS is deterministic, the box state is in both and , corresponding to the OTS edge .  

Fig. 6: A fragment of a generic PA, showing a state and its neighbours.

Iv-D High-Level Plan

In this section we formulate the notion of a control policy on the PA, which gives a rule for selecting subsequent PA states by choosing the next motion primitive. Informally, the objective of the high level plan is to produce a control policy and find a set of initial PA states such that a goal PA state is eventually reached. To this end, in this section we also develop a Dynamic Programming Principle (DPP) suitable for use on the PA. Because of the two types of non-determinism of the PA, existing algorithms cannot be applied directly [5, 33]. By adapting the algorithm in [5], we obtain two formulations of the DPP, one of which is more computationally efficient as it exploits certain structure in the PA; further details are provided in Remark IV.5.

First some notation will be useful. Recall from (2), given , is the set of all labels on outgoing edges starting at . Similarly, is the set of all labels on outgoing edges starting at . That is,

Now we formalize the semantics of the PA. A state of the PA is a pair where is a location in the OTS and is a motion primitive. A run of is a finite or infinite sequence of states , with and for each , there exists such that . We define the length of a run to be ; for infinite runs is defined to be . We consider a subset of runs starting at that satisfy one further property. If the run is infinite, then if . Instead if the run is finite, then if and additionally, . It is the latter requirement – that the last PA state of a finite run may not have outgoing edges in the PA – which is of interest. The interpretation is that we regard the event labels between PA states as uncontrollable, so if any event is possible, then it must occur eventually. Thus without loss of generality, each run is the prefix of a run . Further elaboration is given in Remark IV.4 (ii).

Given and , the set of admissible motion primitives is

More generally, given and , the set of admissible motion primitives at is

Next we introduce the notion of a control policy. Given and , an admissible control assignment at is a vector

where , or equivalently . Notice that is a vector whose dimension varies as a function of the cardinality of the set . An admissible control policy is a map that assigns an admissible control assignment at each . Thus, for each and , . The set of all admissible control policies is denoted by .

Consider an admissible control policy and a state . We denote the set of runs in induced by as . Formally, if , and for all and , . Similarly, we denote the subset of runs in that eventually reach a state in as . Formally, if there exists an integer such that . For , we define

Next we define an instantaneous cost , which satisfies for all , and a terminal cost . Now consider the run with , , and . We define a cost-to-go by

Remark IV.4.

There are several notable features of our formula for the cost-to-go.

(i) For a given , there may be multiple runs due to the (second, non-standard type of) non-determinism of the PA. As such, we assume the worst case and take the maximum over in the cost-to-go. Moreover, we require for a finite cost-to-go so that is well-defined and all runs starting at eventually reach .

(ii) We have assumed that finite runs must terminate on PA states that have no outgoing edges. Suppose we included in finite prefixes of (finite or infinite) runs. These necessarily would be finite runs with final PA states that have outgoing edges. Then if we take a finite or infinite run that eventually reaches a goal PA state, certain finite prefixes of that run may not yet have reached a goal PA state, and we would get and an infinite cost-to-go. This anomaly arises from creating an artificial situation in which not all runs starting at an initial PA state reach a goal PA state because we included (unsuccessful) finite prefixes of successful runs.

(iii) The cost-to-go function also accounts for infinite runs by using the variable to record the first time a goal PA state is reached and by taking the cost only over the associated prefix of the infinite run. Although our primary focus is on reach-avoid specifications, in which finite runs terminate on goal PA states with no outgoing edges, infinite runs allow us to extend our framework to a fragment of LTL where, for example, a goal PA state is reached always eventually; see Remark V.3 for further details.

Fig. 7: At the top, a PA is depicted for a single output system having three motion primitives over three boxes . The numbers on the edges (shown as arrows) are the corresponding labels. The bottom pictures show the reduced set of transitions induced by control polices .
Example IV.2.

Consider the PA shown at the top of Figure 7 corresponding to a single output system with the three motion primitives from Example IV.1 over three boxes . Suppose that for all and that for all .

First consider the feasible control policy with the control assignments: , , , and . The bottom left of Figure 7 shows how the control policy trims away possible edges in the PA. Now suppose that . Choosing the initial condition and under the assumption that we do not include finite runs that terminate at PA states with outgoing edges, we can see that consists of only the single infinite run . Even though this run is infinite, , , and . Similarly, we compute , , , and . In contrast, the feasible control policy shown on the bottom right of Figure 7 only contains finite runs.  

Next we define the value function to be

The value function satisfies a dynamic programming principle (DPP) that takes into account the non-determinacy of ; see [5] where a slightly different result is proved. The proof is found in the appendix.

Theorem IV.1.

Consider and suppose . Then satisfies


where , , , and .

Notice that for all such that , (since there can be no paths to the goal). Also, for all , .

Remark IV.5.

In (3) of Theorem IV.1, it is shown that can be computed using the local information of instead of using all of . In (4), the result is taken one step further by showing that can be calculated using only for each . The benefit of (4) becomes clear when we compare the cardinality of the sets over which the minimizations occur. Given , let . In (3) the minimization is over , and therefore the cardinality of the minimization set is . In (4) the minimization is over for each , and therefore the cardinality of the set is . While both (3) and (4) can be used to compute , in general (4) will be more computationally efficient.

Corollary IV.1.

Consider the control policy such that for all , and

where , and . Then is an optimal control policy such that for all , .

Figure 8 shows a possible control policy for the scenario in Figure 3. Since there are two outputs, we use the motion primitives from Example IV.1 in each output; formal details are given in Section VI. The control policy was hand-computed. Notice that different routes may be taken from the same product state depending on the face reached, but ultimately the control policy ensures that all paths lead to the goal.

Fig. 8: This figure shows a discrete control strategy for the scenario shown in Figure 3.

V Main Results

In this section we present our main results on a solution to Problem III.1. Our final result combines the notion of a control policy at the high level with feedback controllers executing correct continuous time behavior at the low level. First, in accordance with the reach-avoid objective (see Remark IV.4 (iii)), we assume the existence of motion primitives that can stabilize trajectories within a given box, that is, there exists such that . We restrict the final PA states to be goal OTS states equipped with such motion primitives


Now suppose we have an admissible control policy derived using Theorem IV.1 or otherwise with as above. We present a complete solution to Problem III.1 including an initial condition set , a feedback control , and conditions on the motion primitives so that the reach-avoid specifications of Problem III.1 are met.

First we specify the initial condition set . The set of feasible initial PA states is


That is, a feasible initial PA state satisfies that every run (induced by the control policy) starting at the PA state eventually reaches a goal PA state.

Now consider a state . It can be used as an initial state of the system if there is some for which the state is both in the box and in the invariant of . Recall that for all and ,