Coordination strategies for multi-agent systems (MAS) have been traditionally designed under the assumption that state feedback is continuously available. However, continuous communication over a network is often impractical, especially in mobile robot applications where shadowing and fading in the wireless communication can cause unreliability, and each agent has limited energy resources [goldsmith2005wireless, Bo_ADHS].
Due to these constraints, there is a strong interest in developing MAS coordination methods that rely on intermittent information over a communication network. The results in [Wang.Lemmon2009, Meng.Chen2013, Cheng.Kan.ea2017, Li.Liao.ea2015, Heemels.Donkers2013, tabuada2007event] develop event-triggered and self-triggered controllers that utilize sampled data from networked agents only when triggered by conditions that ensure desired stability and performance properties. However, these results require a network represented by a strongly connected graph to enable agent coordination. This requirement of a strongly connected network induces constraints on the motion of the individual agents and additional maneuvers that may deviate from their primary purpose. Event-triggered and self-triggered control methods can also be used to coordinate the agents that communicate with a central base station or cloud intermittently as in [Nowzari_Pappas2016], where submarines intermittently surface to obtain state information about themselves and their neighbors from a cloud. However, such a coordination strategy also requires additional maneuvers from the submarines that detract from their primary purpose.
Depending on the application and/or environment, some of the agents in a MAS may not be equipped with absolute position sensors. In such scenarios, the results in [Wang.Lemmon2009, Meng.Chen2013, Cheng.Kan.ea2017, Li.Liao.ea2015, Heemels.Donkers2013, tabuada2007event] do not apply. Therefore, there is a need for distributed methods that can coordinate agents without absolute position sensors while using only intermittent information. Moreover, such methods should not require agents to perform additional maneuvers to ensure the connectivity of the network. In [Zegers.Chen.ea2019], a set of followers operating with inaccurate position sensors is able to reach consensus at a desired state while a leader intermittently provides each follower with state information. By introducing a leader, the followers are able to perform their tasks without additional maneuvers to obtain state information.
Building on the work of [Zegers.Chen.ea2019], we adopt a leader-follower scheme, where the MAS is modeled as a switched system [Wu2019SwitchedLS, Bo2018CSL]. In the illustrative example shown in Fig. 1, the three followers need to reach consensus at the center of the green feedback region, and one leader agent provides intermittent state information to each follower. To guarantee the stability of the switched system and consensus of the followers, we derive maximum and minimum dwell-time conditions that constrain the intervals between consecutive time instants at which the leader should provide state information to the same follower.
The maximum and minimum dwell-time conditions can be encoded by metric temporal logic (MTL) specifications [Ouaknine2005]. Such specifications have also been used in many robotic applications for time-related requirements [zhe_ijcai2019]. Furthermore, as the leader typically consumes more energy and is more safety-critical because of its high-quality sensing, communication, and mobility equipment, the leader is likely required to satisfy additional MTL specifications for practical constraints such as charging its battery and staying in specific regions. In the example shown in Fig. 1, the leader needs to satisfy an MTL specification “reach the charging station or in every 6 time units and always stay in the yellow region ”.
We design the followers’ controllers such that guarantees on the stability of the switched system and consensus of the followers hold, provided that the maximum and minimum dwell-time conditions are satisfied. Then we synthesize the leader’s controller to satisfy the MTL specifications that encode the maximum and minimum dwell-time conditions and the additional practical constraints. There is a rich literature on controller synthesis subject to temporal logic specifications [KHFP, Nok2012, BluSTL, sayan2016, Zhiyu2017ACC, Zhiyu2017CDC, Bo2019, zhe_advisory, zheACC, zhe_control, zheACC2018]. For linear or switched linear systems, the controller synthesis problem can be converted into a mixed-integer linear programming (MILP) problem [BluSTL, sayan2016]. Additionally, as the followers are not equipped with absolute position sensors, we design an observer to estimate the followers’ states. The state estimates can change abruptly due to the intermittent communication of state information. Therefore, we solve the MILP problem iteratively to account for such abrupt changes.
We provide an implementation of the proposed method on a simulation case study with three mobile robots as the followers and one quadrotor as the leader. The results in two different scenarios show that the synthesized controller can lead to satisfaction of the MTL specifications, while achieving the stability of the switched system and consensus of the followers.
II Background and Problem Formulation
II-A Agent Dynamics
Consider a multi-agent system (MAS) consisting of followers (footnote 1: denotes the set of positive integers) indexed by and a leader indexed by . Let the time set be . Let denote the position of the leader and follower , respectively. Let and denote the state of the leader and follower , respectively. The linear time-invariant dynamics of the leader and follower are
where , , . Here, denote the control inputs of the leader and follower , respectively, and is an exogenous disturbance. For simplicity, we assume that and has full row rank.
II-B Sensing and Communication
Each follower is equipped with a relative position sensor and hardware to enable communication with the leader. Since the followers lack absolute position sensors, they are not able to localize themselves within the global coordinate system. Nevertheless, the followers can use their relative position sensors to enable self-localization relative to their initially known locations. However, relative position sensors like encoders and inertial measurement units (IMUs) can produce unreliable position information since e.g., wheels of mobile robots may slip and IMUs may generate noisy data. Hence, the term in (1) models the inaccurate position measurements from the relative position sensor of follower as well as any external influences from the environment. Navigation through the use of a relative position sensor results in dead-reckoning, which becomes increasingly more inaccurate with time if not corrected. On the other hand, the leader is equipped with an absolute position sensor and hardware to enable communication with each follower. Unlike a relative position sensor, an absolute position sensor allows localization of the agents within the global coordinate system.
The followers’ task is to reach consensus to a predetermined state . A feedback region (see Fig. 1) centered at the position with radius is capable of providing state information to each follower once . The leader’s task is to provide state information to each follower while they navigate to with the intermittent state information. Both the leader and the followers are equipped with digital communication hardware where communication is only possible at discrete time instants. Let and denote the communication and sensing radii of each agent, respectively. For simplicity, let .
The leader provides state information to the follower (i.e., services the follower ) if and only if and the communication channel of the follower is on. We define the communication switching signal for follower as if the communication channel is on for follower ; and if the communication channel is off for follower . We use to indicate the servicing time instant for follower . Hence, the servicing time instant for follower is (footnote 3: for , is the initial time; for simplicity we take )
where denotes the conjunction logical connective.
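The servicing rule can be illustrated on sampled data. The following is a minimal sketch rather than the implementation used in this work: the function name `servicing_instants`, the sample trajectories, and the radius value are hypothetical, and the sketch assumes that a sample counts as a servicing instant whenever the leader-follower distance is within the communication radius and the follower's switching signal is on.

```python
import math


def servicing_instants(times, leader_pos, follower_pos, channel_on, radius):
    """Return the sample times at which the leader services the follower.

    Assumption: a sample is a servicing instant when the leader is within
    `radius` of the follower AND the follower's channel is on (conjunction).
    """
    instants = []
    for t, p_leader, p_follower, on in zip(times, leader_pos, follower_pos, channel_on):
        if math.dist(p_leader, p_follower) <= radius and on:
            instants.append(t)
    return instants


# Hypothetical data: the leader approaches a stationary follower at (3, 0).
times = [0.0, 0.5, 1.0, 1.5, 2.0]
leader = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
follower = [(3.0, 0.0)] * 5
channel = [True, True, True, False, True]

result = servicing_instants(times, leader, follower, channel, radius=1.0)
print(result)
```

At the sample where the leader sits exactly on the follower, the channel is off, so that sample is not a servicing instant even though the distance condition holds.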
II-C State Observer and Error Dynamics
The followers, not equipped with absolute position sensors, implement the following model-based observer to estimate the state of each follower :
where denotes the estimate of .
Then we can obtain the position estimate of follower as
To facilitate the analysis, we define the following two error signals
Similar to [Zegers.Chen.ea2019], we adopt the following assumptions.
The state estimate is initialized as for all .
The leader has full knowledge of its own state for all and the initial state for all .
The disturbance is bounded, i.e., for all , where is a known constant.
The control of follower is as follows:
such that denotes the pseudo-inverse of and is a user-defined parameter. Since has full row rank (see Section II-A), , where is the identity matrix.
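The full-row-rank property used here can be checked numerically. The sketch below uses NumPy's `numpy.linalg.pinv`; the matrix `B` is a hypothetical example chosen only because it has full row rank, and it is not the input matrix of any agent in this work.

```python
import numpy as np

# Hypothetical 2x3 input matrix with full row rank (rank 2).
B = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0]])
assert np.linalg.matrix_rank(B) == B.shape[0]

# Moore-Penrose pseudo-inverse; for full row rank it is a right inverse.
B_pinv = np.linalg.pinv(B)
right_product = B @ B_pinv   # 2x2, equals the identity
left_product = B_pinv @ B    # 3x3, a projection, not the identity in general

print(np.allclose(right_product, np.eye(2)))
print(np.allclose(left_product, np.eye(3)))
```

Only the right product recovers the identity, which is why the full-row-rank assumption matters when the pseudo-inverse is used to cancel the input matrix in the follower control.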
At each servicing time instant , with the feedback provided by the leader, the state estimate of follower immediately resets to . Therefore, the state estimates follow the dynamics of switched systems [Xuping].
is the zero column vector. Substituting (2) into the time-derivative of (5) yields
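The interplay between dead-reckoning drift and the reset at servicing instants can be illustrated with a scalar simulation. This is a sketch under strong simplifying assumptions that are not taken from the paper: a one-dimensional single integrator (true state driven by input plus a constant disturbance, estimate driven by the input only), a proportional control on the estimate, forward-Euler integration, and servicing at fixed sample indices. All names and numerical values are hypothetical.

```python
def simulate(x0, x_goal, gain, disturbance, dt, steps, service_steps):
    """Return the estimation-error magnitude |x_hat - x| after each step.

    Assumed model (illustrative only):
      true state: x'     = u + disturbance
      estimate:   x_hat' = u            (model-based, no disturbance knowledge)
      control:    u      = gain * (x_goal - x_hat)
    """
    x = x0
    x_hat = x0  # estimate initialised at the true state
    errors = []
    for n in range(steps):
        if n in service_steps:
            x_hat = x  # reset: the leader supplies the true state
        u = gain * (x_goal - x_hat)
        x += dt * (u + disturbance)
        x_hat += dt * u
        errors.append(abs(x_hat - x))
    return errors


# Servicing every 10 samples versus never servicing.
serviced = simulate(0.0, 1.0, 1.0, 0.2, 0.1, 50, set(range(0, 50, 10)))
unserviced = simulate(0.0, 1.0, 1.0, 0.2, 0.1, 50, set())
print(max(serviced), unserviced[-1])
```

In this toy model the error grows by the step size times the disturbance at every sample, so periodic resets keep it bounded by the growth accumulated over one servicing interval, while the unserviced error keeps growing. This mirrors the role of the maximum dwell-time condition, without reproducing its actual bound.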
II-D Metric Temporal Logic (MTL)
To achieve the stability of the switched system and consensus of the followers while satisfying the practical constraints of the leader, the requirements of the MAS can be expressed as MTL specifications (see details in Section IV). In this subsection, we briefly review MTL interpreted over discrete-time trajectories [FainekosMTL]. The domain of the position of the agents is denoted by . The domain is the Boolean domain, and the time index set is . With slight abuse of notation, we use to denote the discrete-time trajectory as a function from to . A set is a set of atomic propositions, each mapping to . The syntax of MTL is defined recursively as follows:
where stands for the Boolean constant True, is an atomic proposition, (negation), (conjunction), (disjunction) are standard Boolean connectives, is a temporal operator representing “until” and is a time interval of the form (, ). We can also derive two useful temporal operators from “until” (), which are “eventually” and “always” . We define the set of states that satisfy the atomic proposition as .
Next, we introduce the Boolean semantics of MTL for trajectories of finite length in the strong and the weak view, which are modified from the literature of temporal logic model checking and monitoring [Eisner2003, KupfermanVardi2001, Ho2014]. We use to denote the time instant at time index and to denote the value of at time . In the following, (resp. ) means the trajectory strongly (resp. weakly) satisfies at time index , (resp. ) means fails to strongly (resp. weakly) satisfy at time index .
The Boolean semantics of MTL for trajectories of finite length in the strong view is defined recursively as follows [zhe_advisory]:
The Boolean semantics of MTL for trajectories of finite length in the weak view is defined recursively as follows [zhe_advisory]:
Intuitively, if a trajectory of finite length can be extended to infinite length, then the strong view indicates that the truth value of the formula on the infinite-length trajectory is already “determined” on the trajectory of finite length, while the weak view indicates that it may not be “determined” yet [Ho2014]. As an example, a trajectory cannot strongly satisfy at time 0, but it can strongly violate at time 0, i.e., is possible.
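The difference between the two views can be made concrete with a small evaluator for finite trajectories. The sketch below is an illustration, not the semantics as stated in this paper: it assumes the common convention in which an atomic proposition evaluated beyond the end of the trajectory is false in the strong view and true in the weak view, and negation swaps the two views. The tuple encoding of formulas and all helper names are hypothetical, and time intervals are assumed to be bounded integer index ranges.

```python
TRUE = ("true",)


def sat(phi, traj, k, strong):
    """Check whether traj satisfies phi at index k in the strong or weak view."""
    op = phi[0]
    if op == "true":
        return True
    if op == "ap":
        if k >= len(traj):
            return not strong  # beyond the trajectory: strong fails, weak passes
        return phi[1](traj[k])
    if op == "not":
        return not sat(phi[1], traj, k, not strong)  # negation swaps the view
    if op == "and":
        return sat(phi[1], traj, k, strong) and sat(phi[2], traj, k, strong)
    if op == "or":
        return sat(phi[1], traj, k, strong) or sat(phi[2], traj, k, strong)
    if op == "until":
        _, a, b, left, right = phi
        for j in range(k + a, k + b + 1):
            if sat(right, traj, j, strong) and all(
                sat(left, traj, i, strong) for i in range(k, j)
            ):
                return True
        return False
    raise ValueError(f"unknown operator: {op}")


def eventually(a, b, phi):
    return ("until", a, b, TRUE, phi)


def always(a, b, phi):
    return ("not", eventually(a, b, ("not", phi)))


traj = [0.0, 1.0, 2.0]  # three samples only
loose = always(0, 5, ("ap", lambda x: x < 10.0))
tight = always(0, 5, ("ap", lambda x: x < 2.0))

print(sat(loose, traj, 0, strong=True), sat(loose, traj, 0, strong=False))
print(sat(tight, traj, 0, strong=False))
```

The trajectory is shorter than the window of the "always" operator, so `loose` is weakly but not strongly satisfied, whereas `tight` is already violated by an observed sample and therefore fails even in the weak view.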
For an MTL formula , the necessary length is defined recursively as follows [Maler2004]:
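A recursive computation of the necessary length can be sketched as follows. The recursion used here is an assumption based on the usual convention for bounded temporal operators: atomic propositions contribute zero, Boolean connectives take the maximum over their subformulas, and an "until" over the index interval [a, b] adds b to the maximum of its operands. The tuple encoding of formulas and the helper names are hypothetical.

```python
def necessary_length(phi):
    """Number of future time indices needed to decide phi (assumed recursion)."""
    op = phi[0]
    if op in ("true", "ap"):
        return 0
    if op == "not":
        return necessary_length(phi[1])
    if op in ("and", "or"):
        return max(necessary_length(phi[1]), necessary_length(phi[2]))
    if op == "until":
        _, a, b, left, right = phi
        return max(necessary_length(left), necessary_length(right)) + b
    raise ValueError(f"unknown operator: {op}")


def eventually(a, b, phi):
    return ("until", a, b, ("true",), phi)


def always(a, b, phi):
    return ("not", eventually(a, b, ("not", phi)))


p = ("ap", "in_charging_station")  # hypothetical atomic proposition label
spec = always(0, 10, eventually(0, 3, p))
length = necessary_length(spec)
print(length)
```

Nested temporal operators add their upper bounds, so a trajectory shorter than this length can only be judged in the weak view.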
II-E Problem Statement
We now present the problem formulation for the control of the MAS with intermittent communication and MTL specifications.
Design the control inputs for the leader ( denotes the control input at time index ) such that the following characteristics are satisfied while minimizing the control effort (footnote 4: denotes the 2-norm):
Correctness: A given MTL specification is weakly satisfied by the trajectory of the leader.
Stability: The error signal is uniformly bounded, and the error signal is asymptotically regulated (footnote 5: the error signal is asymptotically regulated if as for each follower ).
Consensus: The states of the followers asymptotically reach consensus to .
III Stability and Consensus Analysis
In this section, we provide the conditions for achieving the stability of the switched system and the consensus of the followers. Such conditions include maximum (see Theorem 1) and minimum (see Theorem 2) dwell-time conditions on the intervals between consecutive time instants at which the leader should provide state information to the same follower.
Let be a user-defined parameter. Then, the error signal in (4) for follower is uniformly bounded, i.e., for all , provided the leader satisfies the maximum dwell-time condition
for all .
Let Consider the common Lyapunov functional candidate
Invoking the Comparison Lemma [Khalil, Lemma 3.4] on (14) over yields
Substituting (12) into (15) yields Now, define by Since for all and where then for all If then for all Hence, the corresponding dwell-time condition is given by (11). Since where and over each provided the leader continuously satisfies the dwell-time condition in (11), then for all .
Suppose the leader satisfies the dwell-time condition in (11) for all . Consider the common Lyapunov functional
Observe that is finite since is a measured quantity provided by the leader where (20) implies is bounded over Moreover, the RHS of follower dynamics in (1) are Lebesgue measurable and locally essentially bounded. Therefore, there exists a Filippov solution that is absolutely continuous over Now, consider
The jump discontinuity of at is given by where is defined by (10) and denotes the limit of as from the left. Since and is continuous over , then by Theorem 1 . It then follows that the magnitude of the jump discontinuity is bounded by
Since is strictly decreasing over by (20), then for all The reset map in (2) may induce an instantaneous growth in (5) at where (21) implies Therefore, the minimum dwell-time condition given by (16) can ensure that , which is valid when . Observe that there exists some such that . Provided the leader satisfies the maximum dwell-time condition in (11) for all , then Hence, by selecting , it follows that , and follower will be inside the feedback region after . Moreover, and for all . Thus, as Since (17) does not have a restricted domain and is radially unbounded, then the stability result is global.
The proof of Theorem 2 formally excludes Zeno behavior.
Let . By Theorem 1, if the maximum dwell-time condition in (11) is satisfied, then for all . By Theorem 2, if the minimum dwell-time condition in (16) is satisfied for all (), then there exists a time such that . Therefore, as . Then for , follower will be inside the feedback region where . Moreover, as , so as .
IV Controller Synthesis with Intermittent Communication and MTL Specifications
In this section, we provide the framework and algorithms for controller synthesis of the leader to satisfy the maximum and minimum dwell-time conditions and the practical constraints. The controller synthesis is conducted iteratively because the state estimates for the followers are reset to the true state values whenever they are serviced by the leader, and thus the control inputs need to be recomputed with the reset values.
We assume that the communication is only possible at discrete time instants, with time periods apart and controlled by the communication switching signal . We define the discrete time set , where for . The maximum dwell-time in (11) for robot is in the interval and the minimum dwell-time in (16) is in the interval . We use the following MTL specifications for encoding the maximum dwell-time condition and the minimum dwell-time condition ( is a user-defined parameter):
where means “for any follower , the leader needs to be within distance from the estimated position of the follower at least once in any time periods”, and means “each time the leader is within distance from the estimated position of the follower , it should not be within distance from the estimated position of the follower again for the next time periods”.
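The two verbal readings above can be checked directly on a sampled proximity signal. The sketch below is illustrative only: `close[k]` is a hypothetical Boolean sequence that is true when the leader is within the required distance of a follower's estimated position at time index `k`, and `n_max` and `n_min` stand for the numbers of time periods in the two conditions. Windows that run past the end of the sequence are not counted as violations, in the spirit of the weak view.

```python
def max_dwell_ok(close, n_max):
    """True if every complete window of n_max periods contains a visit."""
    return all(any(close[k:k + n_max + 1]) for k in range(len(close) - n_max))


def min_dwell_ok(close, n_min):
    """True if no visit is followed by another visit within n_min periods."""
    return all(
        not any(close[k + 1:k + n_min + 1])
        for k, visited in enumerate(close)
        if visited
    )


# Hypothetical proximity signal: visits at indices 0, 3 and 6.
close = [True, False, False, True, False, False, True, False]

print(max_dwell_ok(close, n_max=3), min_dwell_ok(close, n_min=2))
print(max_dwell_ok(close, n_max=1), min_dwell_ok(close, n_min=3))
```

With visits every three periods, the signal meets a maximum dwell time of three periods and a minimum dwell time of two, but fails if the maximum is tightened to one period or the minimum is raised to three.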
The leader also needs to satisfy an MTL specification for the practical constraints. One example of is as follows:
which means “the leader robot needs to reach the charging station or at least once in any time periods, and it should always remain in the region ”.
Combining , and , the MTL specification for the leader is .
We use to denote the formula modified from the MTL formula when is evaluated at time index and the current time index is . can be calculated recursively as follows (we use to denote the atomic predicate evaluated at time index ):
If the MTL formula is evaluated at the initial time index (which is the usual case when the task starts at the initial time), then the modified formula is .
Algorithm 1 shows the controller synthesis approach with intermittent communication and MTL specifications. The controller synthesis problem can be formulated as a sequence of mixed integer linear programming (MILP) problems, denoted as MILP-sol in Line 3 and expressed as follows:
where the time index is initially set as 0, is the number of time instants in the control horizon, , is the control input signal of the leader, the input values are constrained to , , , , , and are converted from , , , , and respectively for the discrete-time state-space representation, and are follower control inputs from (6). Note that we only require the trajectory to weakly satisfy as may be less than the necessary length .
At each time index , we check if there exists any follower that is being serviced (Line 5). If there are such followers, we update the state estimates of those followers with their true state values (Line 7). Then we modify the MTL formula as in (25) and the updated (Line 8). The MILP is solved for time with the updated state values and the modified MTL formula (Line 9). The previously computed leader control inputs are replaced by the newly computed control inputs from time index to (Line 10).
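The structure of this iterative procedure can be sketched in a few lines. The sketch is not Algorithm 1 itself: the MILP solve is replaced by a hypothetical greedy planner (`greedy_plan`) that steers a one-dimensional leader toward the estimate of the least recently serviced follower, the followers drift at a constant rate while their estimates stay fixed between resets, and all names and numbers are illustrative. Only the loop structure is meant to correspond to the text: detect serviced followers, reset their estimates, re-plan from the current index, and overwrite the remaining inputs.

```python
def greedy_plan(x_leader, target, horizon, u_max):
    """Hypothetical stand-in for the MILP solve: saturated steps toward target."""
    plan, x = [], x_leader
    for _ in range(horizon):
        u = max(-u_max, min(u_max, target - x))
        plan.append(u)
        x += u
    return plan


def pick_target(x_hat, last_service):
    """Estimate of the follower that was serviced least recently."""
    i = min(range(len(x_hat)), key=lambda j: last_service[j])
    return x_hat[i]


def run_loop(x_leader, x_true, x_hat, radius, u_max, drift, steps):
    x_true, x_hat = list(x_true), list(x_hat)
    last_service = [0] * len(x_true)
    plan = greedy_plan(x_leader, pick_target(x_hat, last_service), steps, u_max)
    log = []
    for k in range(steps):
        serviced = [i for i, x in enumerate(x_true) if abs(x_leader - x) <= radius]
        if serviced:
            for i in serviced:
                x_hat[i] = x_true[i]       # reset estimate to the true state
                last_service[i] = k
                log.append((k, i))
            target = pick_target(x_hat, last_service)
            # Re-plan from index k and overwrite the remaining inputs.
            plan[k:] = greedy_plan(x_leader, target, steps - k, u_max)
        x_leader += plan[k]
        x_true = [x + drift for x in x_true]  # unmodelled follower drift
    return log


log = run_loop(0.0, [2.0, -2.0], [2.0, -2.0],
               radius=0.6, u_max=1.0, drift=0.05, steps=12)
print(log)
```

Each entry of `log` is a pair of time index and follower index at which a reset occurred. Re-planning only at those indices reflects the observation that the estimates, and hence the constraints of the optimization problem, change abruptly only when a follower is serviced.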