Multiagent systems (MAS), consisting of multiple autonomous agents, are a widely adopted paradigm of intelligent systems. Game-based models and associated logics, as foundations of MAS, have received tremendous attention in recent years. The seminal work [Alur et al.2002] proposed concurrent game structures (CGS) as a model of MAS and alternating-time temporal logics (typically ATL and ATL*) as specification languages for expressing temporal goals. In a nutshell, a CGS consists of multiple players which represent autonomous agents, components and the environment, and describes how the MAS evolves according to the collective behavior of the agents. ATL/ATL*, an extension of the computation tree logics CTL/CTL*, features the coalition modality $\langle\langle A\rangle\rangle$. The formula $\langle\langle A\rangle\rangle\psi$ expresses the property that the coalition $A$ (i.e., the agent group) has a collective strategy to achieve a certain goal specified by $\psi$.
A series of extensions of ATL-like logics have been studied which take different agents’ abilities into account. These abilities typically include whether the agents can identify the current state of the system completely or only partially (perfect vs. imperfect information), and whether the agents can memorize the whole history of observations or only part of it (perfect vs. imperfect recall). Different abilities usually induce distinct semantics of the logics, which are indeed necessary because of the versatility of problem domains. These semantic variants and their model-checking problems have been subjects of active research for almost two decades; see, e.g., [Schobbens2004, Jamroga and van der Hoek2004, Ågotnes et al.2007, Dima and Tiplea2011, Bulling and Jamroga2011, Laroussinie and Markey2015].
While the agents’ abilities play a prominent role [Bulling and Jamroga2014], the semantics of ATL-like logics refers to them only implicitly. In other words, the logic per se does not specify what ability an agent has; instead, one infers the ability an agent requires by examining the goal specified in the logic. This approach, though elegant and valuable for understanding the relationship between different abilities, suffers from a few shortcomings. (1) From the modelling perspective, it is common in practice that agents in an MAS vary in their abilities (for instance, agents modeling sensors may not identify the complete state of the system and thus can only use strategies with imperfect information). When building a model, these abilities ought to be encoded explicitly. Such modeling flexibility is not supported by existing formalisms. (2) From the semantic perspective, ATL-like logics may exhibit counter-intuitive semantics. The formula $\langle\langle A\rangle\rangle\psi$ is interpreted as: the coalition $A$ has a collective strategy to achieve the goal $\psi$ “no matter what the other agents do” rather than “no matter which strategies the other agents choose”; hence, it neglects the (multi-player) game nature of the evolution of an MAS. For instance, in the imperfect information/recall setting, only the agents in $A$ are assumed to use imperfect information/recall strategies, while the agents not in $A$ may still use perfect information and perfect recall strategies. Even worse, if coalition modalities are nested, the same agent may have different abilities to fulfill the objectives specified in different subformulae, which results in inconsistency. This phenomenon has also been mentioned in [Mogavero et al.2014, Cermák et al.2014], which proposed a strategic logic that does make explicit references to the strategies of all agents, including those not in $A$. However, in that strategic logic all agents must have the same abilities.
To summarize, we observe that the current approach, in which the agents’ abilities are left implicit at the semantic level of temporal formulae, impedes necessary modeling flexibility and often yields unpleasant (even counter-intuitive) semantics. Instead, we argue that attaching agents’ abilities at the syntactic level of system models would deliver a potentially better approach to overcome the aforementioned limitations. With this rationale in mind, we propose a new MAS model, named Agents’ Abilities Augmented Concurrent Game Structures (ACGS), which encompasses agents’ abilities explicitly.
We investigate ATL/ATL* over ACGS. We show that in general the new semantics of ATL/ATL* over ACGS is incomparable with the existing ones even if the underlying CGS models are the same. We also study the model-checking problem of ATL/ATL* over ACGS. This problem is in general undecidable, as the model-checking problem of ATL over CGS under the imperfect information and perfect recall (iR) setting is already undecidable [Dima and Tiplea2011]. However, we manage to show that the model-checking problem for ATL* (resp. ATL) on ACGS is 2EXPTIME-complete (resp. in PSPACE) when iR-strategies are disallowed. We implement our algorithms in a prototype tool and conduct experiments on some standard applications from the literature. The results confirm the feasibility of our approach.
The source code of our tool is available at [MCMAS-ACGS2018], which also includes some further experiments and a comparison of the ATL/ATL* semantics between CGS and ACGS.
2 Concurrent Game Structures
Given an infinite word $w = w_0 w_1 w_2 \cdots$, we denote by $w[i]$ the symbol $w_i$, by $w[\leq i]$ the prefix $w_0 \cdots w_i$, and by $w[\geq i]$ the suffix $w_i w_{i+1} \cdots$. Given a finite word $u = u_0 \cdots u_n$, we denote by $u[i]$ the symbol $u_i$ for $0 \leq i \leq n$, and by $\mathsf{lst}(u)$ the last symbol $u_n$.
Let $AP$ denote a finite set of atomic propositions. A concurrent game structure (CGS) is a tuple
$\mathcal{G} = (S, I, \mathsf{Ag}, \{\mathsf{Act}_a\}_{a\in\mathsf{Ag}}, \{\sim_a\}_{a\in\mathsf{Ag}}, P, \delta, \lambda)$,
where $S$ is a finite set of states; $I \subseteq S$ is a set of initial states; $\mathsf{Ag}$ is a finite set of agents; $\mathsf{Act}_a$ is a finite set of local actions of agent $a \in \mathsf{Ag}$; $\sim_a \subseteq S \times S$ is an epistemic accessibility relation (an equivalence relation); $P$ is a protocol function such that $\emptyset \neq P(s, a) \subseteq \mathsf{Act}_a$ for every $s \in S$ and $a \in \mathsf{Ag}$; $\delta: S \times \mathsf{Act} \to S$ is a transition function with $\mathsf{Act} = \prod_{a\in\mathsf{Ag}} \mathsf{Act}_a$ being the set of joint actions; and $\lambda: S \to 2^{AP}$ is a labeling function which assigns each state a set of atomic propositions. Given a joint action $\vec{\alpha} \in \mathsf{Act}$, we use $\vec{\alpha}(a)$ to denote the local action of agent $a$ in $\vec{\alpha}$.
A path $\pi = s_0 s_1 s_2 \cdots$ is an infinite sequence of states such that for every $i \geq 0$, $s_{i+1} = \delta(s_i, \vec{\alpha}_i)$ for some joint action $\vec{\alpha}_i$ with $\vec{\alpha}_i(a) \in P(s_i, a)$ for every $a \in \mathsf{Ag}$. Two sequences of states $u = s_0 s_1 \cdots$ and $u' = s'_0 s'_1 \cdots$ of the same length are indistinguishable for agent $a$, denoted by $u \approx_a u'$, if $s_i \sim_a s'_i$ for every $i$.
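To make these definitions concrete, here is a minimal sketch (our own illustration, not code from the paper; the toy model and all names are hypothetical) of a CGS in Python, with one-step successors computed from the protocol and transition function:

```python
from itertools import product

# A toy CGS with two agents and two states; all names are hypothetical.
agents = ["a1", "a2"]
protocol = {  # (state, agent) -> enabled local actions
    ("s0", "a1"): {"x", "y"}, ("s0", "a2"): {"x"},
    ("s1", "a1"): {"x"},      ("s1", "a2"): {"x"},
}
delta = {  # (state, joint action) -> next state
    ("s0", ("x", "x")): "s0",
    ("s0", ("y", "x")): "s1",
    ("s1", ("x", "x")): "s1",
}

def successors(state):
    """All states reachable in one step under some enabled joint action."""
    joint = product(*(protocol[(state, a)] for a in agents))
    return {delta[(state, ja)] for ja in joint}

def is_path_prefix(seq):
    """Check that a finite state sequence respects the transition function."""
    return all(t in successors(s) for s, t in zip(seq, seq[1:]))
```

A path of the CGS is then any infinite sequence all of whose finite prefixes satisfy `is_path_prefix`.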
Strategies. Typical agents’ abilities are captured by the following types of strategies [Schobbens2004]. For $a \in \mathsf{Ag}$:
An Ir-strategy is a function $\theta_a: S \to \mathsf{Act}_a$ with $\theta_a(s) \in P(s, a)$ for all $s \in S$;
An IR-strategy is a function $\theta_a: S^+ \to \mathsf{Act}_a$ with $\theta_a(u) \in P(\mathsf{lst}(u), a)$ for all $u \in S^+$;
An ir-strategy is the same as an Ir-strategy, but with the additional constraint that $\theta_a(s) = \theta_a(s')$ whenever $s \sim_a s'$;
An iR-strategy is the same as an IR-strategy, but with the additional constraint that $\theta_a(u) = \theta_a(u')$ whenever $u \approx_a u'$.
Intuitively, $i$ (resp. $I$) signals that agents can only observe partial information characterized via the epistemic accessibility relations (resp. complete information, with all epistemic accessibility relations being the identity relation), while $r$ (resp. $R$) signals that agents can make decisions based on the current observation only (resp. the whole history of observations). We will, slightly abusing notation, extend Ir-strategies and ir-strategies to the domain $S^+$ such that $\theta_a(u) = \theta_a(\mathsf{lst}(u))$ for all $u \in S^+$. We denote by $\Theta_a^x$, for $x \in \{ir, iR, Ir, IR\}$, the set of $x$-strategies of agent $a$.
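The uniformity constraints behind the imperfect-information types can be illustrated with a small sketch (our own; the states, observations and actions are hypothetical): an ir-strategy may depend only on the observation of the current state, while an iR-strategy may use the whole observation history:

```python
# Observations induced by the epistemic accessibility relation:
# s0 ~ s1 are indistinguishable, s2 is observed differently.
obs = {"s0": "o1", "s1": "o1", "s2": "o2"}

def ir_strategy(state):
    """ir: memoryless and uniform -- depends only on obs(state)."""
    return {"o1": "wait", "o2": "go"}[obs[state]]

def iR_strategy(history):
    """iR: may use the whole sequence of observations (perfect recall)."""
    observed = [obs[s] for s in history]
    return "go" if observed.count("o1") >= 2 else "wait"

# Uniformity: indistinguishable states receive the same action.
assert ir_strategy("s0") == ir_strategy("s1")
```

An Ir- or IR-strategy has the same shape but without the uniformity constraint, since perfect-information agents distinguish all states.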
Outcomes. A collective $x$-strategy of a set of agents $A \subseteq \mathsf{Ag}$ is a function $\vartheta_A$ assigning to each agent $a \in A$ an $x$-strategy $\vartheta_A(a) \in \Theta_a^x$. For $u \in S^+$ and $a \in A$, we denote the local action of agent $a$ by $\vartheta_A(a)(u)$, and the set $\mathsf{Ag} \setminus A$ by $\bar{A}$.
Given a state $s$, a collective strategy $\vartheta_A$ of $A$ and a collective strategy $\vartheta_{\bar A}$ of $\bar A$ yield a path $\pi = s_0 s_1 s_2 \cdots$, denoted by $\mathsf{play}(s, \vartheta_A, \vartheta_{\bar A})$, where $s_0 = s$ and for every $i \geq 0$, $s_{i+1} = \delta(s_i, \vec{\alpha}_i)$ with $\vec{\alpha}_i(a) = \vartheta_A(a)(s_0 \cdots s_i)$ for $a \in A$ and $\vec{\alpha}_i(a) = \vartheta_{\bar A}(a)(s_0 \cdots s_i)$ for $a \in \bar A$.
For every state $s$ and collective $x$-strategy $\vartheta_A$ of $A$, the outcome function is defined as follows:
$\mathsf{Out}_{\mathcal{G}}(s, \vartheta_A) = \{\mathsf{play}(s, \vartheta_A, \vartheta_{\bar A}) \mid \vartheta_{\bar A} \text{ is a collective } IR\text{-strategy of } \bar A\},$
i.e., the set of all possible plays that may occur when each agent $a \in A$ enforces its $x$-strategy $\vartheta_A(a)$ from the state $s$. The subscript $\mathcal{G}$ is dropped from $\mathsf{Out}_{\mathcal{G}}$ when it is clear from the context.
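When all strategies are memoryless, the outcome set is finite and can be enumerated directly: fix the coalition's strategy and range over all memoryless counter-strategies of the remaining agents. A sketch under these assumptions (the toy model and names are hypothetical):

```python
from itertools import product

# Two agents; agent "a" is the coalition, agent "b" is the adversary.
agents = ["a", "b"]
protocol = {
    ("s0", "a"): {"p"}, ("s0", "b"): {"l", "r"},
    ("s1", "a"): {"p"}, ("s1", "b"): {"l"},
    ("s2", "a"): {"p"}, ("s2", "b"): {"l"},
}
delta = {
    ("s0", ("p", "l")): "s1", ("s0", ("p", "r")): "s2",
    ("s1", ("p", "l")): "s1", ("s2", ("p", "l")): "s2",
}

def play_prefix(start, profile, steps):
    """Finite prefix of the unique play under a full memoryless profile."""
    path, s = [start], start
    for _ in range(steps):
        s = delta[(s, tuple(profile[a][s] for a in agents))]
        path.append(s)
    return path

def outcome_prefixes(start, strat_a, steps):
    """Prefixes of Out(start, strat_a): one play per adversary strategy."""
    states = sorted({s for (s, _) in protocol})
    choices = [sorted(protocol[(s, "b")]) for s in states]
    plays = set()
    for picks in product(*choices):  # every memoryless strategy of "b"
        profile = {"a": strat_a, "b": dict(zip(states, picks))}
        plays.add(tuple(play_prefix(start, profile, steps)))
    return plays
```

With perfect-recall adversaries the outcome set is defined analogously but cannot be enumerated this naively.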
3 Alternating-Time Temporal Logics
The alternating-time temporal logic ATL/ATL* is an extension of the branching-time logic CTL/CTL* obtained by replacing the existential path quantifier with coalition modalities [Alur et al.2002]. Intuitively, the formula $\langle\langle A\rangle\rangle\psi$ expresses that the set $A$ of agents has a collective strategy to achieve the goal $\psi$ no matter which strategies the agents in $\bar A$ choose. Formally, ATL* is defined by the following grammar:
$\varphi ::= p \mid \neg\varphi \mid \varphi \wedge \varphi \mid \langle\langle A\rangle\rangle\psi$
$\psi ::= \varphi \mid \neg\psi \mid \psi \wedge \psi \mid \mathsf{X}\psi \mid \psi\, \mathsf{U}\, \psi$
where $\varphi$ (resp. $\psi$) denotes state (resp. path) formulae, $p \in AP$, and $A \subseteq \mathsf{Ag}$.
The derived operators are defined as usual: $\varphi_1 \vee \varphi_2 \equiv \neg(\neg\varphi_1 \wedge \neg\varphi_2)$, $\mathsf{true} \equiv p \vee \neg p$, $\mathsf{F}\psi \equiv \mathsf{true}\, \mathsf{U}\, \psi$, $\mathsf{G}\psi \equiv \neg\mathsf{F}\neg\psi$, and $\psi_1\, \mathsf{R}\, \psi_2 \equiv \neg(\neg\psi_1\, \mathsf{U}\, \neg\psi_2)$. An LTL formula is an ATL* path formula in which the state subformulae are restricted to atomic propositions.
The semantics of ATL* is traditionally defined over CGS. When agents’ abilities are considered, it is often parameterized with a strategy type $x \in \{ir, iR, Ir, IR\}$, yielding the logic ATL*$_x$ [Bulling and Jamroga2014]. Formally, let $\mathcal{G}$ be a CGS and $s$ be a state of $\mathcal{G}$; the semantics of ATL*$_x$ (i.e., the satisfaction relation $\models_x$) is defined inductively as follows, where $\pi$ denotes a path and the Boolean cases are standard:
$\mathcal{G}, s \models_x \langle\langle A\rangle\rangle\psi$ iff there exists a collective $x$-strategy $\vartheta_A$ of the agents $A$ such that $\mathcal{G}, \pi \models_x \psi$ for all $\pi \in \mathsf{Out}(s, \vartheta_A)$;
$\mathcal{G}, \pi \models_x \psi_1\, \mathsf{U}\, \psi_2$ iff there exists $i \geq 0$ such that $\mathcal{G}, \pi[\geq i] \models_x \psi_2$ and $\mathcal{G}, \pi[\geq j] \models_x \psi_1$ for all $0 \leq j < i$;
$\mathcal{G}, \pi \models_x \mathsf{X}\psi$ iff $\mathcal{G}, \pi[\geq 1] \models_x \psi$.
Vanilla ATL. ATL is the sublogic of ATL* where each occurrence of the coalition modality is immediately followed by a temporal operator. Formally, ATL is defined by the following grammar:
$\varphi ::= p \mid \neg\varphi \mid \varphi \wedge \varphi \mid \langle\langle A\rangle\rangle \mathsf{X}\varphi \mid \langle\langle A\rangle\rangle(\varphi\, \mathsf{U}\, \varphi) \mid \langle\langle A\rangle\rangle(\varphi\, \mathsf{R}\, \varphi)$
where $p \in AP$ and $A \subseteq \mathsf{Ag}$.
Remark that the release operator $\mathsf{R}$ cannot be defined from the other operators in ATL with imperfect information [Laroussinie et al.2008], so it is included for completeness.
Given an ATL* formula $\varphi$, a CGS $\mathcal{G}$ and a strategy type $x$, the model-checking problem is to determine whether $\mathcal{G}, s \models_x \varphi$ or not, for each initial state $s$ of the CGS $\mathcal{G}$.
Some Semantic Issues. We observe that the semantics of ATL/ATL* refers to the agents’ abilities in an implicit manner. For the formula $\langle\langle A\rangle\rangle\psi$, the specified $x$-strategies only apply to the agents in $A$, while the agents in $\bar A$ could still choose strategies beyond $x$-strategies (e.g., $IR$-strategies). In other words, $A$ has a collective $x$-strategy to achieve $\psi$ no matter what the other agents do. When $x = IR$, as in the original work by [Alur et al.2002], this interpretation of $\langle\langle A\rangle\rangle\psi$ is plausible, as “no matter what the other agents do” is effectively the same as “no matter which strategies the other agents choose”. However, when $x$ is set to be more restricted than $IR$, the agents not in $A$ are still allowed to use $IR$-strategies.
As mentioned in the introduction, this results in a few shortcomings. From a modeling perspective, agents’ abilities should arguably be determined by the practical scenario: they should be fixed when the model is built, and all agents stick to their respective abilities independently of the logic formulae. From the semantic perspective, the existing semantics does not take into account the abilities of the agents not in $A$, and neglects the (multi-player) game nature of the evolution of an MAS. As a result, it may exhibit counter-intuitive semantics. For instance, for two nested formulae $\langle\langle A_1\rangle\rangle\psi_1$ and $\langle\langle A_2\rangle\rangle\psi_2$, the same agent may have different abilities for achieving $\psi_1$ and $\psi_2$. Let us consider an autonomous road-vehicle scenario to see why this is not ideal. There are several autonomous cars which can only observe partial information and have bounded memory. An MAS model consists of agents modeling the autonomous cars and an additional environment agent. We can reasonably assume that all the car agents use $ir$-strategies, while the environment uses $IR$-strategies. The property $\langle\langle A\rangle\rangle\psi$, for a coalition $A$ of cars, expresses that these cars can cooperatively achieve the goal $\psi$ no matter which strategies the other cars and the environment choose. Verifying this property under the existing semantics would allow the car agents outside $A$ to use $IR$-strategies. If the model satisfies the formula, the result is conclusive, i.e., the property holds for the system. However, if the model invalidates the formula, we cannot deduce that the property fails, because we have overestimated the abilities of the agents not in $A$ when evaluating it. In other words, for the formula $\langle\langle A\rangle\rangle\psi$ under type $x \neq IR$, it seems inappropriate to grant the agents not in $A$ the extra power of $IR$-strategies to potentially defeat the agents in $A$ when their actual abilities are much weaker.
4 Agents’ Abilities Augmented Concurrent Game Structures
In this section, we introduce agents’ abilities augmented concurrent game structures (ACGS in short), which explicitly equip each agent with a strategy type from $\mathsf{T} = \{ir, iR, Ir, IR\}$. As such, agents have fixed abilities throughout their lives for a given CGS. Formally, an ACGS is a pair $\mathcal{M} = (\mathcal{G}, \Lambda)$, where $\mathcal{G}$ is a CGS and $\Lambda: \mathsf{Ag} \to \mathsf{T}$ is a function that assigns a strategy type $\Lambda(a)$ to each agent $a$. We assume that, for each agent $a$ with $\Lambda(a) \in \{Ir, IR\}$, $\sim_a$ is the identity relation, as agents with perfect information should be able to distinguish any two distinct states. Paths of $\mathcal{M}$ are defined as for the CGS $\mathcal{G}$, but strategies and outcomes of $\mathcal{M}$ have to be redefined as follows.
Strategies and Outcomes. Let $A \subseteq \mathsf{Ag}$ be a set of agents. A collective strategy of $A$ in $\mathcal{M}$ is a function $\vartheta_A$ that assigns to each agent $a \in A$ a $\Lambda(a)$-strategy $\vartheta_A(a) \in \Theta_a^{\Lambda(a)}$.
Given a state $s$, for every collective strategy $\vartheta_A$ of the set $A$ of agents, the outcome $\mathsf{Out}_{\mathcal{M}}(s, \vartheta_A)$ is the set of all possible paths that may occur when each agent $a \in A$ enforces its $\Lambda(a)$-strategy from state $s$, while the other agents $b \in \bar A$ can only choose $\Lambda(b)$-strategies instead of general $IR$-strategies. Formally, $\mathsf{Out}_{\mathcal{M}}(s, \vartheta_A)$ is defined as
$\mathsf{Out}_{\mathcal{M}}(s, \vartheta_A) = \{\mathsf{play}(s, \vartheta_A, \vartheta_{\bar A}) \mid \vartheta_{\bar A} \text{ is a collective strategy of } \bar A \text{ in } \mathcal{M}\}.$
We will omit the subscript $\mathcal{M}$ from $\mathsf{Out}_{\mathcal{M}}$ when it is clear from the context.
Semantics of ATL/ATL*. The difference of outcomes between ACGS and CGS induces a semantics of ATL* on ACGS distinct from the one on CGS. Let $\mathcal{M}$ be an ACGS and $s$ be a state in $\mathcal{M}$; the semantics of ATL* on $\mathcal{M}$ is defined similarly to the one on CGS, except that the state formulae of the form $\langle\langle A\rangle\rangle\psi$ are interpreted as follows:
$\mathcal{M}, s \models \langle\langle A\rangle\rangle\psi$ iff there exists a collective strategy $\vartheta_A$ of $A$ such that $\mathcal{M}, \pi \models \psi$ for all $\pi \in \mathsf{Out}_{\mathcal{M}}(s, \vartheta_A)$.
Remark that this semantics takes into account whether the agents in $\bar A$ have perfect or imperfect information/recall.
Given an ACGS $\mathcal{M}$ and an ATL/ATL* formula $\varphi$, the model-checking problem is to determine whether $\mathcal{M}, s \models \varphi$ holds for every initial state $s$ of $\mathcal{M}$. Given a state formula $\varphi$, let $\mathsf{Sat}(\varphi)$ denote the set of all the states of $\mathcal{M}$ that satisfy $\varphi$.
The semantics of ATL/ATL* defined on ACGS differs from the one defined on CGS; indeed, the two are in general incomparable.
There are an ACGS $\mathcal{M} = (\mathcal{G}, \Lambda)$, an ATL/ATL* formula $\varphi$, and a type $x \in \mathsf{T}$ such that $\Lambda(a) = x$ for all $a \in \mathsf{Ag}$ and $\mathcal{G} \models_x \varphi$ holds, but $\mathcal{M} \not\models \varphi$.
Proof. Let us consider the CGS shown in Figure 1, with two agents and four states, one of which is the initial state. Choosing a suitable type assignment $\Lambda$ for the two agents, it is easy to see that the formula holds under the CGS semantics while it fails under the ACGS semantics.
5 Model-Checking Algorithms
It has been shown that the halting problem of Turing machines can be reduced to the model-checking problem of CGS against an ATL formula involving a single atomic proposition $p$ under the $iR$ setting [Dima and Tiplea2011]. By adapting the proof, we get that:
The model-checking problem for ACGS against an ATL/ATL* formula $\langle\langle A\rangle\rangle\psi$ is undecidable when $\Lambda(a) = iR$ for the agents $a \in A$.
By Theorem 1, we focus on the model-checking problem of ACGS by restricting the range of the function $\Lambda$ to $\{ir, Ir, IR\}$. In general, we propose model-checking algorithms which iteratively compute the sets of states satisfying state formulae, starting from the innermost subformulae. The main challenge is to compute $\mathsf{Sat}(\langle\langle A\rangle\rangle\psi)$. To this end, we first show how to compute $\mathsf{Sat}(\langle\langle A\rangle\rangle\psi)$ for a simple formula $\langle\langle A\rangle\rangle\psi$, and then present the more general algorithm. An ATL/ATL* formula $\langle\langle A\rangle\rangle\psi$ is called simple if $\psi$ is an LTL formula.
Let us fix an ACGS $\mathcal{M} = (\mathcal{G}, \Lambda)$ with $\mathcal{G} = (S, I, \mathsf{Ag}, \{\mathsf{Act}_a\}_{a\in\mathsf{Ag}}, \{\sim_a\}_{a\in\mathsf{Ag}}, P, \delta, \lambda)$ and a simple formula $\langle\langle A\rangle\rangle\psi$. Given a set of agents $B \subseteq \mathsf{Ag}$ and a strategy type $x$, we denote by $B_x$ the set $\{a \in B \mid \Lambda(a) = x\}$.
5.1 Model-Checking Simple ATL Formulae
For a simple ATL formula $\langle\langle A\rangle\rangle\psi$, it is easy to see that whether the agents in $A$ have perfect recall or not does not matter if these agents have perfect information.
Given an ACGS $\mathcal{M} = (\mathcal{G}, \Lambda)$ and a simple ATL formula $\langle\langle A\rangle\rangle\psi$, let $\Lambda'$ be the function such that for every agent $a$, $\Lambda'(a) = Ir$ if $a \in A$ and $\Lambda(a) = IR$, and $\Lambda'(a) = \Lambda(a)$ otherwise. For every state $s$ in $\mathcal{M}$, $\mathcal{M}, s \models \langle\langle A\rangle\rangle\psi$ iff $(\mathcal{G}, \Lambda'), s \models \langle\langle A\rangle\rangle\psi$.
By Proposition 2, all the agents in $A$ with $IR$-strategies can be seen as agents with $Ir$-strategies. Moreover, all the agents with $Ir$-strategies can be seen as agents with $ir$-strategies (since all their epistemic accessibility relations are the identity relation). Therefore, we can assume that $\Lambda(a) = ir$ for all $a \in A$, and $\Lambda(b) \in \{ir, IR\}$ for all $b \in \bar A$. For two collective strategies $\vartheta_A$ of $A$ and $\vartheta'$ of $\bar A_{ir}$, let $\mathcal{G}[\vartheta_A, \vartheta']$ be the ACGS obtained from $\mathcal{M}$ by enforcing the strategies $\vartheta_A$ and $\vartheta'$, namely, by removing the transitions whose actions of the agents in $A \cup \bar A_{ir}$ do not conform to $\vartheta_A$ and $\vartheta'$. We have that $s \in \mathsf{Sat}(\langle\langle A\rangle\rangle\psi)$ iff there exists a collective strategy $\vartheta_A$ such that, for all collective strategies $\vartheta'$ of $\bar A_{ir}$, all the paths of $\mathcal{G}[\vartheta_A, \vartheta']$ starting from $s$ satisfy $\psi$.
Checking whether all the paths from $s$ satisfy $\psi$ in $\mathcal{G}[\vartheta_A, \vartheta']$ amounts to CTL model-checking, which can be done in polynomial time (and thus in polynomial space) in the size of $\mathcal{G}[\vartheta_A, \vartheta']$ and $\psi$ [Clarke et al.1983]. Since the number of strategies $\vartheta_A$ and $\vartheta'$ is finite, and each of them can be enumerated in polynomial space, we get that:
For the simple ATL formula $\langle\langle A\rangle\rangle\psi$, $\mathsf{Sat}(\langle\langle A\rangle\rangle\psi)$ can be computed in PSPACE.
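The strategy-enumeration argument underlying this bound can be prototyped on a tiny instance: enumerate the memoryless strategies of the coalition and, for each, check the goal against every memoryless adversary strategy. A brute-force sketch (our own; the model is hypothetical and the goal is reaching the state labelled $p$, i.e., a formula of the form $\langle\langle\{a\}\rangle\rangle\mathsf{F}\, p$):

```python
from itertools import product

# Hypothetical two-agent model; proposition p holds only in state "s2".
agents = ["a", "b"]
states = ["s0", "s1", "s2", "s3"]
protocol = {
    ("s0", "a"): {"u", "d"}, ("s0", "b"): {"u", "d"},
    ("s1", "a"): {"u"}, ("s1", "b"): {"u"},
    ("s2", "a"): {"u"}, ("s2", "b"): {"u"},
    ("s3", "a"): {"u"}, ("s3", "b"): {"u"},
}

def step(s, ja):
    if s == "s0":  # matching-pennies gadget: "a" cannot force "s2" alone
        return "s2" if ja[0] == ja[1] else "s1"
    return {"s1": "s1", "s2": "s2", "s3": "s2"}[s]

def reaches_goal(s, profile, bound):
    for _ in range(bound):
        if s == "s2":
            return True
        s = step(s, tuple(profile[a][s] for a in agents))
    return s == "s2"

def strategies(agent):
    """All memoryless strategies of an agent, as state -> action maps."""
    opts = [sorted(protocol[(s, agent)]) for s in states]
    return [dict(zip(states, pick)) for pick in product(*opts)]

def sat_coalition_F(agent):
    """Sat(<<{agent}>> F p) under memoryless strategies, by enumeration."""
    adv = [a for a in agents if a != agent][0]
    return {s for s in states
            if any(all(reaches_goal(s, {agent: sa, adv: sb}, len(states))
                       for sb in strategies(adv))
                   for sa in strategies(agent))}
```

The PSPACE bound follows because each strategy can be enumerated in place rather than stored all at once.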
5.2 Model-Checking Simple ATL* Formulae
We compute $\mathsf{Sat}(\langle\langle A\rangle\rangle\psi)$ by a reduction to the problem of computing the winning region of a turn-based two-player parity game. We first introduce some basic concepts which will be used in our reduction.
A deterministic parity automaton (DPA) is a tuple $\mathcal{A} = (Q, \Sigma, \delta_{\mathcal{A}}, q_0, \rho)$, where $Q$ is a finite set of states, $\Sigma$ is the input alphabet, $\delta_{\mathcal{A}}: Q \times \Sigma \to Q$ is a transition function, $q_0$ is the initial state and $\rho: Q \to \mathbb{N}$ is a rank function. A run of $\mathcal{A}$ over an $\omega$-word $w = w_0 w_1 \cdots$ is an infinite sequence of states $\xi = q_0 q_1 q_2 \cdots$ such that for every $i \geq 0$, $q_{i+1} = \delta_{\mathcal{A}}(q_i, w_i)$. Let $\mathsf{Inf}(\xi)$ be the set of states visited infinitely often in $\xi$. An infinite word $w$ is recognized by $\mathcal{A}$ if the run $\xi$ of $\mathcal{A}$ over $w$ is such that $\min\{\rho(q) \mid q \in \mathsf{Inf}(\xi)\}$ is even. For an LTL formula $\psi$, one can construct a DPA $\mathcal{A}_\psi$ with doubly exponentially many states and exponentially many ranks in $|\psi|$ such that $\mathcal{A}_\psi$ recognizes exactly the $\omega$-words satisfying $\psi$ [Piterman2006].
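Parity acceptance can be checked effectively on ultimately periodic words, the only kind that arise from finite-state strategies. The following sketch (our own illustration; it assumes the min-even parity convention) pumps the loop until the automaton state repeats at a loop boundary and then inspects the ranks visited in the cycle:

```python
def dpa_accepts_lasso(dpa_delta, rank, q0, stem, loop):
    """Does the DPA accept the ultimately periodic word stem . loop^omega?
    Accepting iff the minimal rank seen infinitely often is even."""
    q = q0
    for a in stem:
        q = dpa_delta[(q, a)]
    boundary_states = [q]      # automaton state before each loop iteration
    visited_per_iter = []      # states visited during each iteration
    while True:
        visited = set()
        for a in loop:
            q = dpa_delta[(q, a)]
            visited.add(q)
        visited_per_iter.append(visited)
        if q in boundary_states:           # the iterations now repeat forever
            i = boundary_states.index(q)
            inf_states = set().union(*visited_per_iter[i:])
            return min(rank[s] for s in inf_states) % 2 == 0
        boundary_states.append(q)

# Hypothetical DPA for "infinitely many a's" over {a, b}:
# rank 0 (even) on the state reached after reading a, rank 1 otherwise.
dpa_delta = {("qa", "a"): "qa", ("qa", "b"): "qb",
             ("qb", "a"): "qa", ("qb", "b"): "qb"}
rank = {"qa": 0, "qb": 1}
```

For example, the word $(ab)^\omega$ is accepted while $a\, b^\omega$ is rejected.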
A (turn-based, two-player) parity game is a tuple $(V_0, V_1, E, \rho)$, where $V_p$ for $p \in \{0, 1\}$ is a finite set of vertices controlled by Player-$p$, $E \subseteq V \times V$ with $V = V_0 \cup V_1$ is a finite set of edges, and $\rho: V \to \mathbb{N}$ is a rank function. A play starting from $v_0$ is an infinite sequence of vertices $v_0 v_1 v_2 \cdots$ such that $(v_i, v_{i+1}) \in E$ for every $i \geq 0$. A strategy of Player-$p$ is a function $\sigma_p: V^* V_p \to V$ such that for every $u \in V^*$ and $v \in V_p$, $(v, \sigma_p(uv)) \in E$. Given a strategy $\sigma_0$ for Player-0 and a strategy $\sigma_1$ for Player-1, let $\mathsf{play}(v, \sigma_0, \sigma_1)$ be the play from $v$ where Player-0 and Player-1 enforce their strategies $\sigma_0$ and $\sigma_1$. $\sigma_0$ is a winning strategy for Player-0 from $v$ if $\min\{\rho(v') \mid v' \in \mathsf{Inf}(\mathsf{play}(v, \sigma_0, \sigma_1))\}$ is even for every strategy $\sigma_1$ of Player-1. The winning region of Player-0, denoted by $W_0$, is the set of vertices from which Player-0 has a winning strategy.
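Winning regions of parity games can be computed with Zielonka's classical recursive algorithm. Below is a compact sketch (our own, assuming the min-parity winning condition above and that every vertex has at least one successor); it is exponential in the worst case but simple and sufficient for small games:

```python
def attractor(v0, v1, edges, target, player):
    """Vertices from which `player` can force reaching `target`."""
    mine = v0 if player == 0 else v1
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v in (v0 | v1) - attr:
            succ = edges[v]
            if (v in mine and any(w in attr for w in succ)) or \
               (v not in mine and succ and all(w in attr for w in succ)):
                attr.add(v)
                changed = True
    return attr

def _restrict(v0, v1, edges, removed):
    keep = (v0 | v1) - removed
    return (v0 - removed, v1 - removed,
            {v: [w for w in edges[v] if w in keep] for v in keep})

def zielonka(v0, v1, edges, rank):
    """Winning regions (W0, W1): Player-0 wins a play iff the minimal
    rank seen infinitely often is even (min-parity convention)."""
    if not (v0 or v1):
        return set(), set()
    d = min(rank[v] for v in v0 | v1)
    p = d % 2  # player favoured by the most significant rank d
    a = attractor(v0, v1, edges, {v for v in v0 | v1 if rank[v] == d}, p)
    w = zielonka(*_restrict(v0, v1, edges, a), rank)
    if not w[1 - p]:  # opponent wins nowhere in the subgame
        return (v0 | v1, set()) if p == 0 else (set(), v0 | v1)
    b = attractor(v0, v1, edges, w[1 - p], 1 - p)
    w2 = zielonka(*_restrict(v0, v1, edges, b), rank)
    return (w2[0], w2[1] | b) if p == 0 else (w2[0] | b, w2[1])
```

Jurdziński's small progress measures algorithm [Jurdzinski2000], used in our complexity analysis, gives better worst-case bounds than this sketch.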
A partial strategy of agent $a$ is a partial function $\theta_a: S \rightharpoonup \mathsf{Act}_a$ such that for each $s \in S$, if $\theta_a(s)$ is defined, then $\theta_a(s) \in P(s, a)$, and $\theta_a(s') = \theta_a(s)$ for all $s'$ with $s \sim_a s'$ for which $\theta_a(s')$ is defined. We denote by $\mathsf{dom}(\theta_a)$ the domain of $\theta_a$. The sets of collective partial strategies for the agents in $A$ and in $\bar A$, together with the auxiliary functions used as vertices of the parity game below, are defined accordingly.
We define a parity game , where , , is a function such that for every ,
, , , and ,
is defined as follows:
for and ;
for and ;
for and ;
for every and , where is the largest set such that the following hold: for every ,
there exists such that and for every , ;
there exists such that , and .
In this reduction, the first choice of Player-0 encodes a collective strategy of the agents in $A$ with perfect information, the collection of Player-0's choices in one play encodes a collective strategy of the agents in $A$ with imperfect information, and the tracked partial strategies encode collective strategies of the agents in $\bar A$. The imperfect information abilities of the agents are ensured by the definition of partial strategies.
To check whether a state $s$ belongs to $\mathsf{Sat}(\langle\langle A\rangle\rangle\psi)$, the game starts from the vertex encoding $s$. At the first step, Player-0 chooses a function, meaning that all the agents in $A$ commit to a strategy. Next, the game moves to a vertex that lets the DPA start from its initial state (note that Player-1 has only one choice at this step). At a vertex controlled by Player-0, Player-0 chooses actions for the agents in $A$ by choosing one function. Then Player-1 chooses actions for the agents in $\bar A$ with respect to the chosen actions of the agents in $A$ tracked in the vertex. These selections of actions together determine a joint action, based on which the game moves to a vertex encoding the next state of the ACGS and the next state of the DPA, which allows the game to mimic the run of the DPA over the $\omega$-word induced by the play. During this step, the chosen function is dropped from the vertex, as it corresponds to the actions of the agents in $A$ and need not be tracked further. The actions of the agents in $\bar A$ are tracked by updating the partial strategies, which ensures the imperfect recall abilities of the agents in $\bar A$. We then get that:
The winning region of Player-0 in a parity game can be computed in time polynomial in the number of vertices and exponential in the number of ranks [Jurdzinski2000]. In this reduction, the number of vertices of the game is exponential in the size of the ACGS, while the DPA has doubly exponentially many states and exponentially many ranks in $|\psi|$. Consequently, we have:
For the simple ATL* formula $\langle\langle A\rangle\rangle\psi$, $\mathsf{Sat}(\langle\langle A\rangle\rangle\psi)$ can be computed in 2EXPTIME.
5.3 The Overall Algorithm
We now present the overall procedure, which computes $\mathsf{Sat}(\varphi)$ bottom-up, starting from the innermost subformulae. Algorithm 1 shows the pseudocode; it takes an ACGS $\mathcal{M}$ and an ATL/ATL* formula $\varphi$ as inputs, and outputs $\mathsf{Sat}(\varphi)$. Then $\mathcal{M}$ satisfies $\varphi$ iff the set of initial states of $\mathcal{M}$ is a subset of $\mathsf{Sat}(\varphi)$. We also incorporate the epistemic modalities $\mathsf{K}_a$, $\mathsf{E}_A$, $\mathsf{C}_A$ and $\mathsf{D}_A$ from [van der Hoek and Wooldridge2003, Cermák et al.2014] into our algorithm, with the following semantics:
$\mathcal{M}, s \models \mathsf{K}_a\varphi$ iff $\mathcal{M}, s' \models \varphi$ for all $s' \in\ \sim_a(s)$;
$\mathcal{M}, s \models \mathsf{E}_A\varphi$ iff $\mathcal{M}, s' \models \varphi$ for all $s' \in\ \sim_A^E(s)$, where $\sim_A^E\ =\ \bigcup_{a \in A} \sim_a$;
$\mathcal{M}, s \models \mathsf{C}_A\varphi$ iff $\mathcal{M}, s' \models \varphi$ for all $s' \in\ \sim_A^C(s)$ (note that $\sim_A^C$ is the transitive closure of $\sim_A^E$);
$\mathcal{M}, s \models \mathsf{D}_A\varphi$ iff $\mathcal{M}, s' \models \varphi$ for all $s' \in\ \sim_A^D(s)$, where $\sim_A^D\ =\ \bigcap_{a \in A} \sim_a$;
where $\varphi$ is a state formula, $a \in \mathsf{Ag}$ and $A \subseteq \mathsf{Ag}$. $\mathsf{K}_a\varphi$, $\mathsf{E}_A\varphi$, $\mathsf{C}_A\varphi$ and $\mathsf{D}_A\varphi$ respectively denote that agent $a$ knows, every agent in $A$ knows, the agents in $A$ have common knowledge of, and the agents in $A$ have distributed knowledge of, the fact $\varphi$. The ATL (resp. ATL*) logic extended with these epistemic modalities is called ATLK (resp. ATLK*). Given a state $s$ and a binary relation $\sim$, we denote by $\sim(s)$ the set $\{s' \mid s \sim s'\}$.
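The derived epistemic accessibility relations are easy to compute from the individual relations; the sketch below (our own, with hypothetical state names) builds $E$ as the union, $D$ as the intersection, and $C$ as the transitive closure of $E$, and evaluates a knowledge modality against a given Sat set:

```python
def trans_closure(rel):
    """Transitive closure of a binary relation given as a set of pairs."""
    closure = set(rel)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def sat_K(rel, sat_phi, states):
    """States where phi is known w.r.t. accessibility relation `rel`."""
    return {s for s in states
            if all(t in sat_phi for (u, t) in rel if u == s)}

# Two agents over states {1, 2, 3}; the relations are equivalences.
sim_a = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1)}   # a confuses 1 and 2
sim_b = {(1, 1), (2, 2), (3, 3), (2, 3), (3, 2)}   # b confuses 2 and 3
e_rel = sim_a | sim_b             # "everyone knows"
d_rel = sim_a & sim_b             # distributed knowledge
c_rel = trans_closure(e_rel)      # common knowledge
```

Note how common knowledge links states 1 and 3 even though no single agent confuses them.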
By Lemma 1 and Lemma 3, the model-checking problems for ATLK and ATLK* on ACGS can be solved in PSPACE and 2EXPTIME, respectively. As the model-checking problem of ATL* on CGS under the $IR$ setting is already 2EXPTIME-complete [Alur et al.2002], we have that:
The model-checking problem for ATLK* (resp. ATLK) on ACGS is 2EXPTIME-complete (resp. in PSPACE).
6 Implementation and Experiments
We have implemented the ATLK/ATLK* model-checking algorithms on top of MCMAS [Lomuscio et al.2017]. We conducted experiments on the castle game (CG) [Pilecki et al.2014]. All experiments were conducted on a computer with a 1.70GHz Intel Xeon E5-2603 CPU and 32GB of memory.
In the CG game, there are several agents modelling workers and an environment agent. Each worker works for the benefit of a castle, and the environment keeps track of the Health Points (HP) of the castles. Each castle has an HP value of up to 3, where 0 means it is defeated. Workers are able to attack a castle they do not work for, defend the castle they work for, or do nothing. A castle gets damaged if the number of attackers is greater than the number of defenders, and the difference determines the loss of HP. In this model, the environment agent has one local action and each worker agent has four local actions, while the number of states grows with the number of workers.
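Under one natural reading of the damage rule (an assumption on our part; the authoritative rule is the one implemented in the benchmark), a round of HP bookkeeping can be sketched as:

```python
def update_hp(hp, attackers, defenders):
    """New HP of a castle after one round: damage equals the surplus of
    attackers over defenders (our assumed reading), floored at 0."""
    damage = max(0, attackers - defenders)
    return max(0, hp - damage)

def defeated(hp):
    """A castle is defeated when its HP drops to 0."""
    return hp == 0
```

For instance, a full-HP castle attacked by two workers and defended by one loses a single HP in that round.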
In this experiment, we consider an ACGS consisting of three worker agents and an environment agent, where the $i$-th worker works for the $i$-th castle.
The first property expresses that two of the workers can cooperate to make a castle defeated, no matter which strategies the other agents use.