Fixpoint Approximation of Strategic Abilities under Imperfect Information

12/08/2016
by Wojciech Jamroga, et al.

Model checking of strategic ability under imperfect information is known to be hard. The complexity results range from NP-completeness to undecidability, depending on the precise setup of the problem. No less importantly, fixpoint equivalences do not generally hold for imperfect information strategies, which seriously hampers incremental synthesis of winning strategies. In this paper, we propose translations of ATLir formulae that provide lower and upper bounds for their truth values, and are cheaper to verify than the original specifications. That is, if the lower-bound expression is verified as true, then the corresponding formula of ATLir also holds in the given model. We begin by showing where the straightforward approach does not work. Then, we propose how it can be modified to obtain guaranteed lower bounds. To this end, we alter the next-step operator in such a way that traversing one's indistinguishability relation is seen as an atomic activity. Most interestingly, the lower approximation is provided by a fixpoint expression that uses a nonstandard variant of the next-step ability operator. We show the correctness of the translations, establish their computational complexity, and validate the approach by experiments with a scalable scenario of Bridge play.

1 Introduction

There is a growing number of works that study syntactic and semantic variants of the strategic logic ATL for agents with imperfect information Agotnes15handbook. The contributions are mainly theoretical, and include results concerning the conceptual soundness of a given semantics Schobbens04ATL; Jamroga03FAMAS; Agotnes04atel; Jamroga04ATEL; Dima10communicating; Guelev12stratcontexts; Agotnes15handbook, meta-logical properties Guelev11atl-distrknowldge; Bulling14comparing-jaamas, and the complexity of model checking Schobbens04ATL; Jamroga06atlir-eumas; Guelev11atl-distrknowldge; Hoek06practicalmcheck; Dima11undecidable; Bulling10verification. However, there is relatively little research on the use of these logics, in particular on practical algorithms for reasoning and/or verification in scenarios where agents have a limited view of the world.

This is somewhat easy to understand, since model checking of ATL variants with imperfect information has been proved ΔP2- to ΔP3-complete for agents playing memoryless strategies Schobbens04ATL; Jamroga06atlir-eumas; Bulling10verification, and EXPTIME-complete to undecidable for agents with perfect recall of the past Dima11undecidable; Guelev11atl-distrknowldge. Moreover, the imperfect information semantics of ATL does not admit alternation-free fixpoint characterizations Bulling11mu-ijcai; Dima14mucalc; Dima15fallmu, which makes incremental synthesis of strategies impossible, or at least difficult to achieve. Some early attempts at verification of imperfect information strategies made their way into the MCMAS model checker Lomuscio06uniform; Raimondi06phd; Lomuscio09mcmas; Lomuscio15mcmas, but the issue was never at the heart of the tool. More dedicated attempts began to emerge only recently Pilecki14synthesis; Busard14improving; Huang14symbolic-epist; Busard15reasoning. Up until now, experimental results confirm that the initial intuition was right: model checking of strategic modalities for imperfect information is hard, and dealing with it requires innovative algorithms and verification techniques.

In this paper, we propose that in some instances, instead of the exact model checking, it suffices to provide an upper and/or lower bound for the output. The intuition for the upper bound is straightforward: instead of checking existence of an imperfect information strategy, we can look for a perfect information strategy that obtains the same goal. If the latter is false, the former must be false too. Finding a reasonable lower bound is nontrivial, but we construct one by means of a fixpoint expression in alternating epistemic mu-calculus. We begin by showing that the straightforward fixpoint approach does not work. Then, we propose how it can be modified to obtain guaranteed lower bounds. To this end, we alter the next-step operator in such a way that traversing the appropriate epistemic neighborhood is seen as an atomic activity. We show the correctness of the translations, establish their computational complexity, and validate the approach by experiments with some scalable scenarios.

2 Verifying Strategic Ability

In this section we provide an overview of the relevant variants of ATL. We refer the reader to Alur02ATL; Hoek02ATEL; Schobbens04ATL; Bulling11mu-ijcai; Jamroga15specificationMAS for details.

2.1 Models, Strategies, Outcomes

A concurrent epistemic game structure (CEGS) is given by M = ⟨Agt, St, PV, V, Act, d, t, {∼_a | a ∈ Agt}⟩, which includes a nonempty finite set of all agents Agt = {1, …, k}, a nonempty set of states St, a set of atomic propositions PV and their valuation V : PV → 2^St, and a nonempty finite set of (atomic) actions Act. Function d : Agt × St → 2^Act defines nonempty sets of actions available to agents at each state, and t is a (deterministic) transition function that assigns the outcome state q' = t(q, α₁, …, α_k) to state q and a tuple of actions ⟨α₁, …, α_k⟩, αᵢ ∈ d(i, q), that can be executed by Agt in q. We write d_a(q) instead of d(a, q). Every ∼_a ⊆ St × St is an epistemic equivalence relation. The CEGS is assumed to be uniform, in the sense that q ∼_a q' implies d_a(q) = d_a(q').
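
To make the definition concrete, here is a minimal Python sketch of how a CEGS might be represented, together with a check of the uniformity condition. All names (CEGS, equiv_class, is_uniform) and the encoding of indistinguishability via an observation map are our own illustrative assumptions, not notation from the paper.

```python
from dataclasses import dataclass

@dataclass
class CEGS:
    """Minimal concurrent epistemic game structure (illustrative encoding)."""
    agents: tuple      # Agt: agent identifiers, in a fixed order
    states: set        # St
    valuation: dict    # V: atomic proposition -> set of states where it holds
    actions: dict      # d: (agent, state) -> nonempty set of available actions
    transition: dict   # t: (state, joint-action tuple) -> successor state
    obs: dict          # (agent, state) -> observation; q ~_a q' iff equal observations

    def equiv_class(self, agent, state):
        """Epistemic equivalence class of `state` for `agent` (an equivalence
        relation by construction, since it is induced by an observation map)."""
        return {q for q in self.states
                if self.obs[(agent, q)] == self.obs[(agent, state)]}

    def is_uniform(self):
        """Uniformity: indistinguishable states offer the agent the same actions."""
        return all(self.actions[(a, q1)] == self.actions[(a, q2)]
                   for a in self.agents
                   for q1 in self.states
                   for q2 in self.equiv_class(a, q1))
```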

Figure 1: A simple model of voting and coercion
Example

Consider a very simple voting scenario with two agents: the voter v and the coercer c. The voter casts a vote for a selected candidate i (action vote_i). Upon exit from the polling station, the voter can hand in a proof of how she voted to the coercer (action gv) or refuse to hand in the proof (action ng). The proof may be a certified receipt from the election authorities, a picture of the ballot taken with a smartphone, etc. After that, the coercer can either punish the voter (action pun) or refrain from punishing (action np).

The CEGS modeling the scenario is shown in Figure 1. Proposition voted_i labels states where the voter has already voted for candidate i. Proposition pun indicates states where the voter has been punished. The indistinguishability relation for the coercer is depicted by dotted lines.

A strategy of agent a is a conditional plan that specifies what a is going to do in every possible situation. Formally, a perfect information memoryless strategy for a can be represented by a function s_a : St → Act satisfying s_a(q) ∈ d_a(q) for each q ∈ St. An imperfect information memoryless strategy additionally satisfies s_a(q) = s_a(q') whenever q ∼_a q'. Following Schobbens04ATL, we refer to the former as Ir-strategies, and to the latter as ir-strategies.
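
The uniformity requirement on ir-strategies can be checked mechanically. The sketch below (a hypothetical helper is_ir_strategy, reusing the CEGS sketch above) tests that a memoryless strategy, given as a mapping from states to actions, only picks available actions and assigns the same action to indistinguishable states.

```python
def is_ir_strategy(model, agent, strategy):
    """Check that `strategy` (a dict: state -> action) is a memoryless
    imperfect-information (ir) strategy for `agent`: every chosen action is
    available, and indistinguishable states get the same action (uniformity)."""
    for q in model.states:
        if strategy[q] not in model.actions[(agent, q)]:
            return False
        if any(strategy[q2] != strategy[q] for q2 in model.equiv_class(agent, q)):
            return False
    return True
```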

A collective x-strategy s_A, for A ⊆ Agt and x ∈ {Ir, ir}, is a tuple of individual x-strategies, one per agent from A. The set of all such strategies is denoted by Σ_A^x. By s_A[a] we denote the strategy of agent a ∈ A selected from s_A.

Given two partial functions f, g : St ⇀ Act, we say that g extends f (denoted f ⊑ g) if, whenever f(q) is defined, we have g(q) = f(q). A partial function s_a : St ⇀ Act is called a partial x-strategy for a if it is extended by some x-strategy of a. A collective partial x-strategy is a tuple of partial x-strategies, one per agent from A.

A path λ = q₀q₁q₂… is an infinite sequence of states such that there is a transition between each qᵢ and qᵢ₊₁. We use λ[i] to denote the i-th position on path λ (starting from i = 0). Function out(q, s_A) returns the set of all paths that can result from the execution of strategy s_A from state q. We will sometimes write out_M(q, s_A) instead of out(q, s_A). Moreover, function out_ir(q, s_A) = ⋃_{a∈A} ⋃_{q ∼_a q'} out(q', s_A) collects all the outcome paths that start from states that are indistinguishable from q to at least one agent in A.
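
For memoryless strategies, the set of states visited by outcome paths can be computed by a simple reachability pass. The following sketch (illustrative helpers successors and visited_states, built on the CEGS sketch above; s_A is assumed to map each coalition member to a state-to-action dictionary) collects the states occurring on outcome paths when the coalition follows s_A and the remaining agents act arbitrarily; starting the search from the union of the epistemic neighbourhoods of q corresponds to out_ir.

```python
from itertools import product

def successors(model, coalition, s_A, state):
    """One-step successors of `state` when every agent a in `coalition` plays
    s_A[a][state] and all remaining agents choose any available action."""
    choices = [[s_A[a][state]] if a in coalition else sorted(model.actions[(a, state)])
               for a in model.agents]
    return {model.transition[(state, joint)] for joint in product(*choices)}

def visited_states(model, coalition, s_A, start_states):
    """All states that occur on some outcome path of the memoryless strategy s_A
    when starting anywhere in `start_states` (breadth-first reachability)."""
    seen, frontier = set(start_states), list(start_states)
    while frontier:
        q = frontier.pop()
        for q2 in successors(model, coalition, s_A, q):
            if q2 not in seen:
                seen.add(q2)
                frontier.append(q2)
    return seen
```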

2.2 Alternating-Time Temporal Logic

We use a variant of ATL that explicitly distinguishes between perfect and imperfect information abilities. Formally, the syntax is defined by the following grammar:

φ ::= p | ¬φ | φ ∧ φ | ⟨⟨A⟩⟩_ir Xφ | ⟨⟨A⟩⟩_ir Gφ | ⟨⟨A⟩⟩_ir φUφ | ⟨⟨A⟩⟩_Ir Xφ | ⟨⟨A⟩⟩_Ir Gφ | ⟨⟨A⟩⟩_Ir φUφ,

where p ∈ PV and A ⊆ Agt. We read ⟨⟨A⟩⟩_ir φ as “A can identify and execute a strategy that enforces φ,” X as “in the next state,” G as “now and always in the future,” and U as “until.” ⟨⟨A⟩⟩_Ir φ can be read as “A might be able to bring about φ if allowed to make lucky guesses along the way.” We focus on the kind of ability expressed by ⟨⟨A⟩⟩_ir. The other strategic modality (i.e., ⟨⟨A⟩⟩_Ir) will prove useful when approximating ⟨⟨A⟩⟩_ir.

The semantics of both variants can be defined as follows (for x ∈ {ir, Ir}, where out_Ir(q, s_A) = out(q, s_A) and out_ir is as defined in Section 2.1):

M, q ⊨ p iff q ∈ V(p),

M, q ⊨ ¬φ iff M, q ⊭ φ,

M, q ⊨ φ₁ ∧ φ₂ iff M, q ⊨ φ₁ and M, q ⊨ φ₂,

M, q ⊨ ⟨⟨A⟩⟩_x Xφ iff there exists s_A ∈ Σ_A^x such that for all λ ∈ out_x(q, s_A) we have M, λ[1] ⊨ φ,

M, q ⊨ ⟨⟨A⟩⟩_x Gφ iff there exists s_A ∈ Σ_A^x such that for all λ ∈ out_x(q, s_A) and all i ≥ 0 we have M, λ[i] ⊨ φ,

M, q ⊨ ⟨⟨A⟩⟩_x φ₁Uφ₂ iff there exists s_A ∈ Σ_A^x such that for all λ ∈ out_x(q, s_A) there is i ≥ 0 for which M, λ[i] ⊨ φ₂ and M, λ[j] ⊨ φ₁ for all 0 ≤ j < i. We will often write ⟨⟨A⟩⟩Xφ instead of ⟨⟨A⟩⟩_ir Xφ to express one-step abilities under imperfect information. Additionally, we define “now or sometime in the future” as Fφ ≡ ⊤Uφ.
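
The one-step clause under imperfect information can be decided by brute force over uniform action choices. The sketch below (a hypothetical helper one_step_ir, assuming a uniform CEGS as encoded above) checks whether the coalition has a choice of actions, uniform within each member's equivalence classes, that moves every state of the "everybody knows" neighbourhood of q into a given set of goal states, whatever the opponents do.

```python
from itertools import product

def successors_with(model, coalition, action_of, state):
    """One-step successors when each coalition member a plays action_of(a, state)
    and the remaining agents play arbitrarily."""
    choices = [[action_of(a, state)] if a in coalition
               else sorted(model.actions[(a, state)]) for a in model.agents]
    return {model.transition[(state, joint)] for joint in product(*choices)}

def one_step_ir(model, coalition, q, goal_states):
    """Brute-force check of the one-step ir ability at q: is there a uniform
    choice of actions that moves every state of the 'everybody knows'
    neighbourhood of q into `goal_states`, regardless of the opponents?"""
    neighbourhood = set().union(*(model.equiv_class(a, q) for a in coalition)) or {q}
    # Decision points: one action per (coalition member, class meeting the neighbourhood);
    # by uniformity of the CEGS, the available actions agree across each class.
    points = []
    for a in coalition:
        for cls in {frozenset(model.equiv_class(a, s)) for s in neighbourhood}:
            points.append((a, cls, sorted(model.actions[(a, next(iter(cls)))])))
    for choice in product(*(acts for _, _, acts in points)):
        pick = {(a, cls): act for (a, cls, _), act in zip(points, choice)}
        action_of = lambda a, s: pick[(a, frozenset(model.equiv_class(a, s)))]
        if all(q2 in goal_states
               for s in neighbourhood
               for q2 in successors_with(model, coalition, action_of, s)):
            return True
    return False
```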

Example

Consider the model from Example 2.1. The following formula expresses that the coercer can ensure that the voter will eventually either have voted for candidate i (presumably chosen by the coercer for the voter to vote for) or be punished: ⟨⟨c⟩⟩_ir F(voted_i ∨ pun). We note that it holds in the model for any i. A strategy for c that validates the property is to punish the voter if and only if she has not handed in a proof of voting for candidate i.

Consequently, the formula ⟨⟨v⟩⟩_ir G(¬voted_i ∧ ¬pun), saying that the voter can avoid both voting for candidate i and being punished, is false in the model for all i.

We refer to the syntactic fragment containing only ⟨⟨A⟩⟩_ir modalities as ATLir, and to the one containing only ⟨⟨A⟩⟩_Ir modalities as ATLIr.

Proposition (Alur02ATL; Schobbens04ATL; Jamroga06atlir-eumas)

Model checking ATLIr is P-complete and can be done in time O(ml), where m is the number of transitions in the model and l is the length of the formula.

Model checking ATLir is ΔP2-complete wrt m and l.

Remark

The semantics of ⟨⟨A⟩⟩_ir encodes the notion of “subjective” ability Schobbens04ATL; Jamroga04ATEL: the agents must have a successful strategy from all the states that they consider possible when the system is in state q. Then, they know that the strategy indeed obtains the goal. The alternative notion of “objective” ability Bulling14comparing-jaamas requires a winning strategy from state q alone. We focus on the subjective interpretation, as it is more standard in the literature and more relevant in game solving (think of a card game, such as poker or bridge: the challenge is to find a strategy that wins for all possible hands of the opponents).

Note that if the agents in A can distinguish q from all other states, and the formula contains no nested strategic modalities, then the subjective and objective semantics at q coincide. Moreover, model checking ATLir and ATLIr according to the objective semantics can be easily reduced to the subjective case by adding a spurious initial state, with transitions to all the states in question, controlled by a “dummy” agent outside the coalition Pilecki17smc.

2.3 Reasoning about Knowledge

Having indistinguishability relations in the models, we can interpret knowledge modalities in the standard way:

  • M, q ⊨ K_aφ iff M, q' ⊨ φ for all q' such that q ∼_a q'.

The semantics of “everybody knows” (E_A) and common knowledge (C_A) is defined analogously, by taking the relation ∼_A^E = ⋃_{a∈A} ∼_a to aggregate the individual uncertainty in A, and ∼_A^C to be the transitive closure of ∼_A^E. Additionally, we take the relation for the empty coalition to be the minimal reflexive relation. We also use E_A(q) and C_A(q) to denote the image of q wrt ∼_A^E and ∼_A^C, respectively.
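
The epistemic neighbourhoods are easy to compute: E_A(q) is a union of individual equivalence classes, and C_A(q) is its transitive closure, obtainable by a breadth-first search. The sketch below (illustrative helpers on top of the CEGS sketch from Section 2.1; the names are ours) also shows the standard K_a check.

```python
def everybody_knows(model, coalition, state):
    """E_A(state): union of the individual equivalence classes of `state`."""
    result = {state}
    for a in coalition:
        result |= model.equiv_class(a, state)
    return result

def common_knowledge(model, coalition, state):
    """C_A(state): transitive closure of the E_A relation, computed by BFS."""
    seen, frontier = {state}, [state]
    while frontier:
        q = frontier.pop()
        for q2 in everybody_knows(model, coalition, q):
            if q2 not in seen:
                seen.add(q2)
                frontier.append(q2)
    return seen

def knows(model, agent, state, phi_states):
    """M, state |= K_a phi iff phi holds in all states indistinguishable from `state`."""
    return model.equiv_class(agent, state) <= set(phi_states)
```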

Example

The following formulae hold in the model for any i, by virtue of the strategy presented in Example 2.2:

⟨⟨c⟩⟩_ir F(K_c voted_i ∨ pun): The coercer has a strategy so that, eventually, the voter is punished unless the coercer has learnt that the voter voted as instructed;

⟨⟨c⟩⟩_ir G(K_c voted_i → ¬pun): Moreover, the coercer can guarantee that if he learns that the voter obeyed, then the voter will not be punished.

2.4 Alternating Epistemic Mu-Calculus

It is well known that the modalities of ATLIr have simple fixpoint characterizations Alur02ATL, and hence ATLIr can be embedded in a variant of μ-calculus with ⟨⟨A⟩⟩_Ir X as the basic modality. At the same time, the analogous variant of μ-calculus for imperfect information has expressive power incomparable to ATLir Bulling11mu-ijcai, which suggests that, under imperfect information, ATLir and fixpoint specifications provide different views of strategic ability.

Formally, alternating epistemic μ-calculus (AEμC) takes the next-time fragment of ATLir, possibly with epistemic modalities, and adds the least fixpoint operator μ. The greatest fixpoint operator ν is defined as its dual. Let Vars be a set of second-order variables ranging over 2^St. The language of AEμC is defined by the following grammar:

φ ::= p | ¬φ | φ ∧ φ | ⟨⟨A⟩⟩Xφ | K_aφ | E_Aφ | C_Aφ | Z | μZ.φ,

where p ∈ PV, A ⊆ Agt, a ∈ Agt, Z ∈ Vars, and the formulae μZ.φ are Z-positive, i.e., each free occurrence of Z is in the scope of an even number of negations. We define νZ.φ ≡ ¬μZ.¬φ[Z := ¬Z]. A formula of AEμC is alternation-free if, in its negation normal form, it contains no occurrences of ν (resp. μ) on any syntactic path from an occurrence of μ (resp. ν) to a bound occurrence of the variable it binds.

The denotational semantics of af-AEμC (i.e., the alternation-free fragment of AEμC) assigns to each formula φ the set of states ⟦φ⟧_𝒱 ⊆ St where φ is true under the valuation 𝒱 : Vars → 2^St:

⟦p⟧_𝒱 = V(p),     ⟦Z⟧_𝒱 = 𝒱(Z),

⟦¬φ⟧_𝒱 = St ∖ ⟦φ⟧_𝒱,

⟦φ₁ ∧ φ₂⟧_𝒱 = ⟦φ₁⟧_𝒱 ∩ ⟦φ₂⟧_𝒱,

⟦⟨⟨A⟩⟩Xφ⟧_𝒱 = {q ∈ St | there exists a collective ir-strategy s_A such that λ[1] ∈ ⟦φ⟧_𝒱 for every λ ∈ out_ir(q, s_A)},

⟦K_aφ⟧_𝒱 = {q ∈ St | q' ∈ ⟦φ⟧_𝒱 for every q' such that q ∼_a q'} (and analogously for E_A and C_A),

⟦μZ.φ⟧_𝒱 = ⋂{Q ⊆ St | ⟦φ⟧_{𝒱[Z:=Q]} ⊆ Q}. If φ contains no free variables, then its validity does not depend on 𝒱, and we write ⟦φ⟧ instead of ⟦φ⟧_𝒱.
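
Since af-AEμC is alternation-free, least fixpoints can be computed by Kleene iteration from the empty set. The sketch below (illustrative helpers least_fixpoint and naive_reachability, reusing one_step_ir from the earlier sketch as the pre-image) computes the denotation of a formula of the shape μZ.(goal ∨ ⟨⟨A⟩⟩XZ).

```python
def least_fixpoint(step):
    """Kleene iteration for an (assumed monotone) operator on sets of states:
    iterate from the empty set until stabilisation."""
    current = set()
    while True:
        nxt = step(current)
        if nxt == current:
            return current
        current = nxt

def naive_reachability(model, coalition, goal_states):
    """Denotation of  mu Z. (goal or <<A>>X Z),  the 'naive' fixpoint
    specification, with the one-step ir ability as the pre-image operator."""
    def step(Z):
        return set(goal_states) | {q for q in model.states
                                   if one_step_ir(model, coalition, q, Z)}
    return least_fixpoint(step)
```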

Example

Consider the AEμC formula μZ.((voted_i ∨ pun) ∨ ⟨⟨c⟩⟩XZ), i.e., the “naive” fixpoint translation of the formula from Example 2.2. The fixpoint computation produces the whole set of states St. Thus, in particular, the formula holds in the initial state of the model.

Proposition (Bulling11mu-ijcai)

Model checking af-AEμC with strategic modalities for coalitions of up to 2 agents is P-complete and can be done in time polynomial in the number of transitions in the model, the size of the largest equivalence class among the indistinguishability relations, and the length of the formula.

For coalitions of size at least 3, the problem is NP-hard wrt the same parameters.

Thus, alternation-free alternating epistemic μ-calculus can be an attractive alternative to ATLir from the complexity point of view. Unfortunately, formulae of ATLir admit no universal translations to af-AEμC. Formally, it was proved in (Bulling11mu-ijcai, Proposition 6) that af-AEμC does not cover the expressive power of ATLir. The proof uses formulae with a single long-term modality, but it is easy to construct an analogous argument for the other temporal operators. In consequence, the long-term strategic modalities of ATLir do not have alternation-free fixpoint characterizations in terms of the next-step strategic modalities. A similar result was proved in (Dima14mucalc, Theorem 11).

3 Lower Bounds for Abilities

The complexity of AEμC model checking seems more attractive than that of ATLir. Unfortunately, the expressivity results cited in Section 2.4 imply that there is no simple fixpoint translation which captures exactly the meaning of the ⟨⟨A⟩⟩_ir operators. It might be possible, however, to come up with a translation that provides a lower bound on the actual strategic abilities, i.e., such that the truth of the translated formula implies the truth of the original one. In other words, a translation which can only reduce, but never enhance, the abilities of the coalition.

We begin by investigating the “naive” fixpoint translation that mimics the one for ATLIr, and show that it works in some cases, but not in general. Then, we propose how to alter the semantics of the next-time modality so that a general lower bound can be obtained. We focus first on reachability goals, expressed by formulae ⟨⟨A⟩⟩_ir Fφ, and then move on to the other modalities.

3.1 Trying It Simple for Reachability Goals

We assume from now on that φ is a formula of ATLir, M is a CEGS, and q is a state in M (unless explicitly stated otherwise). We start with the simplest translation, analogous to that of Alur02ATL: the formula ⟨⟨A⟩⟩_ir Fφ is translated to μZ.(φ ∨ ⟨⟨A⟩⟩XZ). Unfortunately, this translation provides neither a lower nor an upper bound. For the former, use the model in Figure 2, and observe that the fixpoint formula holds in its initial state while the corresponding formula of ATLir does not. For the latter, take the model in (Bulling11mu-ijcai, Figure 1), and observe that the ATLir formula holds there while the fixpoint formula does not.

Proposition

The truth of μZ.(φ ∨ ⟨⟨A⟩⟩XZ) at q does not imply M, q ⊨ ⟨⟨A⟩⟩_ir Fφ. The converse implication does not hold either.

Figure 2: A CEGS providing a counterexample for the naive fixpoint translation

Figure 3: A CEGS providing a counterexample for coalitions of two or more agents

Consider now a slightly stronger fixpoint specification, which additionally requires that the coalition knows when the goal has been achieved. This new translation works to an extent, as the following proposition shows.

Proposition

  1. For A = ∅, the new specification holds at q iff M, q ⊨ ⟨⟨∅⟩⟩_ir Fφ;

  2. If |A| = 1, then the truth of the new specification at q implies M, q ⊨ ⟨⟨A⟩⟩_ir Fφ, but the converse does not hold (note that, for |A| = 1, the group knowledge of A reduces to the individual knowledge of its single member);

  3. If |A| ≥ 2, then the new specification does not imply M, q ⊨ ⟨⟨A⟩⟩_ir Fφ, and vice versa.

Proof

Case 1: follows from the fact that, for the empty coalition, ir-reachability is equivalent to Ir-reachability, which in turn has a fixpoint characterization.

Case 2: Let us assume that for some . We define the sequence of af-AEμC formulae s.t. and , for all . From the Kleene fixed-point theorem we have , and is a non-decreasing monotone sequence of subsets of . Now, we prove that for each there exists a partial strategy s.t. , , and . The proof is by induction on . We constructively build from for each . The base case is trivial. For the inductive step, first observe that for each if , then . As is an equivalence relation, for each either or . In the first case we put . In the second case, we know that there exists a strategy s.t. . We thus put for all , which concludes the inductive proof.

We finally define the partial strategy . For each s.t. , either , or is reached along each path consistent with any extension of to a full strategy.

For the converse implication, take model in (Bulling11mu-ijcai, , Figure 1), and observe that but .

Case 3: Consider the CEGS presented in Figure 3. We assume that and , for . In the remaining states the protocols allow only one action. For clarity, we omit from the figure the transitions leaving the states , and , leading to state . Assume now . Note that and . For larger coalitions , we extend the model with a sufficient number of spurious (idle) agents.

For the other direction, use the counterexample from Case 2, extended with appropriately many spurious agents.

Figure 4: Lower bounds are not tight: models (A) and (B)

As the two propositions above show, the stronger translation provides lower bounds for verification only in a limited number of instances. Also, the bound is rather loose, as the following example demonstrates.

Example

Consider the single-agent CEGS presented in Figure 4A. The sole available strategy, in which the agent always selects its only available action, enforces eventually reaching the target state, i.e., the reachability formula of ATLir holds. On the other hand, the fixpoint translation is false. This is because the next-step operator in the translation requires reaching the target simultaneously from all the states indistinguishable from the current one, whereas the target is reached from those states in one and two steps, respectively.

3.2 Steadfast Next Step Operator

To obtain a tighter lower bound, and one that works universally, we introduce a new modality: the steadfast next-step ability operator. It can be seen as a semantic variant of the next-step ability operator where: (i) agents in A look for a short-term strategy that succeeds from the “common knowledge” neighborhood of the initial state (rather than in the “everybody knows” neighborhood), and (ii) they are allowed to “steadfastly” pursue their goal in a variable number of steps within the indistinguishability class. In this section, we propose the semantics of the new operator and show how to revise the lower bound. Some additional insights are provided in Section 4.

We begin by defining the auxiliary function reach, so that reach_A(q, Q) collects all the collective partial ir-strategies of A such that all the paths executing them from the common-knowledge neighborhood C_A(q) eventually reach Q without leaving C_A(q), except possibly for the last step.

The steadfast next-step operator is defined as follows:

  • Coalition A has the steadfast next-step ability for φ at q in M iff there exists a collective partial ir-strategy s_A ∈ reach_A(q, {q' ∈ St | M, q' ⊨ φ}).

Now we can propose our ultimate attempt at the lower bound for reachability goals: the fixpoint specification from Section 3.1, with the standard next-step operator replaced by the steadfast one (the steadfast translation, for short). We obtain the following result.
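
One plausible reading of the construction is sketched below. steadfast_next searches, by brute force over per-class action choices (assuming a uniform CEGS), for a collective partial strategy on the common-knowledge neighbourhood C_A(q) under which every outcome path reaches the goal set without prematurely leaving the neighbourhood; steadfast_reachability then plugs this operator into the Kleene iteration in place of the standard one-step pre-image. The helper names are ours, the code reuses common_knowledge, successors_with, and least_fixpoint from the earlier sketches, and the fixpoint shown (goal ∨ steadfast-next Z) is a simplified illustration rather than the paper's exact formula.

```python
from itertools import product

def steadfast_next(model, coalition, q, goal_states):
    """Steadfast next-step ability at q (brute force): look for a partial strategy,
    one action per agent and equivalence class over C_A(q), under which every
    outcome path starting in C_A(q) reaches `goal_states`, staying inside C_A(q)
    until (possibly except for) the final step."""
    region = common_knowledge(model, coalition, q)
    points = []
    for a in coalition:
        for cls in {frozenset(model.equiv_class(a, s)) for s in region}:
            points.append((a, cls, sorted(model.actions[(a, next(iter(cls)))])))
    for choice in product(*(acts for _, _, acts in points)):
        pick = {(a, cls): act for (a, cls, _), act in zip(points, choice)}
        action_of = lambda a, s: pick[(a, frozenset(model.equiv_class(a, s)))]
        if forces_goal(model, coalition, action_of, region, set(goal_states)):
            return True
    return False

def forces_goal(model, coalition, action_of, region, goal):
    """Every path under the partial strategy from `region` hits `goal`, and never
    visits a non-goal state outside `region` on the way."""
    edges, pending = {}, [s for s in region if s not in goal]
    seen = set(pending)
    while pending:
        s = pending.pop()
        succ = successors_with(model, coalition, action_of, s)
        if any(t not in goal and t not in region for t in succ):
            return False          # a path can escape the neighbourhood prematurely
        edges[s] = [t for t in succ if t not in goal]
        for t in edges[s]:
            if t not in seen:
                seen.add(t)
                pending.append(t)
    return acyclic(edges)         # no cycle among non-goal states => goal is forced

def acyclic(graph):
    """DFS-based cycle detection on the non-goal fragment of the strategy graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {s: WHITE for s in graph}
    def dfs(s):
        colour[s] = GREY
        for t in graph.get(s, []):
            if colour[t] == GREY:
                return False
            if colour[t] == WHITE and not dfs(t):
                return False
        colour[s] = BLACK
        return True
    return all(colour[s] != WHITE or dfs(s) for s in list(graph))

def steadfast_reachability(model, coalition, goal_states):
    """Lower bound for <<A>>_ir F goal:  mu Z. (goal or steadfast-next Z),
    computed by Kleene iteration."""
    def step(Z):
        return set(goal_states) | {q for q in model.states
                                   if steadfast_next(model, coalition, q, Z)}
    return least_fixpoint(step)
```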

Proposition

If M, q satisfies the steadfast translation of ⟨⟨A⟩⟩Fφ, then M, q ⊨ ⟨⟨A⟩⟩_ir Fφ. The converse does not universally hold.

Proof

The proof is similar to the proof of Proposition 3.1. As previously, we define a sequence of af-AEμC formulae s.t. and , for all . We also use a sequence with . From the Kleene fixed-point theorem we have . Observe that, as is an equivalence relation, we have for each and that if , then .

We prove that for each there exists a partial strategy s.t. , , and . The proof is by induction on . In the base case of observe that if then there exists a partial strategy with s.t. every stays in until it reaches a state where holds. We can now define which is uniform, and reaches on all execution paths. For the inductive step, we divide the construction of in two cases. Firstly, if , then we put . Secondly, let . In this case there exists a partial strategy with s.t. each outcome stays in until it reaches a state s.t. either or . In the latter, from the inductive assumption we know that following always leads to reaching without leaving . We thus take which, again, is uniform, and reaches on all execution paths. This concludes the inductive part of the proof.

Finally, we build a partial strategy , whose any extension is s.t. for each , if , then a state in which holds is eventually reached along each outcome path . This concludes the proof of the implication.

To see that the converse does not hold, consider model in Figure 4B. We have that , but .

Thus, indeed provides a universal lower bound for reachability goals expressed in .

3.3 Lower Bounds for “Always” and “Until”

So far, we have concentrated on reachability goals. We now extend the main result to all the temporal modalities of ATLir.

Theorem

  1. If M, q satisfies the steadfast translation of ⟨⟨A⟩⟩Gφ, then M, q ⊨ ⟨⟨A⟩⟩_ir Gφ;

  2. If M, q satisfies the steadfast translation of ⟨⟨A⟩⟩ φ₁Uφ₂, then M, q ⊨ ⟨⟨A⟩⟩_ir φ₁Uφ₂.

Proof

Case 1: Let us define the sequence of formulae s.t. and , for all . From the Kleene fixed-point theorem, . It suffices to prove that for each there exists a strategy s.t. . The proof is by induction on , with the trivial base case. Assume that the inductive assumption holds for some . From the definition of the steadfast next-step operator we can define for each equivalence class a partial strategy s.t. . We now construct

.


Intuitively, enforces that a path leaving each stays within for at least steps. Moreover, for all . Thus, enforces that a path leaving each stays within for infinitely many steps, which concludes the proof. Note that the correctness of the construction relies on the fact that the indistinguishability relation is an equivalence relation.

Case 2: analogous to Proposition 3.2.

4 Discussion & Properties

Theorem 3.3 shows that the steadfast translation provides a correct lower bound for the truth value of all formulae of ATLir. In this section, we discuss the tightness of the approximation from the theoretical point of view. An empirical evaluation will be presented in Section 6.

4.1 Comparing the Standard and the Steadfast Next-Step Operators for Reachability Goals

The steadfast translation updates the previous one by replacing the standard next-step ability operator with the “steadfast next-step ability”. The difference between the two operators is twofold. First, the standard operator looks for a winning strategy in the “everybody knows” neighborhood E_A(q) of a given state, whereas the steadfast operator looks at the “common knowledge” neighborhood C_A(q). Secondly, the steadfast operator allows the coalition to “zig-zag” across that neighborhood until a state satisfying the goal is found.

Actually, the first change alone would suffice to provide a universally correct lower bound. The second update makes the bound more useful in models where agents may not see the occurrence of some action, such as the model of Figure 4A. To see this formally, we show that the steadfast translation provides a strictly tighter approximation than the previous one for singleton coalitions:

Proposition

For |A| = 1, if M, q satisfies the previous translation (based on the standard next-step operator), then it also satisfies the steadfast translation. The converse does not universally hold.

Proof

It suffices to observe that the standard one-step ability implies the steadfast one-step ability, for any goal formula. Note that this is true only for single-agent coalitions. For the converse, notice that in the CEGS of Figure 4A the steadfast translation holds in the initial state while the previous one does not.

On the other hand, if an agent always sees whenever an action occurs, then the two operators coincide for that agent's abilities. Formally, let us call a CEGS lockstep for agent a if, whenever there is a transition from q to q' in M, we have q ∼_a q' only if q = q'. The following is straightforward.
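
Under this reading of the lockstep condition (every transition either leaves the agent's current equivalence class or is a self-loop), the property can be checked directly; the sketch below (a hypothetical helper is_lockstep, over the CEGS sketch from Section 2.1) simply enumerates all transitions.

```python
from itertools import product

def is_lockstep(model, agent):
    """Check the lockstep condition as read above: every transition either leaves
    the agent's current equivalence class or is a self-loop, so the agent always
    notices that a step has been taken."""
    for q in model.states:
        cls = model.equiv_class(agent, q)
        for joint in product(*(sorted(model.actions[(a, q)]) for a in model.agents)):
            q2 = model.transition[(q, joint)]
            if q2 in cls and q2 != q:
                return False
    return True
```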

Proposition

If M is lockstep for a, then the standard and the steadfast next-step abilities of a coincide in M. In consequence, the previous translation and the steadfast translation of ⟨⟨a⟩⟩Fφ coincide as well.

4.2 When is the Lower Bound Tight?

An interesting question is: what is the subclass of CEGS's for which the steadfast translation is tight, i.e., for which the answer given by the approximation is exact? We address the question only partially here. In fact, we characterize a subclass of CEGS's for which the approximation is certainly not tight, by means of the necessary condition below.

Let γ = Fφ or γ = φ₁Uφ₂ for some formulae φ, φ₁, φ₂. We say that strategy s_A is winning for γ from q if it obtains γ for all paths in