# Strategic Coalitions in Stochastic Games

The article introduces a notion of a stochastic game with failure states and proposes two logical systems with modality "coalition has a strategy to transition to a non-failure state with a given probability while achieving a given goal." The logical properties of this modality depend on whether the modal language allows the empty coalition. The main technical results are a completeness theorem for a logical system with the empty coalition, a strong completeness theorem for the logical system without the empty coalition, and an incompleteness theorem which shows that there is no strongly complete logical system in the language with the empty coalition.



## 1 Introduction

In this article we study coalition power in stochastic games. An example of such a game is the road situation depicted in Figure 1. In this situation, self-driving car $a$ is trying to pass self-driving car $b$. Unexpectedly, a truck moving in the opposite direction appears on the road. For the sake of simplicity, we assume that cars $a$ and $b$ have only three strategies: slow down, maintain the current speed, and accelerate. We also assume that the truck is too heavy to significantly change its speed before a possible collision. If cars $a$ and $b$ cooperate, there are two sensible things that they can do: (i) car $b$ can accelerate, letting car $a$ slow down and return to the position behind car $b$; (ii) car $b$ can slow down, letting car $a$ accelerate and pass car $b$ before it reaches the truck.

The diagram in Figure 2 describes the probabilities of different outcomes for all possible combinations of actions of cars $a$ and $b$. This diagram has five states. One of them is the current (“passing”) state of the system. Two others represent outcomes in which car $a$ ends up, respectively, behind and ahead of car $b$. The remaining two are “failure” states: in the first of them there is a collision between the cars, in the second car $a$ collides with the truck. The probabilities of possible outcomes for any given combination of actions are captured by the labeled directed edges. For example, the label on the directed edge from the current state to the state in which car $a$ is behind car $b$ means that in case (i) above, when car $a$ slows down and car $b$ accelerates, the system safely transitions into that state with probability $1.0$. This means that coalition $\{a,b\}$ has a strategy that avoids collision with probability $1.0$. We write this as

$$[a,b]_{1.0}(\text{``Collision is avoided''}).$$

At the same time, the label on the directed edge from the current state to the state in which car $a$ is ahead of car $b$ shows that, in case (ii) above, car $a$ will be able to pass car $b$ without collision with probability $0.9$:

$$[a,b]_{0.9}(\text{``Pass without collision''}).$$

The label on the directed edge from the current state to the second failure state denotes the fact that if car $a$ either accelerates or maintains the same speed while car $b$ slows down, then car $a$ will collide with the truck with the probability stated in that label.

Note that car $a$ alone does not have a strategy to pass without collision with probability $0.9$. Indeed, if car $a$ decides to accelerate, then the probability of passing without collision depends on whether car $b$ slows down, maintains the current speed, or accelerates. Thus, although car $a$, of course, has a strategy to pass without collision with probability $0$:

$$[a]_{0.0}(\text{``Pass without collision''}),$$

it does not have a strategy to pass that would guarantee survival with any positive probability $\varepsilon$:

$$\neg[a]_{\varepsilon}(\text{``Pass without collision''}).$$

In this article we study properties of modality $[C]_p\varphi$ that stands for “coalition $C$ has a strategy that achieves $\varphi$ in all non-failure states and is guaranteed to avoid failure states with probability at least $p$”. If $p=0$, then this is essentially the coalition power modality introduced by Marc Pauly p01illc ; p02 . Pauly proved the completeness of the basic logic of coalition power. His approach has been widely studied in the literature g01tark ; vw05ai ; b07ijcai ; sgvw06aamas ; av08aamas ; abvs10jal ; avw09ai ; b14sr ; ge18aamas ; al18aamas ; ga17tark ; alnr11jlc .

Alur, Henzinger, and Kupferman introduced Alternating-Time Temporal Logic (ATL) that combines temporal and coalition modalities ahk02 . Goranko and van Drimmelen gd06tcs gave a complete axiomatization of ATL. Decidability and model checking problems for ATL-like systems have also been widely studied amrz16kr ; bmmrv17lics ; bmm17aamas . Chen and Lu added the probability of achieving a goal to ATL and developed model checking algorithms for the proposed system cl07fskd . Another version of ATL with probabilistic success was proposed by Bulling and Jamroga bj09fi . They considered a modality that stands for “a coalition can bring about a goal with a success level of at least a given value when the opponents behave according to a given assumption” and investigated its model checking properties. Unlike our approach in the current paper, neither of these works distinguishes failure states from non-failure states. The probability of success in their systems is the probability of achieving the goal, not the probability of avoiding a failure state. Novák and Jamroga (nj11ijcai, Definition 2.5) defined the probability of a “successful execution” of an action as the probability that the action achieves its expected effect. They called a failure any execution in which an action does not achieve the expected (“annotated”) effect. This, in essence, is the probability studied in our article. Since we consider a multiagent setting, we find it more intuitive to talk about failure states rather than failure of individual actions or action profiles. Later in the paper nj11ijcai , however, Novák and Jamroga introduced a modality that refers to the probability that an agent program achieves a goal, not the probability that individual actions fail. Huang, Su, and Zhang combined perfect recall and coalition power to achieve a goal with a certain probability. They discussed model checking properties of their logical system hsz12aaai . Coalition power to achieve a goal with a certain probability is also used in PRISM-games, a model checker for stochastic multi-player games cfkps13tools ; kpw18ijsttt . None of these works on probabilistic extensions of ATL contains completeness results.

Alternative approaches to expressing the power to achieve a goal in a temporal setting are the STIT logic bp90krdr ; h01 ; h95jpl ; hp17rsl ; ow16sl and Strategy Logic chp10ic ; mmpv14tocl ; bmmrv17lics ; ammr18ic . Broersen, Herzig, and Troquard have shown that coalition logic can be embedded into a variation of STIT logic bht07tark . We are not aware of any probabilistic versions of either STIT or Strategy Logic.

## 2 Outline of the Contribution

In this article we axiomatize the properties of modality $[C]_p\varphi$ in stochastic games. It turns out that the axiomatization results depend significantly on whether the language allows the empty coalition or not. If the empty coalition is allowed, then one can use it to write formula $[\varnothing]_p\top$, which means that the system will unavoidably survive with probability $p$. Similarly, $[\varnothing]_p\varphi$ means that the system will unavoidably survive with probability $p$ and statement $\varphi$ will be true in the next state. Unavoidability cannot be expressed in the language without the empty coalition. In this article we introduce two different logical systems for modality $[C]_p\varphi$. The first of these systems allows coalition $C$ to be empty and the second does not. We describe the syntax and semantics of both systems in Section 3. We introduce the axioms and the inference rules for the logical systems in Section 4 and prove the soundness of the axioms in Section 5. The main technical contributions of this article are the completeness theorem for the first system and the strong completeness theorem for the second system, which we prove in Section 6. Additionally, in Section 7 we prove that no strongly sound logical system is strongly complete in the language with the empty coalition. In Section 8 we discuss the decidability of the systems. Finally, we conclude the article in Section 9.

## 3 Syntax and Semantics

In the current section we introduce the formal syntax and the formal semantics for our two logical systems. The language of the first of these systems allows coalition $C$ to be empty and the language of the second does not. In both cases, we assume a fixed finite set of agents and a fixed set of propositional variables. Additionally, a coalition is any subset of the set of agents.

###### Definition 1

Let be the minimal set of formulae such that

1. for each propositional variable ,

2. for all formulae ,

3. for each coalition , each real number such that , and each formula .

In other words, is the language specified by the following grammar

$$\varphi := v \mid \neg\varphi \mid \varphi\to\varphi \mid [C]_p\varphi.$$

We assume that Boolean constants $\top$ and $\bot$ are defined in our languages in the standard way. We also consider the sublanguage that contains all formulae that do not use empty coalitions. In other words, this sublanguage could be defined as in Definition 1 but with an additional assumption that coalition $C$ is not empty.
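The grammar above can be mirrored by a small abstract syntax tree. The following is only an illustrative sketch: the class names `Var`, `Not`, `Implies`, `Box` and the helper `uses_empty_coalition` are hypothetical, and the range check on the subscript assumes that subscripts are real numbers between 0 and 1.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Not:
    sub: "Formula"


@dataclass(frozen=True)
class Implies:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Box:
    """The modality [C]_p phi; `coalition` may be the empty frozenset."""
    coalition: FrozenSet[str]
    p: float
    sub: "Formula"

    def __post_init__(self):
        # Assumption: subscripts are restricted to the interval [0, 1].
        if not 0.0 <= self.p <= 1.0:
            raise ValueError("subscript p must lie in [0, 1]")


Formula = Union[Var, Not, Implies, Box]


def uses_empty_coalition(f: Formula) -> bool:
    """True if some modality inside f is labeled with the empty coalition."""
    if isinstance(f, Var):
        return False
    if isinstance(f, Not):
        return uses_empty_coalition(f.sub)
    if isinstance(f, Implies):
        return uses_empty_coalition(f.left) or uses_empty_coalition(f.right)
    return len(f.coalition) == 0 or uses_empty_coalition(f.sub)
```

A formula belongs to the sublanguage without the empty coalition exactly when `uses_empty_coalition` returns `False` for it.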

Let be the set of all functions from set to set .

###### Definition 2

A tuple is a stochastic game, if

1. is a set (of states),

2. is a set (of failure states),

3. is a nonempty set (domain of actions),

4. is a function from set into set such that

$$\sum_{s'\in S}P(s,\delta,s')=1$$

for each state and each function ,

5. is a function from propositional variables into subsets of .

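By $\overline{F}$ we denote the complement of the set $F$, that is, the set of all non-failure states. A function from the set of all agents to the domain of actions is called a complete action profile.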
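For a finite game, the conditions of Definition 2 can be checked mechanically. The sketch below is a hedged illustration, not part of the original development: the dictionary encoding of $P$, the function names, and the toy one-agent game are all assumptions, and the check that failure states form a subset of the states reflects one natural reading of item 2.

```python
from itertools import product
from math import isclose


def complete_profiles(agents, actions):
    """Yield every complete action profile as a dict agent -> action."""
    agents = sorted(agents)
    for choice in product(sorted(actions), repeat=len(agents)):
        yield dict(zip(agents, choice))


def is_stochastic_game(states, failure, agents, actions, P):
    """Check items 2-4 of the definition for a finite game.

    P maps (state, frozenset(profile.items()), next_state) to a probability;
    missing keys are read as probability 0.
    """
    if not failure <= states or not actions:
        return False
    for s in states:
        for delta in complete_profiles(agents, actions):
            key = frozenset(delta.items())
            probs = [P.get((s, key, t), 0.0) for t in states]
            if any(q < 0 or q > 1 for q in probs):
                return False
            if not isclose(sum(probs), 1.0):
                return False
    return True


# Hypothetical one-agent game: "go" risks a crash, "stop" is always safe.
STATES = {"start", "ok", "crash"}
FAILURE = {"crash"}
AGENTS = {"a"}
ACTIONS = {"go", "stop"}


def key(**profile):
    return frozenset(profile.items())


P = {
    ("start", key(a="go"), "ok"): 0.9,
    ("start", key(a="go"), "crash"): 0.1,
    ("start", key(a="stop"), "ok"): 1.0,
}
for act in ACTIONS:
    P[("ok", key(a=act), "ok")] = 1.0        # absorbing safe state
    P[("crash", key(a=act), "crash")] = 1.0  # absorbing failure state

print(is_stochastic_game(STATES, FAILURE, AGENTS, ACTIONS, P))  # True
```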

In the introductory example depicted in Figure 2, the set of agents consists of car and car . The set of states is and the set of failure states is . The domain of actions is . Although formally a complete action profile is a function from set of all agents to the domain of actions , in the case of our introductory example it is more convenient to refer to such profiles by pairs , where and . The function is specified by labels on the directed edges in the diagram. We use commas to denote multiple functions with the same probability. For example, the label “” on the directed edge from state to state means that and .

Next is the key definition of this article. Its item 4 formally specifies the semantics of the modality $[C]_p\varphi$. In this definition we use the term action profile of a coalition to refer to a function that assigns an action to each agent of the coalition. Also, note that one relation is a subset of another if every pair in the first relation is also in the second. If both relations are partial functions (functional relations), then this inclusion means that the second function is an extension of the first.

###### Definition 3

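For any state $s$ of a stochastic game and any formula $\varphi$, the satisfiability relation $s\Vdash\varphi$ is defined recursively as follows:

1. $s\Vdash v$ if $s\in\pi(v)$, for any propositional variable $v$,

2. $s\Vdash\neg\varphi$ if $s\nVdash\varphi$,

3. $s\Vdash\varphi\to\psi$ if $s\nVdash\varphi$ or $s\Vdash\psi$,

4. $s\Vdash[C]_p\varphi$ when there is an action profile $\delta$ of coalition $C$ such that for any complete action profile $\delta'$, if $\delta\subseteq\delta'$, then

    1. $\sum_{s'\in\overline{F}}P(s,\delta',s')\ge p$,

    2. if $P(s,\delta',s')>0$, then $s'\Vdash\varphi$, for each $s'\in\overline{F}$.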
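On a finite game the modality can be evaluated by brute force. The sketch below assumes the following reading of item 4: a coalition needs one profile such that, for every completion by the remaining agents, the total probability of non-failure states is at least $p$ and the goal holds in every non-failure state reached with positive probability. The tuple encoding of formulae, the function names, and the two-car toy game (with made-up probabilities) are hypothetical.

```python
from itertools import product


def profiles(agents, actions):
    """Yield every action profile of the given agents as a dict."""
    agents = sorted(agents)
    for choice in product(sorted(actions), repeat=len(agents)):
        yield dict(zip(agents, choice))


def sat(game, s, f):
    """Formulas: ("var", v), ("not", g), ("imp", g, h), ("box", C, p, g)."""
    states, failure, agents, actions, P, pi = game
    tag = f[0]
    if tag == "var":
        return s in pi.get(f[1], set())
    if tag == "not":
        return not sat(game, s, f[1])
    if tag == "imp":
        return (not sat(game, s, f[1])) or sat(game, s, f[2])
    _, coalition, p, goal = f
    others = set(agents) - set(coalition)
    safe = states - failure

    def works(delta):
        # delta must succeed against every choice of the other agents.
        for rest in profiles(others, actions):
            full = frozenset({**delta, **rest}.items())
            prob = {t: P.get((s, full, t), 0.0) for t in safe}
            if sum(prob.values()) < p - 1e-12:  # small float tolerance
                return False
            if any(q > 0 and not sat(game, t, goal) for t, q in prob.items()):
                return False
        return True

    return any(works(delta) for delta in profiles(coalition, actions))


def k(a, b):
    return frozenset({"a": a, "b": b}.items())


# Hypothetical game with illustrative probabilities only.
P = {
    ("passing", k("go", "go"), "crash"): 1.0,
    ("passing", k("go", "stop"), "done"): 0.9,
    ("passing", k("go", "stop"), "crash"): 0.1,
    ("passing", k("stop", "go"), "done"): 1.0,
    ("passing", k("stop", "stop"), "passing"): 1.0,
}
for x, y in product(["go", "stop"], repeat=2):
    P[("done", k(x, y), "done")] = 1.0
    P[("crash", k(x, y), "crash")] = 1.0
GAME = ({"passing", "done", "crash"}, {"crash"}, {"a", "b"},
        {"go", "stop"}, P, {"resolved": {"done"}})
GOAL = ("var", "resolved")

print(sat(GAME, "passing", ("box", {"a", "b"}, 1.0, GOAL)))  # True
print(sat(GAME, "passing", ("box", {"a"}, 0.0, GOAL)))       # True
print(sat(GAME, "passing", ("box", {"a"}, 0.1, GOAL)))       # False
```

In this toy game agent `a` alone can achieve the goal only with subscript 0, which parallels the contrast between $[a]_{0.0}$ and $\neg[a]_{\varepsilon}$ in the introduction.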

## 4 Logical Systems

In this section we introduce the axioms and the inference rules of logical systems and in languages and respectively. In addition to propositional tautologies in the corresponding language, each system contains the following axioms:

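1. Cooperation: $[C]_p(\varphi\to\psi)\to([D]_q\varphi\to[C\cup D]_{\max\{p,q\}}\psi)$,
where $C\cap D=\varnothing$,

2. Monotonicity: $[C]_p\varphi\to[C]_q\varphi$, where $q\le p$,

3. Unachievability of Falsehood: $\neg[C]_p\bot$, where $p>0$.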

The Cooperation axiom in the form without subscripts goes back to Marc Pauly p01illc ; p02 . Informally, it says that two coalitions can combine their strategies to achieve a common goal. The assumption that coalitions and are disjoint is important because a hypothetical common agent of these two coalitions might be required to choose different actions under strategies of these two coalitions.

Our version of the Cooperation axiom adds probability of non-failure subscript to the original version of this axiom. Perhaps one might think that the conclusion of the axiom should have subscript rather than . This is not true because, according to Definition 3, statement means that coalition has a strategy to achieve with probability of non-failure of at least regardless of what actions are chosen by the other agents.

The Monotonicity axiom says that if a coalition $C$ can achieve goal $\varphi$ with probability of non-failure of at least $p$, then coalition $C$ can achieve $\varphi$ with probability of non-failure of at least $q$, where $q\le p$.

Finally, the Unachievability of Falsehood axiom says that no coalition can achieve falsehood with a positive probability.

We write $\vdash\varphi$ if formula $\varphi$ is provable from the above axioms using the Modus Ponens and the Necessitation inference rules:

$$\dfrac{\varphi,\quad\varphi\to\psi}{\psi}\qquad\qquad\dfrac{\varphi}{[C]_0\varphi}.$$

Notice that the Necessitation inference rule with a positive subscript is not, generally speaking, valid. Indeed, formula $\top$ is universally true, but coalition $C$ may not have a strategy that guarantees the non-failure of the system with a positive probability. Thus, $[C]_p\top$ is not a universally true formula for $p>0$.

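Similarly, we say that a formula is provable in the system without the empty coalition if it can be derived, using only formulae in the language without the empty coalition, from the above axioms using the Modus Ponens, the Necessitation, and the Monotonicity

$$\dfrac{\varphi\to\psi}{[C]_p\varphi\to[C]_p\psi}$$

inference rules. We excluded the Monotonicity rule from the system with the empty coalition because, as we show below, it is derivable there.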

###### Lemma 1

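The Monotonicity inference rule is derivable in the system with the empty coalition.

Proof. Suppose that $\vdash\varphi\to\psi$. Thus, $\vdash[\varnothing]_0(\varphi\to\psi)$ by the Necessitation inference rule. Consider now the following instance of the Cooperation axiom, in which $\varnothing\cup C=C$ and $\max\{0,p\}=p$: $[\varnothing]_0(\varphi\to\psi)\to([C]_p\varphi\to[C]_p\psi)$. Therefore, $\vdash[C]_p\varphi\to[C]_p\psi$ by the Modus Ponens inference rule.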

We write (or ) if formula (or ) is provable from the theorems of logical system (or ) and a set of additional axioms using only the Modus Ponens inference rule. Note that if set is empty, then statement is equivalent to and statement is equivalent to . We often write and if it is clear from the context which logical system we refer to. We say that set is consistent if .

###### Lemma 2 (deduction)

For either or , if , then .

Proof. Suppose that sequence is a proof from set and the theorems of our logical system that uses the Modus Ponens inference rule only. In other words, for each , either

1. , or

2. , or

3. is equal to , or

4. there are such that formula is equal to .

It suffices to show that for each . We prove this by induction on through considering the four cases above separately.

Case 1: . Note that is a propositional tautology, and thus, is an axiom of our logical system. Hence, by the Modus Ponens inference rule. Therefore, .

Case 2: . Then, .

Case 3: formula is equal to . Thus, is a propositional tautology. Therefore, .

Case 4: formula is equal to for some . Thus, by the induction hypothesis, and . Note that formula is a propositional tautology. Therefore, by applying the Modus Ponens inference rule twice.

Note that it is important for the above proof that stands for derivability only using the Modus Ponens inference rule. For example, if the Necessitation inference rule is allowed, then the proof will have to include one more case where is formula for some coalition , and some integer . In this case we will need to prove that if , then , which is not true.

###### Lemma 3 (Lindenbaum)

For either or , any consistent set of formulae can be extended to a maximal consistent set of formulae.

Proof. The standard proof of Lindenbaum’s lemma applies here (m09, , Proposition 2.14). However, since the formulae in our logical systems use real numbers in subscript, the set of formulae is uncountable. Thus, the proof of Lindenbaum’s lemma in our case relies on the Axiom of Choice.

We conclude this section by giving an example of a formal derivation in our logical systems. This result is used later in the proof of the completeness.

###### Lemma 4

For any coalitions , if , then

1. for each formula ,

2. for each formula , where set is not empty.

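Proof. We give a common proof for both parts of the lemma. If $C=D$, then $\vdash[C]_p\varphi\to[D]_p\varphi$ because formula $[C]_p\varphi\to[C]_p\varphi$ is a propositional tautology.

Suppose now that $C\neq D$. Thus set $D\setminus C$ is not empty. Note that $\varphi\to\varphi$ is a propositional tautology. Thus, $\vdash[D\setminus C]_0(\varphi\to\varphi)$ by the Necessitation inference rule. At the same time, because $(D\setminus C)\cap C=\varnothing$, the following formula is an instance of the Cooperation axiom:

$$[D\setminus C]_0(\varphi\to\varphi)\to([C]_p\varphi\to[(D\setminus C)\cup C]_{\max\{0,p\}}\varphi).$$

Hence, by the Modus Ponens inference rule,

$$\vdash[C]_p\varphi\to[(D\setminus C)\cup C]_{\max\{0,p\}}\varphi.$$

Then, $\vdash[C]_p\varphi\to[D]_p\varphi$, because $(D\setminus C)\cup C=D$ and $\max\{0,p\}=p$.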

## 5 Soundness

In this section we prove the soundness of each of our axioms as a separate lemma. The same proof applies to both system and system . The soundness of the systems is stated in the end of the section as Theorem 1.

###### Lemma 5

For any state of a stochastic game , any coalitions and , any formulae , and any real numbers such that , if , , and , then .

Proof. By Definition 3, assumption implies that there is an action profile such that for any complete action profile , if , then

1. ,

2. if , then , for each .

Additionally, by Definition 3, assumption implies that there is an action profile such that for any complete action profile if , then

1. ,

2. if , then , for each .

Let action profile of coalitions be defined as

$$\delta(a)=\begin{cases}\delta_1(a), & \text{if } a\in C_1,\\ \delta_2(a), & \text{if } a\in C_2.\end{cases}\tag{1}$$

Action profile is well-defined because coalitions and are disjoint by an assumption of the lemma.

Consider an arbitrary complete action profile such that . Note that

$$\delta_1\subseteq\delta\subseteq\delta',\tag{2}$$

$$\delta_2\subseteq\delta\subseteq\delta'\tag{3}$$

by equation (1) and the assumption . Thus, by Definition 3 and the above assumptions 1, 2, 3, and 4,

1. ,

2. if , then , for each .

Therefore, by Definition 3.
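The construction in equation (1) amounts to taking the union of two partial functions with disjoint domains. A minimal sketch, assuming profiles are encoded as Python dictionaries from agents to actions (the function names `combine` and `extends` are hypothetical):

```python
def combine(delta1, delta2):
    """Union of two coalition profiles (dicts agent -> action).

    The union is a function only when the coalitions are disjoint,
    which is the side condition of the Cooperation axiom.
    """
    shared = delta1.keys() & delta2.keys()
    if shared:
        raise ValueError(f"coalitions overlap on {sorted(shared)}")
    return {**delta1, **delta2}


def extends(big, small):
    """True when profile `big` agrees with `small` on every agent of `small`."""
    return all(agent in big and big[agent] == act
               for agent, act in small.items())


delta = combine({"a": "slow"}, {"b": "fast"})
print(extends(delta, {"a": "slow"}), extends(delta, {"b": "fast"}))  # True True
```

Any complete profile that extends the combined profile also extends each of its two parts, which is the content of statements (2) and (3).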

###### Lemma 6

For any state of a stochastic game , any coalition , any formula , and any real numbers such that , if , then .

Proof. By Definition 3, assumption implies that there is an action profile such that for any complete action profile if , then

1. ,

2. if , then , for each .

Note that by assumption of the lemma. Therefore, by Definition 3.

###### Lemma 7

For any state of a stochastic game , any coalition , and any real number , if , then .

Proof. Suppose that . Thus, by Definition 3, there is an action profile such that for any complete action profile if , then

1. ,

2. if , then , for each .

Notice that due to the assumption of the lemma. Hence, there exists state such that . Thus, by item 2 above, which contradicts the definition of and Definition 3.

The soundness theorem for our logical systems with respect to the semantics described above follows from Lemma 5, Lemma 6, and Lemma 7.

###### Theorem 1

For either or , if , then for each state of each stochastic game .

## 6 Completeness

In this section we prove weak completeness of system and strong completeness of with respect to the semantics of stochastic games. These results are stated later in this section as Theorem 2 and Theorem 3.

Let be either language or and be any subset of such that (a) is closed with respect to subformulae and (b) if , then , unless the formula itself is a negation. We distinguish from the whole set so that later set could be assumed to be finite. We start the proof by defining the canonical stochastic game .

###### Definition 4

Set $S$ consists of all maximal consistent subsets of the chosen set of formulae and an additional “failure” state $f$.

###### Definition 5

$F=\{f\}$.

###### Definition 6

is the set of all pairs where and is an arbitrary real number.

Informally, by choosing the action , the agent is requesting the game to transition to a non-failure state with probability at least and formula to be true at that state. The game might grant or ignore this request. In particular, the game ignores the request if .

Next, we define function $P$. This is done in Definition 9 through auxiliary functions $\mu$ and $T$. Function $\mu$ specifies the probability of the canonical game to transition from state $s$ under complete action profile $\delta$ into a non-failure state. For each $[C]_p\varphi\in s$ we want the game to transition to a non-failure state with probability at least $p$ if all members of coalition $C$ choose action $(\varphi,p)$. Thus, we define $\mu(s,\delta)$ to be the maximum among such $p$. In the definition below we assume that the maximum of the empty set is equal to 0.

###### Definition 7

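For each state $s$ and each complete action profile $\delta$, let

$$\mu(s,\delta)=\max\{p \mid [C]_p\varphi\in s,\ \forall a\in C\,(\delta(a)=(\varphi,p))\}.$$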
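The informal description of this auxiliary function can be sketched in code. The sketch assumes that a state is summarized by the list of its modal formulae, each given as a hypothetical triple (coalition, subscript, goal), and that a complete action profile is a dictionary from agents to (goal, subscript) requests; the name `mu` mirrors the text, everything else is illustrative.

```python
def mu(state_boxes, delta):
    """Largest subscript p among the modal formulae of the state whose
    coalition unanimously requests (goal, p); 0 if there is none.

    state_boxes: iterable of (coalition, p, goal) triples.
    delta: dict agent -> (goal, p) request.
    """
    granted = [p for (coalition, p, goal) in state_boxes
               if all(delta.get(a) == (goal, p) for a in coalition)]
    # The maximum of the empty set is taken to be 0, as in the text.
    return max(granted, default=0)


boxes = [
    (frozenset({"a"}), 0.5, "phi"),
    (frozenset({"a", "b"}), 0.8, "psi"),
    (frozenset(), 0.3, "chi"),  # empty coalition: always counts
]
print(mu(boxes, {"a": ("psi", 0.8), "b": ("psi", 0.8)}))  # 0.8
```

Note that a formula with the empty coalition contributes its subscript under every profile, because the unanimity condition holds vacuously. This is one way to see why the two languages need separate well-definedness arguments.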

###### Lemma 8

If and set is finite, then for each state and each profile , value is well-defined and .

Proof. Consider set . Note that by Definition 1. To prove that value is well-defined by Definition 7, it suffices to show that set is finite. Indeed, set is finite because it is a subset of finite set . Therefore, set is finite by the choice of set .

###### Lemma 9

If , then for each state and each profile , value is well-defined and .

Proof. Consider set . Note that by Definition 1. To prove that value is well-defined by Definition 7, it suffices to show that set is finite. Recall that set of all agents is finite. Thus, set is finite. Therefore, set is finite because any coalition in a formula is nonempty.

Function $T$ specifies all non-failure states to which the game is able to transition from state $s$ under complete action profile $\delta$ with non-zero probability. Informally, if $[C]_p\varphi\in s$ and all members of coalition $C$ choose action $(\varphi,p)$, then statement $\varphi$ belongs to each set in $T(s,\delta)$.

###### Definition 8

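For each state $s$ and each complete action profile $\delta$, let $T(s,\delta)$ be the set of all non-failure states $s'$ such that

$$\{\varphi \mid [C]_p\varphi\in s,\ \forall a\in C\,(\delta(a)=(\varphi,p))\}\subseteq s'.$$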

We are now ready to define function that specifies the probability of the canonical game to transition from a state to a state under a complete action profile .

###### Definition 9

For each state , each complete action profile , and each state ,

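$$P(s,\delta,s')=\begin{cases}\dfrac{\mu(s,\delta)}{|T(s,\delta)|}, & \text{if } s\in\overline{F} \text{ and } s'\in T(s,\delta),\\[2ex] 1-\mu(s,\delta), & \text{if } s\in\overline{F} \text{ and } s'=f,\\ 1, & \text{if } s=s'=f,\\ 0, & \text{otherwise,}\end{cases}$$

where $|T(s,\delta)|$ is the size of set $T(s,\delta)$.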
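The case analysis of Definition 9 can be sketched as a small function. The sketch fixes one complete action profile and takes the values of the two auxiliary functions as plain arguments; it assumes that the target set is finite and contains only non-failure states, and it uses exact fractions so that the sums can be compared with 1 without rounding. The names `FAIL` and `transition` are hypothetical.

```python
from fractions import Fraction

FAIL = "f"  # stand-in for the single failure state


def transition(s, s_next, mu, T):
    """P(s, delta, s_next) for a fixed profile delta, given the value
    mu = mu(s, delta) and the finite set T = T(s, delta)."""
    if s == FAIL:
        # The failure state is absorbing.
        return Fraction(1) if s_next == FAIL else Fraction(0)
    if s_next == FAIL:
        return 1 - mu
    if s_next in T:
        # The non-failure mass mu is split evenly among the targets.
        return mu / len(T)
    return Fraction(0)


states = [FAIL, "s1", "s2", "s3"]
total = sum(transition("s1", t, Fraction(3, 4), {"s2", "s3"}) for t in states)
print(total)  # 1
```

When the target set is empty the outgoing probabilities sum to 1 only if the non-failure mass is 0, which is the situation addressed by Lemma 12.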

We prove that $\sum_{s'\in S}P(s,\delta,s')=1$ in Lemma 13. But first we show that $\mu(s,\delta)$ is an upper bound on the sum of probabilities of transitioning to a non-failure state.

###### Lemma 10

For each state , each complete action profile ,

$$\sum_{s'\in\overline{F}}P(s,\delta,s')\le\mu(s,\delta).$$

Proof. We consider the following two cases separately:

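Case I: $T(s,\delta)=\varnothing$. Then, by Definition 9 and either Lemma 8 or Lemma 9,

$$\begin{aligned}\sum_{s'\in\overline{F}}P(s,\delta,s') &= \sum_{s'\in T(s,\delta)}P(s,\delta,s')+\sum_{s'\in\overline{F}\setminus T(s,\delta)}P(s,\delta,s')\\ &= \sum_{s'\in\varnothing}P(s,\delta,s')+\sum_{s'\in\overline{F}\setminus T(s,\delta)}0 = 0\le\mu(s,\delta).\end{aligned}$$

Case II: $T(s,\delta)\neq\varnothing$. Then, by Definition 9,

$$\begin{aligned}\sum_{s'\in\overline{F}}P(s,\delta,s') &= \sum_{s'\in T(s,\delta)}P(s,\delta,s')+\sum_{s'\in\overline{F}\setminus T(s,\delta)}P(s,\delta,s')\\ &= \sum_{s'\in T(s,\delta)}\frac{\mu(s,\delta)}{|T(s,\delta)|}+\sum_{s'\in\overline{F}\setminus T(s,\delta)}0 = \mu(s,\delta)+0\le\mu(s,\delta).\end{aligned}$$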

###### Definition 10

.

This concludes the definition of the canonical stochastic game in cases when either is finite or . Throughout the rest of this section we assume that one of these two conditions is true.

The next lemma is the key lemma in the proof of the completeness. It shows that if , then in state coalition has no strategy to transition to a non-failure state with probability at least and to guarantee that is true in that state.

###### Lemma 11

For each state , each formula , and each , there is such that and one of the following is true

1. or

2. there is a state where and .

Proof. Consider function such that

$$\delta'(a)=\begin{cases}\delta(a), & \text{if } a\in C,\\ (\top,-1), & \text{otherwise.}\end{cases}\tag{4}$$

Suppose that . We will show that there is a state such that and . Consider set

$$X_0=\{\neg\varphi\}\cup\{\psi \mid [B]_q\psi\in s,\ \forall a\in B\,(\delta'(a)=(\psi,q))\}.$$

First, we prove that set is consistent. Suppose the opposite, thus there must exist formulae such that

$$\forall i\le n\ \forall a\in B_i\,(\delta'(a)=(\psi_i,q_i)),\tag{5}$$

$$\psi_1,\dots,\psi_n\vdash\varphi.\tag{6}$$

Without loss of generality, we can assume that formulae are distinct. Note that sets are pairwise disjoint because of statement (5). Due to Definition 9,

$$q_1,\dots,q_n\le\mu(s,\delta').\tag{7}$$

Additionally, by Definition 9 and the assumption of the case, we can suppose that there is an integer such that and

$$q_m=\mu(s,\delta').\tag{8}$$

Furthermore, we can assume that there is such that for each and for each .

Let us first show that . Indeed, suppose that there is . Thus, by equation (4). Hence, due to equation (5). Recall that by the choice of index . Thus , which contradicts Lemma 8 (or Lemma 9 in case of system ). Therefore, .

Next, note that for each we have because and due to equality (4) and equality (5). Hence, by statement (6). By Lemma 2 applied times,

$$\vdash\psi_1\to(\psi_2\to\dots(\psi_{n'}\to\varphi)\dots).$$

Note that because . So, by the Monotonicity inference rule,

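$$\vdash[B_1]_{q_1}\psi_1\to[B_1]_{q_1}(\psi_2\to\dots(\psi_{n'}\to\varphi)\dots).$$

By the Modus Ponens inference rule,

$$[B_1]_{q_1}\psi_1\vdash[B_1]_{q_1}(\psi_2\to\dots(\psi_{n'}\to\varphi)\dots).$$

By the Cooperation axiom and the Modus Ponens rule,

$$[B_1]_{q_1}\psi_1,[B_2]_{q_2}\psi_2\vdash[B_1\cup B_2]_{\max\{q_1,q_2\}}(\psi_3\to\dots(\psi_{n'}\to\varphi)\dots).$$

By repeating the previous step $n'-2$ more times,

$$[B_1]_{q_1}\psi_1,\dots,[B_{n'}]_{q_{n'}}\psi_{n'}\vdash[B_1\cup\dots\cup B_{n'}]_{\max\{q_1,\dots,q_{n'}\}}\varphi.$$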

Thus, by the choice of formulae ,

$$s\vdash[B_1\cup\dots\cup B_{n'}]_{\max\{q_1,\dots,q_{n'}\}}\varphi.$$

Then, by Lemma 4 and because ,

$$s\vdash[C]_{\max\{q_1,\dots,q_{n'}\}}\varphi.$$

Recall that . Thus, by inequality (7) and equation (8). Hence, . Thus, by the Monotonicity axiom and the assumption . Then, due to consistency of set , which contradicts the assumption of the lemma. Therefore, set is consistent. By Lemma 3, there is a maximal consistent extension of set . Note that by the choice of set .

Note that by Definition 8 and the choice of sets and . Thus, set is not empty. Hence, by the assumption of the case,

$$P(s,\delta',s')=\frac{\mu(s,\delta')}{|T(s,\delta')|}>0.$$

This concludes the proof of the lemma.

Recall that we left unproven the fact that $\sum_{s'\in S}P(s,\delta,s')=1$. This will be shown in Lemma 13 using the following auxiliary lemma.

###### Lemma 12

For each state $s$ and each complete action profile $\delta$, if set $T(s,\delta)$ is empty, then $\mu(s,\delta)=0$.

Proof. Suppose that . Thus, by either Lemma 8 or Lemma 9. Then, by the Unachievability of Falsehood axiom. Hence, by Lemma 11 there is a complete action profile such that and one of the following is true

1. or

2. there is a state where and .

Note that assumption implies that because is a complete action profile. Thus, by either Lemma 8 or Lemma 9. Hence, there is a state such that . Then, by Definition 9. Therefore, set is not empty.

###### Lemma 13

For each state and each complete action profile ,

$$\sum_{s'\in S}P(s,\delta,s')=1.$$

Proof. We consider the following three cases separately.

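Case I: $s=f$. Thus, by Definition 9,

$$\sum_{s'\in S}P(s,\delta,s')=\sum_{s'\in S}P(f,\delta,s')=P(f,\delta,f)+\sum_{s'\in\overline{F}}P(f,\delta,s')=1+\sum_{s'\in\overline{F}}0=1.$$

Case II: $s\neq f$ and $T(s,\delta)=\varnothing$. Hence, $\mu(s,\delta)=0$ by Lemma 12. Then, by Definition 9,

$$\begin{aligned}\sum_{s'\in S}P(s,\delta,s') &= P(s,\delta,f)+\sum_{s'\in T(s,\delta)}P(s,\delta,s')+\sum_{s'\in\overline{F}\setminus T(s,\delta)}P(s,\delta,s')\\ &= 1-\mu(s,\delta)+\sum_{s'\in\varnothing}P(s,\delta,s')+\sum_{s'\in\overline{F}\setminus T(s,\delta)}0=1.\end{aligned}$$

Case III: $s\neq f$ and $T(s,\delta)\neq\varnothing$. By Definition 9,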

 ∑s′∈SP(s,δ,s′) = P(s,δ,f)+∑s′∈T(s,δ)P(s,δ,s′)+∑s′∈¯¯¯¯F∖T(s,δ)P(s,δ,s′) = 1−μ(s,δ)+∑s′∈T(s,δ)μ(s,δ)|T(s,δ)|+∑s′∈¯¯¯¯F∖T(s,δ)0 =