# Kuhn's Equivalence Theorem for Games in Intrinsic Form

We state and prove Kuhn's equivalence theorem for a new representation of games, the intrinsic form. First, we introduce games in intrinsic form where information is represented by σ-fields over a product set. For this purpose, we adapt to games the intrinsic representation that Witsenhausen introduced in control theory. Those intrinsic games do not require an explicit description of the play temporality, as opposed to extensive form games on trees. Second, we prove, for this new and more general representation of games, that behavioral and mixed strategies are equivalent under perfect recall (Kuhn's theorem). As the intrinsic form replaces the tree structure with a product structure, the handling of information is easier. This makes the intrinsic form a new valuable tool for the analysis of games with information.



## 1 Introduction

From the origin, games in extensive form have been formulated on a tree. In his seminal paper *Extensive Games and the Problem of Information*, Kuhn claimed that “The use of a geometrical model (…) clarifies the delicate problem of information”. This tells us that the proper handling of information was a strong motivation for Kuhn’s extensive games. On the game tree, moves are those vertices that possess alternatives; moves are then partitioned into players’ moves, themselves partitioned into information sets (with the constraint that no two moves in an information set can be on the same play). Kuhn mentions agents, one agent per information set, to “personalize the interpretation”, but the notion is not central (to the point that his definition of perfect recall “obviates the use of agents”).

By contrast, in Witsenhausen’s intrinsic model [9, 10], agents play a central role. Each agent is equipped with a decision set and a σ-field, and so is Nature. Witsenhausen then introduces the product set and the product σ-field; this product set hosts the agents’ information subfields. Witsenhausen’s intrinsic model was elaborated in the control theory setting, in order to handle how information is distributed among agents and how it impacts their strategies. Although not explicitly designed for games, it had, from the start, the potential to be adapted to games. Indeed, Witsenhausen places his own model in the context of game theory by referring to von Neumann and Morgenstern, Kuhn and Aumann.

In this paper, we introduce a new representation of games that we call games in intrinsic form. Game representations play a key role in their analysis (see the illuminating introduction of the book), and we claim that games in intrinsic form display appealing features. In the philosophy of the tree-based extensive form (Kuhn’s view), the temporal ordering is hardcoded in the tree structure: one goes from the root to the leaves, making decisions at the moves, contingent on information, chance and strategies. For Kuhn, the time arrow (tree) comes first; information comes second (partition of the move vertices). By contrast, for Witsenhausen, information comes first; the time arrow comes (possibly) second, under a proper causality assumption that depends on the information structure.

Not having a hardcoded temporal ordering makes mathematical representations less constrained, hence more general. Moreover, Witsenhausen’s framework makes representations more intrinsic. As an illustration, let us consider a game where two players each play once, at the same time. Formulating it on a tree requires arbitrarily deciding which of the two plays first. This is not the case for games in intrinsic form, where each player/agent is equipped with an information subfield and with strategies that are measurable with respect to it; writing the system of two equations that express decisions as the output of strategies leads to a unique outcome, without having to solve one equation first and the other second.

The tree representation of games has its pros and cons. On the one hand, trees are perfect to follow step by step how a game is played, as any strategy profile induces a unique play: one goes from the root to the leaves, passing from one node to the next by an edge that depends on the strategy profile. On the other hand, in games with information, information sets are represented as “unions” of tree nodes that must satisfy restrictive axioms, and such unions do not comply in a natural way with the tree structure, which can render the game analysis delicate [1, 4, 5]. By contrast, the notion of Witsenhausen’s intrinsic games (W-games) does not require an explicit description of the play temporality, and the intrinsic form replaces the tree structure with a product structure, more amenable to mathematical analysis. Although the introduction of the model may seem involved, we argue that the resulting structure is a powerful mathematical tool, because there are many situations in which it is easier to reason with mathematical formulas than with trees.

We illustrate our claim with a proof of the celebrated Kuhn’s equivalence theorem for games in intrinsic form. Indeed, as a first step in a broader research program, we show that equivalence between mixed and behavioral strategies holds under perfect recall for W-games. More precisely, our proof relies on an equivalence between behavioral, mixed and a new notion of product-mixed strategies. The latter form a subclass of mixed strategies. In the spirit of Aumann, in a product-mixed strategy, each agent (corresponding to a time index in Aumann's setting) generates strategies from a random device that is independent of all the other agents. We prove that, under perfect recall for W-games, any mixed strategy of a player is not only equivalent to a behavioral strategy, but also to a product-mixed strategy where all the agents under control of the player randomly select their pure strategy independently of the other agents.

The paper is organized as follows. In Sect. 2, we present the finite version of Witsenhausen’s intrinsic model. Then, in Sect. 3, we propose a formal definition of games in intrinsic form (W-games), and discuss three notions of “randomization” of pure strategies: mixed, product-mixed and behavioral. Finally, we derive an equivalent of Kuhn’s equivalence theorem for games in intrinsic form in Sect. 4. In Appendix A, we present background material on fields, atoms and partitions, as these notions lie at the core of Witsenhausen’s intrinsic model in the finite case. Throughout the paper, we adopt the convention that a player is female (hence using “she” and “her”), whereas an agent is male (“he”, “his”).

## 2 Witsenhausen’s intrinsic model (the finite case)

In this paper, we tackle the issue of information in the context of finite games. For this purpose, we will present the so-called intrinsic model of Witsenhausen [10, 6] but with finite sets rather than with infinite ones as in the original exposition. We refer the reader to Appendix A for background material on fields, atoms and partitions.

In §2.1, we present the finite version of Witsenhausen’s intrinsic model, where we highlight the role of the configuration field that contains the information subfields of all agents. In §2.2, we illustrate, on a few examples, the ease with which one can model information in strategic contexts, using subfields of the configuration field. Finally, we present in §2.3 the notions of solvability and causality.

### 2.1 Finite Witsenhausen’s intrinsic model (W-model)

We present the finite version of Witsenhausen’s intrinsic model, introduced some five decades ago in the control community [9, 10].

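A finite W-model is a collection $\bigl(A, (\Omega, \mathcal{F}), (\mathbb{U}_a, \mathcal{U}_a)_{a \in A}, (\mathcal{I}_a)_{a \in A}\bigr)$, where

• $A$ is a finite set, whose elements are called agents;

• $\Omega$ is a finite set which represents all uncertainties; any $\omega \in \Omega$ is called a state of Nature; $\mathcal{F}$ is the complete field over $\Omega$;

• for any $a \in A$, $\mathbb{U}_a$ is a finite set, the set of decisions for agent $a$; $\mathcal{U}_a$ is the complete field over $\mathbb{U}_a$;

• for any $a \in A$, $\mathcal{I}_a$ is a subfield of the following product field

$$\mathcal{I}_a \subset \mathcal{F} \otimes \bigotimes_{b \in A} \mathcal{U}_b \tag{1}$$

and is called the information field of the agent $a$.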

The configuration space is the product space (called hybrid space by Witsenhausen, hence the $\mathbb{H}$ notation)

$$\mathbb{H} = \Omega \times \prod_{a \in A} \mathbb{U}_a . \tag{2a}$$

As all fields $\mathcal{F}$ and $(\mathcal{U}_a)_{a \in A}$ are complete, the product configuration field

$$\mathcal{H} = \mathcal{F} \otimes \bigotimes_{a \in A} \mathcal{U}_a \tag{2b}$$

is also the complete field of $\mathbb{H}$. A configuration $h \in \mathbb{H}$ is denoted by

$$h = \bigl(\omega, (u_a)_{a \in A}\bigr) \iff h_\emptyset = \omega \text{ and } h_a = u_a , \quad \forall a \in A . \tag{2c}$$

In lieu of the information field $\mathcal{I}_a$ in (1), it will be convenient to consider the equivalence relation $\sim_a$, on the configuration space $\mathbb{H}$, defined in such a way that the equivalence classes coincide with the atoms of $\mathcal{I}_a$, that is, with the elements of the partition $\langle \mathcal{I}_a \rangle$ in (33):

$$\bigl(\forall h', h'' \in \mathbb{H}\bigr) \quad h' \sim_a h'' \iff h'' \in [h']_a \iff \exists G \in \langle \mathcal{I}_a \rangle , \ \{h', h''\} \subset G . \tag{3}$$

Thus defined, the subset $[h]_a \subset \mathbb{H}$ is the unique atom $G$ in $\langle \mathcal{I}_a \rangle$ that contains the configuration $h$.

We will need the following equivalent characterization of measurable mappings, which is a slight reformulation of [6, Proposition 3.35].

(adapted from [6, Proposition 3.35]) Let $\rho : \mathbb{H} \to \mathbb{D}$ be a mapping, where $\mathbb{D}$ is a set and $\mathcal{D}$ is a σ-field over $\mathbb{D}$. We suppose that the σ-field $\mathcal{D}$ contains all the singletons. Then, for any agent $a \in A$, the following statements are equivalent:

$$\rho^{-1}(\mathcal{D}) \subset \mathcal{I}_a , \tag{4a}$$

$$\bigl(\forall h', h'' \in \mathbb{H}\bigr) \quad h' \sim_a h'' \implies \rho(h') = \rho(h'') , \tag{4b}$$

(4c)

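In any of these equivalent cases, we say that the mapping $\rho$ is $\mathcal{I}_a$-measurable, and, for all $G_a \in \langle \mathcal{I}_a \rangle$, we denote by $\rho(G_a)$ the unique element of $\mathbb{D}$ in $\hat{\rho}(G_a)$, that is,

$$\bigl(\forall r \in \rho(\mathbb{H}) , \ \forall G_a \in \langle \mathcal{I}_a \rangle\bigr) \quad \rho(G_a) = r \iff \hat{\rho}(G_a) = \{r\} . \tag{5}$$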

Then, using the extended notation above (3), we have the property

$$\rho \text{ is } \mathcal{I}_a\text{-measurable} \implies \rho\bigl([h]_a\bigr) = \rho(h) , \quad \forall h \in \mathbb{H} . \tag{6}$$
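As an illustration of criterion (4b), here is a minimal Python sketch on a toy configuration space. The three-coordinate configurations, the `atoms` helper and the two test mappings are hypothetical choices made for this example only; an information field is represented by its atoms, that is, by the classes of configurations that the agent cannot distinguish.

```python
from itertools import product

# Toy configuration space: h = (omega, u_a, u_b), each coordinate in {0, 1}.
H = list(product([0, 1], repeat=3))

def atoms(observe):
    """Partition H into atoms: configurations sharing the same observation."""
    classes = {}
    for h in H:
        classes.setdefault(observe(h), []).append(h)
    return list(classes.values())

def is_measurable(rho, partition):
    """Criterion (4b): rho takes a single value on every atom."""
    return all(len({rho(h) for h in atom}) == 1 for atom in partition)

# Hypothetical information structure: agent b observes only agent a's decision.
atoms_b = atoms(lambda h: h[1])

def copy_a(h):
    return h[1]   # depends only on u_a, which agent b observes

def peek_nature(h):
    return h[0]   # depends on omega, which agent b does not observe

print(is_measurable(copy_a, atoms_b))       # True
print(is_measurable(peek_nature, atoms_b))  # False
```

Representing a field by its partition into atoms is only possible because all sets are finite, which is the setting of this section.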

Now that we have made explicit which mappings are measurable with respect to the agents' information subfields, we introduce the notion of pure W-strategy.

([9, 10]) A pure W-strategy of agent $a \in A$ is a mapping

$$\lambda_a : (\mathbb{H}, \mathcal{H}) \to (\mathbb{U}_a, \mathcal{U}_a) \tag{7a}$$

from configurations to decisions, which is measurable with respect to the information field $\mathcal{I}_a$ of agent $a$, that is,

$$\lambda_a^{-1}(\mathcal{U}_a) \subset \mathcal{I}_a . \tag{7b}$$

We denote by $\Lambda_a$ the set of all pure W-strategies of agent $a \in A$. A pure W-strategies profile $\lambda$ is a family

$$\lambda = (\lambda_a)_{a \in A} \in \prod_{a \in A} \Lambda_a \tag{8a}$$

of pure W-strategies, one per agent $a \in A$. The set of pure W-strategies profiles is

$$\Lambda = \prod_{a \in A} \Lambda_a . \tag{8b}$$

Condition (7b) expresses the property that any (pure) W-strategy of agent $a$ may only depend upon the information $\mathcal{I}_a$ available to the agent.
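In the finite case, condition (7b) says that a pure W-strategy picks one decision per atom of the agent's information field, so that the pure W-strategies can be enumerated. The sketch below does so on a hypothetical toy model (the configuration space, the atoms and the helper `pure_strategies` are assumptions made for illustration).

```python
from itertools import product

# Toy configuration space: h = (omega, u_a, u_b), each coordinate in {0, 1}.
H = list(product([0, 1], repeat=3))

# Atoms of a hypothetical information field: the agent observes omega only.
atom_list = [[h for h in H if h[0] == w] for w in (0, 1)]
decisions = [0, 1]

def pure_strategies(atom_list, decisions):
    """All mappings from configurations to decisions that are constant on atoms."""
    strategies = []
    for choice in product(decisions, repeat=len(atom_list)):
        # One decision per atom, copied to every configuration of that atom.
        table = {h: u for atom, u in zip(atom_list, choice) for h in atom}
        strategies.append(table)
    return strategies

strategies = pure_strategies(atom_list, decisions)
print(len(strategies))  # 4, the number of decisions to the power of the number of atoms
```

Each strategy is stored as a lookup table from configurations to decisions, which keeps the measurability constraint visible: two configurations in the same atom always receive the same decision.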

In what follows, we will need some notations. For any nonempty subset $B \subset A$ of agents, we define

$$\mathcal{U}_B = \bigotimes_{b \in B} \mathcal{U}_b \otimes \bigotimes_{a \notin B} \{\emptyset, \mathbb{U}_a\} \subset \bigotimes_{a \in A} \mathcal{U}_a , \tag{9a}$$

$$\mathcal{H}_B = \mathcal{F} \otimes \mathcal{U}_B = \mathcal{F} \otimes \bigotimes_{b \in B} \mathcal{U}_b \otimes \bigotimes_{a \notin B} \{\emptyset, \mathbb{U}_a\} \subset \mathcal{H} , \tag{9b}$$

$$h_B = (h_b)_{b \in B} \in \prod_{b \in B} \mathbb{U}_b , \quad \forall h \in \mathbb{H} , \tag{9c}$$

$$\lambda_B = (\lambda_b)_{b \in B} \in \prod_{b \in B} \Lambda_b , \quad \forall \lambda \in \Lambda . \tag{9d}$$

### 2.2 Examples

We illustrate, on a few examples, the ease with which one can model information in strategic contexts, using subfields of the configuration field. Even if we have presented the finite version of Witsenhausen’s intrinsic model in §2.1, we take the opportunity here to show its potential to describe infinite decision and Nature sets.

Sequential decisions

Suppose an individual has to take decisions (say, an element of ) at every discrete time step in the set , where is an integer (for any integers $a \le b$, $[\![a, b]\!]$ denotes the subset $\{a, a+1, \ldots, b\}$). The situation will be modeled with (possibly) Nature set and field , and with  agents in , and their corresponding sets, , and fields, (the Borel σ-field of ), for . Then, one builds up the product set and the product field . Every agent is equipped with an information field . Then, we show how we can express four information patterns: sequentiality, memory of past information, memory of past actions, perfect recall. The inclusions , for , express that every agent can remember no more than his past actions (sequentiality); memory of past information is represented by the inclusions , for ; memory of past actions is represented by the inclusions , for ; perfect recall is represented by the inclusions , for .

To represent  players — each  of whom makes a sequence of decisions, one for each period  — we use  agents, labelled by . With obvious notations, the inclusions express memory of one’s own past information, whereas the inclusions , express memory of all players past actions.

Principal-Agent models

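A branch of Economics studies so-called Principal-Agent models with two decision makers (agents): the Principal $\mathrm{Pr}$ (leader), who makes decisions $u_{\mathrm{Pr}} \in \mathbb{U}_{\mathrm{Pr}}$, where the set $\mathbb{U}_{\mathrm{Pr}}$ is equipped with a σ-field $\mathcal{U}_{\mathrm{Pr}}$, and the Agent $\mathrm{Ag}$ (follower), who makes decisions $u_{\mathrm{Ag}} \in \mathbb{U}_{\mathrm{Ag}}$, where the set $\mathbb{U}_{\mathrm{Ag}}$ is equipped with a σ-field $\mathcal{U}_{\mathrm{Ag}}$. Nature corresponds to private information (or type) of the Agent $\mathrm{Ag}$, taking values in a set $\Omega$, equipped with a σ-field $\mathcal{F}$.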

Hidden type (leading to adverse selection or to signaling) is represented by any information structure with the property that, on the one hand,

$$\mathcal{I}_{\mathrm{Pr}} \subset \underbrace{\{\emptyset, \Omega\}}_{\mathrm{Ag}\text{ type not observed}} \otimes \{\emptyset, \mathbb{U}_{\mathrm{Pr}}\} \otimes \underbrace{\mathcal{U}_{\mathrm{Ag}}}_{\mathrm{Ag}\text{'s action possibly observed}} , \tag{10}$$

that is, the Principal $\mathrm{Pr}$ does not know the Agent $\mathrm{Ag}$ type, but can possibly observe the Agent $\mathrm{Ag}$ action, and, on the other hand, that

$$\underbrace{\mathcal{F}}_{\text{known inner type}} \otimes \{\emptyset, \mathbb{U}_{\mathrm{Pr}}\} \otimes \{\emptyset, \mathbb{U}_{\mathrm{Ag}}\} \subset \mathcal{I}_{\mathrm{Ag}} , \tag{11}$$

that is, the Agent $\mathrm{Ag}$ knows the state of nature (his type).

Hidden action (leading to moral hazard) is represented by any information structure with the property that, on the one hand,

$$\mathcal{I}_{\mathrm{Pr}} \subset \underbrace{\mathcal{F}}_{\text{possibly knows } \mathrm{Ag} \text{ type}} \otimes \{\emptyset, \mathbb{U}_{\mathrm{Pr}}\} \otimes \underbrace{\{\emptyset, \mathbb{U}_{\mathrm{Ag}}\}}_{\text{cannot observe } \mathrm{Ag}\text{'s action}} , \tag{12}$$

that is, the Principal $\mathrm{Pr}$ does not know the Agent $\mathrm{Ag}$ action, but can possibly observe the Agent $\mathrm{Ag}$ type and, on the other hand, that the inclusion (11) holds true, that is, the Agent $\mathrm{Ag}$ knows the state of nature (his type).

Stackelberg leadership model

In Stackelberg games, the leader $\mathrm{Pr}$ makes a decision $u_{\mathrm{Pr}}$ (based at most upon the partial observation of the state $\omega$ of Nature) and the follower $\mathrm{Ag}$ makes a decision $u_{\mathrm{Ag}}$ (based at most upon the partial observation of the state $\omega$ of Nature, and upon the leader decision $u_{\mathrm{Pr}}$). This kind of information structure is expressed with the following inclusions of fields:

$$\mathcal{I}_{\mathrm{Pr}} \subset \mathcal{F} \otimes \{\emptyset, \mathbb{U}_{\mathrm{Pr}}\} \otimes \{\emptyset, \mathbb{U}_{\mathrm{Ag}}\} \quad \text{and} \quad \mathcal{I}_{\mathrm{Ag}} \subset \mathcal{F} \otimes \mathcal{U}_{\mathrm{Pr}} \otimes \{\emptyset, \mathbb{U}_{\mathrm{Ag}}\} . \tag{13}$$

Even if the players are called leader and follower, there is no explicit time arrow in (13). It is the information structure that reveals the time arrow. Indeed, if we label the leader as the first player and the follower as the second player, the sequence of information fields in (13) is “adapted” to the filtration generated by Nature and by the decisions of the earlier players. But if we label the follower as the first player and the leader as the second player, the new sequence of information fields would not be “adapted” to the new filtration. It is the information structure that prevents the follower from playing first, and that makes it possible for the leader to play first and for the follower to play second.

### 2.3 Solvability and causality

Regarding the Kuhn formulation, Witsenhausen says that “For any combination of policies one can find the corresponding outcome by following the tree along selected branches, and this is an explicit procedure”. In the Witsenhausen formulation, there is no such explicit procedure as, for any combination of policies, there may be none, one or many solutions to the closed-loop equations; these equations express the decision of each agent as the output of his strategy, supplied with Nature's outcome and with all agents' decisions. This is why Witsenhausen needs a property of solvability, whereas Kuhn does not need it as it is hardcoded in the tree structure. Then, Witsenhausen defines the notion of causality (which parallels that of tree) and proves that solvability holds true under causality. Yet, in [9, Theorem 2], Witsenhausen exhibits an example of noncausal W-model that is solvable.

#### 2.3.1 Solvability

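With any given pure W-strategies profile $\lambda = (\lambda_a)_{a \in A} \in \prod_{a \in A} \Lambda_a$ we associate the set-valued mapping

$$\mathcal{M}_\lambda : \Omega \rightrightarrows \prod_{b \in A} \mathbb{U}_b , \quad \omega \mapsto \Bigl\{ (u_b)_{b \in A} \in \prod_{b \in A} \mathbb{U}_b \;\Bigm|\; u_a = \lambda_a\bigl(\omega, (u_b)_{b \in A}\bigr) , \ \forall a \in A \Bigr\} . \tag{14}$$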

With this definition, we slightly reformulate below how Witsenhausen introduced the property of solvability.

([9, 10])

The solvability property holds true for the W-model of Definition 2.1 when, for any pure W-strategies profile $\lambda = (\lambda_a)_{a \in A} \in \prod_{a \in A} \Lambda_a$, the set-valued mapping $\mathcal{M}_\lambda$ in (14) is a mapping whose domain is $\Omega$, that is, the cardinal of $\mathcal{M}_\lambda(\omega)$ is equal to one, for any state of nature $\omega \in \Omega$.

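Thus, under the solvability property, for any state of nature $\omega \in \Omega$, there exists one, and only one, decision profile $(u_b)_{b \in A} \in \prod_{b \in A} \mathbb{U}_b$ which is a solution of the closed-loop equations

$$u_a = \lambda_a\bigl(\omega, (u_b)_{b \in A}\bigr) , \quad \forall a \in A . \tag{15a}$$

In this case, we define the solution map

$$M_\lambda : \Omega \to \prod_{b \in A} \mathbb{U}_b \tag{15b}$$

as the unique element contained in the image set of the set-valued mapping $\mathcal{M}_\lambda$ in (14), that is, for all $\omega \in \Omega$, $\mathcal{M}_\lambda(\omega) = \{ M_\lambda(\omega) \}$.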
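To make the closed-loop equations (15a) concrete, the following Python sketch enumerates their solutions by brute force on a hypothetical two-agent example. The agent names, decision sets and the three strategy profiles are illustrative assumptions; strategies are written as functions of the state of Nature and of the whole decision profile, and nothing here checks their measurability.

```python
from itertools import product

def solutions(strategies, omega, decision_sets):
    """Decision profiles solving the closed-loop equations (15a) at state omega."""
    agents = list(decision_sets)
    found = []
    for profile in product(*(decision_sets[a] for a in agents)):
        u = dict(zip(agents, profile))
        if all(strategies[a](omega, u) == u[a] for a in agents):
            found.append(u)
    return found

U = {"a": [0, 1], "b": [0, 1]}

# Simultaneous play: each agent reacts to Nature only, so nobody has to "move first".
simultaneous = {"a": lambda w, u: w, "b": lambda w, u: 1 - w}
# Each agent reacts to the other one's decision.
mismatch = {"a": lambda w, u: u["b"], "b": lambda w, u: 1 - u["a"]}
copycat = {"a": lambda w, u: u["b"], "b": lambda w, u: u["a"]}

print(solutions(simultaneous, 0, U))  # [{'a': 0, 'b': 1}]
print(solutions(mismatch, 0, U))      # []
print(len(solutions(copycat, 0, U)))  # 2
```

The three profiles illustrate the three possible situations (one, none or many solutions). The last two are possible only because each agent's information depends on the other agent's decision, an information structure under which solvability fails.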

#### 2.3.2 Configuration-orderings

In his articles [9, 10], Witsenhausen introduces a notion of causality that relies on suitable configuration-orderings. Here, we introduce our own notations, because they make possible a compact formulation of the causality property and, later, of perfect recall.

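For any finite set $D$, let $|D|$ denote the cardinal of $D$. Thus, $|A|$ denotes the cardinal of the set $A$, that is, $|A|$ is the number of agents. For $k \in [\![1, |A|]\!]$, let $\Sigma^k$ denote the set of $k$-orderings, that is, injective mappings from $[\![1, k]\!]$ to $A$:

$$\Sigma^k = \bigl\{ \kappa : [\![1, k]\!] \to A \,;\ \kappa \text{ is an injection} \bigr\} . \tag{16a}$$

The set $\Sigma^{|A|}$ is the set of total orderings of agents in $A$, that is, bijective mappings from $[\![1, |A|]\!]$ to $A$ (in contrast with partial orderings in $\Sigma^k$ for $k < |A|$). For any $k \in [\![1, |A|]\!]$, any ordering $\kappa \in \Sigma^k$, and any integer $\ell \le k$, $\kappa_{|\{1, \ldots, \ell\}}$ is the restriction of the ordering $\kappa$ to the first $\ell$ integers. For any $k \in [\![1, |A|]\!]$, there is a natural mapping $\psi_k$

$$\psi_k : \Sigma^{|A|} \to \Sigma^k , \quad \rho \mapsto \rho_{|\{1, \ldots, k\}} , \tag{16b}$$

which is the restriction of any (total) ordering of $A$ to $[\![1, k]\!]$. We define the set of orderings by

$$\Sigma = \bigcup_{k \in [\![0, |A|]\!]} \Sigma^k \quad \text{where } \Sigma^0 = \{\emptyset\} . \tag{16c}$$

For any $k \in [\![1, |A|]\!]$, and any $k$-ordering $\kappa \in \Sigma^k$, we define the range $\|\kappa\|$ of the ordering $\kappa$ as the subset

$$\|\kappa\| = \bigl\{ \kappa(1), \ldots, \kappa(k) \bigr\} \subset A , \quad \forall \kappa \in \Sigma^k , \tag{16d}$$

the cardinal $|\kappa|$ of the ordering $\kappa$ as the integer

$$|\kappa| = k \in [\![1, |A|]\!] , \quad \forall \kappa \in \Sigma^k , \tag{16e}$$

the last element $\kappa_\star$ of the ordering $\kappa$ as the agent

$$\kappa_\star = \kappa(k) \in A , \quad \forall \kappa \in \Sigma^k , \tag{16f}$$

the restriction $\kappa_-$ of the ordering $\kappa$ to the first $k-1$ elements

$$\kappa_- = \kappa_{|\{1, \ldots, k-1\}} \in \Sigma^{k-1} , \quad \forall \kappa \in \Sigma^k . \tag{16g}$$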

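With the notations introduced, any ordering $\kappa \in \Sigma^k$, with $k \ge 1$, can be written as $\kappa = (\kappa_-, \kappa_\star)$, with the convention that $\kappa_- = \emptyset$ when $\kappa \in \Sigma^1$.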

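([9, 10]) A configuration-ordering is a mapping $\varphi : \mathbb{H} \to \Sigma^{|A|}$ from configurations towards total orderings. With any configuration-ordering $\varphi$, and any ordering $\kappa \in \Sigma$, we associate the subset $\mathbb{H}^\varphi_\kappa \subset \mathbb{H}$ of configurations defined by

$$\mathbb{H}^\varphi_\kappa = \bigl\{ h \in \mathbb{H} \,;\ \psi_{|\kappa|}\bigl(\varphi(h)\bigr) = \kappa \bigr\} , \quad \forall \kappa \in \Sigma . \tag{17}$$

By convention, we put $\mathbb{H}^\varphi_\emptyset = \mathbb{H}$. Along each configuration $h \in \mathbb{H}$, the agents are ordered by $\varphi(h) \in \Sigma^{|A|}$. The set $\mathbb{H}^\varphi_\kappa$ in (17) contains all the configurations for which the agent $\kappa(1)$ is acting first, the agent $\kappa(2)$ is acting second, …, till the last agent $\kappa_\star = \kappa(|\kappa|)$ acting at stage $|\kappa|$.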
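The notations (16a) to (17) translate directly into operations on tuples. The sketch below is a hypothetical illustration: orderings are stored as Python tuples of agent names, the three agents and the two-configuration space are made up for the example, and the configuration-ordering `phi` is an arbitrary choice.

```python
from itertools import permutations

agents = ["a", "b", "c"]

def k_orderings(k):
    """Set of k-orderings (16a): injective mappings from {1..k} to the agents."""
    return list(permutations(agents, k))

def psi(k, total):
    """Restriction (16b) of a total ordering to its first k positions."""
    return total[:k]

def last(kappa):
    """Last agent (16f) of a nonempty ordering."""
    return kappa[-1]

def head(kappa):
    """Restriction (16g): the ordering without its last agent."""
    return kappa[:-1]

def configs_with_prefix(configs, phi, kappa):
    """Subset (17): configurations whose total ordering phi(h) starts with kappa."""
    return [h for h in configs if psi(len(kappa), phi(h)) == kappa]

# Hypothetical configuration-ordering: who plays first depends on the configuration.
configs = [0, 1]
phi = {0: ("a", "b", "c"), 1: ("b", "a", "c")}.get

print(len(k_orderings(2)))                       # 6
print(configs_with_prefix(configs, phi, ("a",))) # [0]
print(configs_with_prefix(configs, phi, ()))     # [0, 1]
```

With the empty tuple standing for the empty ordering, the last call returns the whole configuration space, in line with the convention stated after (17).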

#### 2.3.3 Causality

Witsenhausen introduces a notion of causality and proves that causal systems are solvable.

The following definition can be interpreted as follows. In a causal W-model, there exists a configuration-ordering with the following property: when an agent is called to play — as he is the last one in an ordering — what he knows cannot depend on decisions made by agents that are not his predecessors (in the range of the ordering under consideration).

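([9, 10]) A W-model (as in Definition 2.1) is causal if there exists (at least) one configuration-ordering $\varphi : \mathbb{H} \to \Sigma^{|A|}$ with the property that

$$\mathbb{H}^\varphi_\kappa \cap H \in \mathcal{H}_{\|\kappa_-\|} , \quad \forall H \in \mathcal{I}_{\kappa_\star} , \quad \forall \kappa \in \Sigma . \tag{18}$$

Otherwise said, once we know the first $|\kappa|$ agents, the information of the (last) agent $\kappa_\star$ depends at most on the decisions of the (previous) agents in the range $\|\kappa_-\|$. In (18), the subset $\mathbb{H}^\varphi_\kappa \subset \mathbb{H}$ of configurations has been defined in (17), the last agent $\kappa_\star$ in (16f), the partial ordering $\kappa_-$ in (16g), the range $\|\kappa_-\|$ in (16d), and (using the definition (9b) of the subfield $\mathcal{H}_B$ of $\mathcal{H}$, with the subset $B = \|\kappa_-\|$ of agents defined in (16g) and (16d)) the subfield $\mathcal{H}_{\|\kappa_-\|}$ of $\mathcal{H}$ is

$$\mathcal{H}_{\|\kappa_-\|} = \mathcal{F} \otimes \bigotimes_{a \in \|\kappa_-\|} \mathcal{U}_a \otimes \bigotimes_{b \notin \|\kappa_-\|} \{\emptyset, \mathbb{U}_b\} \subset \mathcal{H} . \tag{19}$$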

Witsenhausen’s intrinsic model deals with agents, information and strategies, but not with players and preferences. We now turn to extending Witsenhausen’s intrinsic model to games.

## 3 Finite games in intrinsic form

We are now ready to embed Witsenhausen’s intrinsic model into game theory. In §3.1, we introduce a formal definition of a finite game in intrinsic form (W-game), and in §3.2 we introduce three notions of “randomization” of pure strategies — mixed, product-mixed and behavioral. In §3.3, we discuss relations between product-mixed and behavioral W-strategies.

In what follows, when $\mathbb{X}$ is a finite set, we denote by $\Delta(\mathbb{X})$ the set of probability distributions over $\mathbb{X}$. When needed, the set $\Delta(\mathbb{X})$ can be equipped with the Borel topology and the Borel σ-field, as $\Delta(\mathbb{X})$ is homeomorphic to the simplex of $\mathbb{R}^{|\mathbb{X}|}$, and is thus homeomorphic to a closed subset of a finite dimensional space.

### 3.1 Definition of a finite game in intrinsic form (W-game)

We introduce a formal definition of a finite game in intrinsic form (W-game).

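A finite W-game $\bigl( (A^p)_{p \in P}, (\Omega, \mathcal{F}), (\mathbb{U}_a, \mathcal{U}_a, \mathcal{I}_a)_{a \in A}, (\precsim^p)_{p \in P} \bigr)$, or a finite game in intrinsic form, is made of

• a family $(A^p)_{p \in P}$, where the set $P$ of players is finite, of two by two disjoint nonempty sets whose union $A = \bigcup_{p \in P} A^p$ is the set of agents; each subset $A^p$ is interpreted as the subset of executive agents of the player $p \in P$,

• a finite W-model $\bigl(A, (\Omega, \mathcal{F}), (\mathbb{U}_a, \mathcal{U}_a)_{a \in A}, (\mathcal{I}_a)_{a \in A}\bigr)$, as in Definition 2.1,

• for each player $p \in P$, a preference relation $\precsim^p$ on the set of mappings $\Omega \to \Delta\bigl(\prod_{b \in A} \mathbb{U}_b\bigr)$.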

A finite W-game is said to be solvable (resp. causal) if the underlying W-model is solvable as in Definition 15b (resp. causal as in Definition 2.3.3).

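We comment on the preference relations $\precsim^p$ on the set of mappings $\Omega \to \Delta\bigl(\prod_{b \in A} \mathbb{U}_b\bigr)$. Our definition covers the most traditional preference relation, which is the numerical expected utility preference. In the latter, each player $p \in P$ is endowed, on the one hand, with a criterion (payoff), that is, a measurable function $j^p : \mathbb{H} \to \mathbb{R}$, and, on the other hand, with a belief, that is, a probability distribution $\nu^p$ over the states of Nature $(\Omega, \mathcal{F})$. Then, given $K_i : \Omega \to \Delta\bigl(\prod_{b \in A} \mathbb{U}_b\bigr)$, $i = 1, 2$, one says that $K_1 \precsim^p K_2$ if

$$\int_\Omega \nu^p(d\omega) \int_{\prod_{b \in A} \mathbb{U}_b} j^p\bigl(\omega, (u_b)_{b \in A}\bigr) \, K_1\bigl(\omega, d(u_b)_{b \in A}\bigr) \le \int_\Omega \nu^p(d\omega) \int_{\prod_{b \in A} \mathbb{U}_b} j^p\bigl(\omega, (u_b)_{b \in A}\bigr) \, K_2\bigl(\omega, d(u_b)_{b \in A}\bigr) .$$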

Note also that Definition 3.1 includes Bayesian games, by specifying a product structure for $\Omega$ (where some factors represent types of players, and one factor represents chance) and by considering additional probability distributions.

### 3.2 Mixed, product-mixed and behavioral strategies

We introduce three notions of “randomization” of pure strategies: mixed, product-mixed and behavioral.

The notion of mixed strategy comes from the study of games in normalized form, where each player has to select a pure strategy, the collection of which determines a unique outcome. If we allow the players to select their pure strategy at random, the lottery they use is called a mixed strategy. For an extensive game, a mixed strategy can be interpreted in the following sense. First, the player selects a pure strategy using the lottery. Second, the game is played. When the player is called by the umpire, she plays the action specified by the selected pure strategy for the current information set.

Observe that there is only one die roll per player. This die roll determines the reactions of the player for every situation of the game. It would be more natural to let the player roll a die every time she has to play, leading to the notion of behavioral strategy.

A fundamental question in game theory is to identify settings in which those two views (mixed strategy and behavioral strategy) are equivalent. To formulate this question in the W-game framework, we will give formal definitions of these two notions of randomization. We will also add a third one, that we call product-mixed strategy, and which is in the spirit of Aumann, as each agent (corresponding to a time index in Aumann's setting) “generates” strategies from a random device that is independent of all the other agents.

#### 3.2.1 Mixed W-strategies

For any agent $a \in A$, the set $\Lambda_a$ of pure W-strategies for agent $a$ (see Definition 2.1) is finite, hence the set $\Delta(\Lambda_a)$ of probability distributions over $\Lambda_a$ is homeomorphic to the simplex of $\mathbb{R}^{|\Lambda_a|}$, and is thus homeomorphic to a closed subset of a finite dimensional space. So is the space $\Delta(\Lambda)$ of probability distributions over the set $\Lambda$ of pure W-strategies profiles. We will also consider the sets

$$\Lambda^p = \prod_{a \in A^p} \Lambda_a , \quad \forall p \in P \tag{20}$$

of pure W-strategies profiles, player by player, and the set $\Delta(\Lambda^p)$ of probability distributions over $\Lambda^p$.

We consider a finite W-game, as in Definition 3.1. A mixed W-strategy for player $p \in P$ is an element $\mu^p$ of $\Delta(\Lambda^p)$, the set of probability distributions over the set $\Lambda^p$ in (20) of W-strategies of the executive agents in $A^p$. The set of mixed W-strategies profiles is $\prod_{p \in P} \Delta(\Lambda^p)$. A mixed W-strategies profile is denoted by

$$\mu = (\mu^p)_{p \in P} \in \prod_{p \in P} \Delta(\Lambda^p) , \tag{21a}$$

and, when we focus on player $p$, we write

$$\mu = (\mu^p, \mu^{-p}) \in \Delta(\Lambda^p) \times \prod_{p' \neq p} \Delta(\Lambda^{p'}) . \tag{21b}$$

We consider a solvable finite W-game (see Definition 3.1), and $\mu = (\mu^p)_{p \in P} \in \prod_{p \in P} \Delta(\Lambda^p)$ a mixed W-strategies profile as in (21a). For any $\omega \in \Omega$, we denote by

$$\mathbb{Q}^\omega_\mu = \mathbb{Q}^\omega_{(\mu^p)_{p \in P}} = \Bigl( \bigotimes_{p \in P} \mu^p \Bigr) \circ \bigl( M(\omega, \cdot) \bigr)^{-1} \in \Delta\Bigl( \prod_{b \in A} \mathbb{U}_b \Bigr) \tag{22a}$$

the pushforward probability, on the space $\bigl( \prod_{b \in A} \mathbb{U}_b , \bigotimes_{b \in A} \mathcal{U}_b \bigr)$, of the product probability distribution $\bigotimes_{p \in P} \mu^p$ on $\prod_{p \in P} \Lambda^p = \Lambda$ by the mapping

$$M(\omega, \cdot) : \Lambda \to \prod_{b \in A} \mathbb{U}_b , \quad \lambda \mapsto M_\lambda(\omega) , \tag{22b}$$

where $M_\lambda$ is the solution map (15b), which exists by the solvability assumption.

By (15a), which defines the solution map, and by definition of a pushforward probability, we have, for any configuration $\bigl(\omega, (u_b)_{b \in A}\bigr) \in \mathbb{H}$,

$$\mathbb{Q}^\omega_\mu\bigl( (u_b)_{b \in A} \bigr) = \Bigl( \bigotimes_{p \in P} \mu^p \Bigr) \Bigl( M(\omega, \cdot)^{-1} \bigl( (u_b)_{b \in A} \bigr) \Bigr) = \prod_{p \in P} \mu^p \Bigl( \bigl\{ (\lambda_a)_{a \in A^p} \in \Lambda^p \;\bigm|\; \lambda_a\bigl(\omega, (u_b)_{b \in A}\bigr) = u_a , \ \forall a \in A^p \bigr\} \Bigr) .$$
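The pushforward probability (22a) can be computed by enumeration in a small example. The sketch below is only an illustration under assumptions of our own: two players, each controlling a single agent, hypothetical strategy names, and mixed strategies whose support is restricted to the pure strategies listed (other pure strategies receive zero mass). The solution map is obtained by brute force.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

U = {"a": [0, 1], "b": [0, 1]}

def solve(lam, omega):
    """Solution map (15b) at omega, found by brute force (assumes solvability)."""
    sols = [
        u for u in product(U["a"], U["b"])
        if lam["a"](omega, u) == u[0] and lam["b"](omega, u) == u[1]
    ]
    assert len(sols) == 1, "the toy model is expected to be solvable"
    return sols[0]

# Pure strategies: agent a observes nothing; agent b observes a's decision.
pure_a = {f"a={c}": (lambda w, u, c=c: c) for c in U["a"]}
pure_b = {"copy": lambda w, u: u[0], "flip": lambda w, u: 1 - u[0]}

# Mixed strategies of the two players (exact arithmetic with fractions).
mu_a = {"a=0": Fraction(1, 2), "a=1": Fraction(1, 2)}
mu_b = {"copy": Fraction(1, 4), "flip": Fraction(3, 4)}

def pushforward(omega):
    """Image of the product probability mu_a x mu_b by the mapping (22b)."""
    dist = defaultdict(Fraction)
    for (name_a, w_a), (name_b, w_b) in product(mu_a.items(), mu_b.items()):
        profile = {"a": pure_a[name_a], "b": pure_b[name_b]}
        dist[solve(profile, omega)] += w_a * w_b
    return dict(dist)

Q = pushforward(omega=0)
print(Q[(0, 1)])        # 3/8
print(sum(Q.values()))  # 1
```

Each pair of pure strategies is mapped to the decision profile it produces, and the weights of the pairs leading to the same profile are added, which is exactly what a pushforward does on a finite set.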

#### 3.2.2 Product-mixed W-strategies

In a mixed W-strategy, the executive agents of player $p$ can be correlated because the probability $\mu^p$ in Definition 21 is a joint probability on the product space $\Lambda^p = \prod_{a \in A^p} \Lambda_a$. We now introduce product-mixed W-strategies, where the executive agents of player $p$ are independent in the sense that the probability $\pi^p$ is the product of individual probabilities, each of them on the individual space $\Lambda_a$ of the strategies of one agent $a \in A^p$.

We consider a finite W-game, as in Definition 3.1. A product-mixed W-strategy for player $p \in P$ is an element $\pi^p = (\pi^p_a)_{a \in A^p}$ of $\prod_{a \in A^p} \Delta(\Lambda_a)$. The product-mixed W-strategy $\pi^p$ induces a product probability $\bigotimes_{a \in A^p} \pi^p_a$ on the set $\Lambda^p$ (by an abuse of notation, we will sometimes write $\pi^p = \bigotimes_{a \in A^p} \pi^p_a$), which is a mixed W-strategy as in Definition 21.

#### 3.2.3 Behavioral W-strategies

We formalize the intuition of behavioral strategies in W-games by the following definition of behavioral W-strategies.

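We consider a finite W-game, as in Definition 3.1. A behavioral W-strategy for player $p \in P$ is a family $\beta^p = (\beta^p_a)_{a \in A^p}$, where

$$\beta^p_a : \mathbb{H} \times \mathcal{U}_a \to [0, 1] , \quad (h, U_a) \mapsto \beta^p_a(U_a \mid h) \tag{23}$$

is an $\mathcal{I}_a$-measurable stochastic kernel for each $a \in A^p$, that is, if one of the two equivalent statements holds true:

1. on the one hand, the function $h \mapsto \beta^p_a(U_a \mid h)$ is $\mathcal{I}_a$-measurable, for any $U_a \in \mathcal{U}_a$ and, on the other hand, each $\beta^p_a(\cdot \mid h)$ is a probability distribution on the finite set $\mathbb{U}_a$, for any $h \in \mathbb{H}$,

2. on the one hand, $h' \sim_a h'' \implies \beta^p_a(\cdot \mid h') = \beta^p_a(\cdot \mid h'')$, for any $h', h'' \in \mathbb{H}$ and, on the other hand, for any $h \in \mathbb{H}$, we have $\beta^p_a(\{u_a\} \mid h) \ge 0$, $\forall u_a \in \mathbb{U}_a$, and $\sum_{u_a \in \mathbb{U}_a} \beta^p_a(\{u_a\} \mid h) = 1$.

The equivalences come from the fact that the sets $\mathbb{H}$ and $\mathbb{U}_a$ are finite and equipped with their respective complete fields, and by Proposition 2.1, and especially (4b).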

### 3.3 Relations between product-mixed and behavioral W-strategies

Here, we show that product-mixed and behavioral W-strategies are “equivalent” in the sense that a product-mixed W-strategy naturally induces a behavioral W-strategy, and that a behavioral W-strategy can be “realized” as a product-mixed W-strategy (see Figure 1).

#### From product-mixed to behavioral W-strategies

We prove that a product-mixed W-strategy naturally induces a behavioral W-strategy.

We consider a finite W-game, as in Definition 3.1, and a player $p \in P$.

For any product-mixed W-strategy $\pi^p = (\pi^p_a)_{a \in A^p} \in \prod_{a \in A^p} \Delta(\Lambda_a)$, as in Definition 3.2.2, we define, for any agent $a \in A^p$,

$$\hat{\pi}^p_a\bigl(\{u_a\} \mid h\bigr) = \pi^p_a\Bigl( \bigl\{ \lambda_a \in \Lambda_a \,;\ \lambda_a(h) = u_a \bigr\} \Bigr) , \quad \forall u_a \in \mathbb{U}_a , \quad \forall h \in \mathbb{H} . \tag{24}$$

Then, $\hat{\pi}^p = (\hat{\pi}^p_a)_{a \in A^p}$ is a behavioral W-strategy, as in Definition 3.2.3.

###### Proof.

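Let a product-mixed W-strategy $\pi^p = (\pi^p_a)_{a \in A^p}$ be given.

To prove that (24) defines a behavioral W-strategy, we have to show (see Item 1 in Definition 3.2.3), on the one hand, that the function $h \mapsto \hat{\pi}^p_a(U_a \mid h)$ is $\mathcal{I}_a$-measurable, for any $U_a \in \mathcal{U}_a$ and, on the other hand, that each $\hat{\pi}^p_a(\cdot \mid h)$ is a probability on the finite set $\mathbb{U}_a$, for any $h \in \mathbb{H}$. For this purpose, we will use the more practical characterization of Item 2 in Definition 3.2.3.

Let us fix $a \in A^p$. Let $h', h'' \in \mathbb{H}$ be such that $h' \sim_a h''$, where we recall that the classes of the equivalence relation $\sim_a$ in (3) are exactly the atoms in $\langle \mathcal{I}_a \rangle$. By (6), we have that $\lambda_a(h') = \lambda_a(h'')$, for any $\lambda_a \in \Lambda_a$. Therefore, from the expression (24), we have obtained that $\hat{\pi}^p_a(\cdot \mid h') = \hat{\pi}^p_a(\cdot \mid h'')$, hence that the function $h \mapsto \hat{\pi}^p_a(U_a \mid h)$ is $\mathcal{I}_a$-measurable, by Proposition 2.1, and especially (4b).

By the expression (24), we have that $\hat{\pi}^p_a(\{u_a\} \mid h) \ge 0$, and, since $\pi^p_a$ is a probability on $\Lambda_a$, that $\sum_{u_a \in \mathbb{U}_a} \hat{\pi}^p_a(\{u_a\} \mid h) = 1$.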

This ends the proof. ∎
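Formula (24) can be checked numerically on a toy example. In the following sketch, the configuration space, the information structure of the agent and the weights of the probability on pure strategies are all hypothetical; the function `induced_kernel` adds up the mass of the pure W-strategies that play a given decision at a given configuration.

```python
from fractions import Fraction
from itertools import product

# Toy model: h = (omega, u_a); the agent observes omega only.
H = list(product([0, 1], repeat=2))
atoms_a = [[h for h in H if h[0] == w] for w in (0, 1)]
decisions = [0, 1]

# Pure W-strategies of the agent: one decision per atom, stored as lookup tables.
strategies = [
    {h: u for atom, u in zip(atoms_a, choice) for h in atom}
    for choice in product(decisions, repeat=len(atoms_a))
]

# A probability on the four pure strategies (arbitrary illustrative weights).
pi_a = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]

def induced_kernel(u, h):
    """Formula (24): mass of the pure strategies that play u at configuration h."""
    return sum(w for w, lam in zip(pi_a, strategies) if lam[h] == u)

print(induced_kernel(0, (0, 0)))  # 3/10
print(induced_kernel(0, (1, 0)))  # 2/5
```

The induced kernel sums to one over decisions and takes the same values at two configurations of the same atom, which are the two properties established in the proof above.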

#### From behavioral to product-mixed W-strategies

We prove that a behavioral W-strategy can be “realized” as a product-mixed W-strategy.

We consider a finite W-game, as in Definition 3.1, and a player $p \in P$.

For any behavioral W-strategy $\beta^p = (\beta^p_a)_{a \in A^p}$, as in Definition 3.2.3, there exists a product-mixed W-strategy $\check{\beta}^p = (\check{\beta}^p_a)_{a \in A^p}$, as in Definition 3.2.2, with the property that, for any agent $a$ in $A^p$, we have

$$\check{\beta}^p_a\Bigl( \bigl\{ \lambda_a \in \Lambda_a \,;\ \lambda_a(h) = u_a \bigr\} \Bigr) = \beta^p_a\bigl(\{u_a\} \mid h\bigr) , \quad \forall u_a \in \mathbb{U}_a , \quad \forall h \in \mathbb{H} . \tag{25}$$

###### Proof.

We consider a fixed agent $a \in A^p$.

On the one hand, by Proposition 2.1, we get that

 \policy\agent∈\POLICY\agent⟺∃(\controlG\agent\agent)G\agent∈⟨\tribu\Information\agent⟩∈\CONTROL⟨\tribu\Information\agent⟩\agent\eqsepv\policy\agent\np\history=\controlG\agent\agent\eqsepv∀G\agent∈⟨\tribu\Information\agent⟩\eqsepv∀\history