# Comparison of Information Structures for Zero-Sum Games and a Partial Converse to Blackwell Ordering in Standard Borel Spaces

In statistical decision theory involving a single decision-maker, one says that an information structure is better than another one if for any cost function involving a hidden state variable and an action variable which is restricted to be only a function of some measurement, the solution value under the former is not worse than the value under the latter. For finite probability spaces, Blackwell's celebrated theorem on comparison of information structures leads to a complete characterization on when one information structure is better than another. For stochastic games with incomplete information, due to the presence of competition among decision makers, in general such an ordering is not possible since additional information can lead to equilibria perturbations with positive or negative values to a player. However, for zero-sum games in a finite probability space, Pęski introduced a complete characterization of ordering of information structures. In this paper, we obtain an infinite dimensional (standard Borel) generalization of Pęski's result. A corollary of our analysis is that more information cannot hurt a decision maker taking part in a zero-sum game in standard Borel spaces. During our analysis, we establish two novel supporting results: (i) a partial converse to Blackwell's ordering of information structures in the standard Borel space setup and (ii) a refined existence result for equilibria in zero-sum games with incomplete information when compared with the prior literature.

## 1 Introduction

Characterizing the value of information structures is a problem in many disciplines involving decision making under uncertainty. In stochastic control theory, it is well-known that more information cannot hurt a given decision maker since the decision maker can always choose to ignore this information. In statistical decision theory involving a single decision maker, one says that an information structure is better than another one if for any given measurable and bounded cost function involving a hidden state variable and an action variable which is restricted to be only a function of some measurement, the solution value obtained under optimal policies under the former is not worse than the value obtained under the latter. For finite probability spaces, Blackwell’s celebrated theorem [6] on the ordering of information structures obtains a precise characterization of when an information structure is better. This finding has inspired much further research as reviewed in e.g. [10, 30].

Since Blackwell’s seminal 1953 paper [6], significant work has been done to extend Blackwell’s results to team problems and games. Stochastic team problems (known also as identical interest games) were studied in a finite-space setting by Lehrer, Rosenberg, and Shmaya [13]; see also [40, Chapter 4]. For general games, however, such an ordering is not possible: because of competition, additional information perturbs the equilibrium in a way that is not necessarily monotone, so that, unlike in a team setup, information can have both positive and negative value to a player. Some of the earlier accounts of such phenomena are [18] and [2], where the latter studied the comparison of information structures for team-like (LQG) and zero-sum like (quadratic duopoly) games.

As noted above, for general non-zero sum game problems, informational aspects are very challenging to address and more information can hurt some or even all of the players in a system, see e.g. [18, 17, 20, 1]. To make this discussion more concrete, we provide the following example due to Bassan et al. [5].

###### Example 1.1

Consider a card drawn at random from a deck, where its colour can be either red or black, each with probability $1/2$. Player 1 first declares his guess of the colour, and then, after hearing what Player 1 guessed, Player 2 submits her guess for the colour. If both players guess the same colour, the payout is \$2 each, whereas if the guesses differ, the player who guesses correctly receives a payout of \$6 and the other player receives \$0. In the case where both players are uninformed about the colour of the card, the expected payout is \$3 each, as Player 1’s optimal strategy is arbitrary, and Player 2’s optimal strategy is to guess the opposite colour of what Player 1 guessed.

In the case where both players are informed of the colour of the card prior to declaring their guess, the equilibrium for the game occurs when both players guess the true colour of the card. In this case, the expected payout becomes \$2 for each player.
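The payoffs in Example 1.1 can be checked by backward induction. The following is a minimal sketch, assuming the payoff rule exactly as stated in the example; the function names (`payoff`, `solve_sequential`, and so on) are illustrative and do not come from [5].

```python
from fractions import Fraction

COLOURS = ("red", "black")
PRIOR = {c: Fraction(1, 2) for c in COLOURS}


def payoff(card, g1, g2):
    """Payoffs (player 1, player 2) as described in Example 1.1."""
    if g1 == g2:
        return (Fraction(2), Fraction(2))
    # The guesses differ, so exactly one player is correct.
    return (Fraction(6), Fraction(0)) if g1 == card else (Fraction(0), Fraction(6))


def expected(belief, g1, g2):
    """Expected payoffs under a belief given as {colour: probability}."""
    e1 = sum(p * payoff(c, g1, g2)[0] for c, p in belief.items())
    e2 = sum(p * payoff(c, g1, g2)[1] for c, p in belief.items())
    return e1, e2


def solve_sequential(belief):
    """Player 2 best-responds to player 1's guess; player 1 anticipates this."""
    best = None
    for g1 in COLOURS:
        g2 = max(COLOURS, key=lambda g: expected(belief, g1, g)[1])
        value = expected(belief, g1, g2)
        if best is None or value[0] > best[0]:
            best = value
    return best


def uninformed_value():
    """Neither player sees the card: both act on the prior."""
    return solve_sequential(PRIOR)


def informed_value():
    """Both players see the card: solve for each card, then average."""
    total1 = total2 = Fraction(0)
    for card, p in PRIOR.items():
        point_mass = {c: Fraction(int(c == card)) for c in COLOURS}
        v1, v2 = solve_sequential(point_mass)
        total1 += p * v1
        total2 += p * v2
    return total1, total2
```

Under these assumptions, `uninformed_value()` gives an expected payout of 3 for each player and `informed_value()` gives 2 for each player, so the additional information hurts both players.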

Bassan et al. [5] further provided sufficient conditions for games to have the ‘positive value of information property’, where providing additional information to some or all players results in greater or equal payoffs for all players [5]. Gossner and Mertens highlighted zero-sum games as a particularly interesting class to study in the context of ordering information structures in games and did preliminary work on this ordering [16]; zero-sum games provide a worthwhile class of games to study due to the fact that, under mild conditions, every game has a value (achieved at a saddle point).

For comparison of information structures in zero-sum games with finite measurement and action spaces, Pęski provided necessary and sufficient conditions, and thus a complete characterization [29]. Prior to Pęski’s results, De Meyer, Lehrer, and Rosenberg had shown that the value of information is positive in zero-sum games, albeit with a slightly different setup than Pęski, where their payoff depended on an individual ‘type’ for each player rather than a common state of nature; their results were applicable for infinite action spaces and finite type spaces [25]. Furthermore, Lehrer and Shmaya studied a ‘malevolent nature’ zero-sum game played between nature and a player in a finite setting, and characterized a partial ordering of information structures for these games [24].

In this paper, we generalize Pęski’s results to a broad class of zero-sum games with standard Borel measurement and action spaces: we recall that a metric space which is complete and separable is called a Polish space, and a Borel subset of a Polish space is called a standard Borel space. Finite dimensional real vector spaces are important examples of such spaces.

Toward this goal, additional supporting results, which may be of independent interest, are obtained: sufficient conditions are presented (i) for the existence of saddle-point equilibria in zero-sum games with incomplete information and (ii) for a partial converse to Blackwell’s ordering when the player has standard Borel measurement and action spaces and the unknown variable also takes values from a standard Borel space.

## 2 A review of prior results and contributions

### 2.1 Comparison of information structures in single-agent problems

Let $x$ be an $\mathbb{X}$-valued random variable with $\mathbb{X}$ being a standard Borel space. We call $x$ the state of nature; its distribution $P$ is known by the decision maker but $x$ is not. Recall that a standard Borel space is a Borel subset of a complete, separable, metric (Polish) space. Let $\mathbb{Y}$, our measurement space, be another standard Borel space and $y$ be $\mathbb{Y}$-valued, defined with

$$y=g(x,v)$$

for some independent noise variable $v$ (which, without any loss, can be taken to be $[0,1]$-valued). In the above, we can view $g$ as inducing a measurement channel $Q$, which is a stochastic kernel or a regular conditional probability measure from $\mathbb{X}$ to $\mathbb{Y}$ in the sense that $Q(\cdot|x)$ is a probability measure on the (Borel) $\sigma$-algebra $\mathcal{B}(\mathbb{Y})$ on $\mathbb{Y}$ for every $x\in\mathbb{X}$, and $Q(A|\cdot):\mathbb{X}\to[0,1]$ is a Borel measurable function for every $A\in\mathcal{B}(\mathbb{Y})$.

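Given fixed $\mathbb{X}$, $\mathbb{Y}$, and distribution $P$ of $x$, a single player decision problem is a pair $(c,\mathbb{U})$ of a cost function $c:\mathbb{X}\times\mathbb{U}\to\mathbb{R}$ and an action space $\mathbb{U}$.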

Using stochastic realization results (see Lemma 1.2 in [15], or Lemma 3.1 of [9]), it follows that the functional representation $y=g(x,v)$ is equivalent to a stochastic kernel description of an information structure, since for every kernel $Q$, one can define a measurable function $g$ and a $[0,1]$-valued random variable $v$ so that the representation holds almost surely.

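Let $\mathcal{P}(\mathbb{X})$ denote the set of all probability measures on (the Borel sigma field over) $\mathbb{X}$. For $P\in\mathcal{P}(\mathbb{X})$ and kernel $Q$, we let $PQ$ denote the joint distribution induced on $\mathbb{X}\times\mathbb{Y}$ by channel $Q$ with input distribution $P$:

$$PQ(A)=\int_A Q(dy|x)P(dx),\quad A\in\mathcal{B}(\mathbb{X}\times\mathbb{Y}).$$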

Now, let the objective be one of minimization of the cost

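$$J(P,Q,\gamma)=E^{Q,\gamma}_P[c(x,u)], \tag{1}$$

over the set $\Gamma$ of all admissible measurable policies $\gamma$ with $u=\gamma(y)$, where $c:\mathbb{X}\times\mathbb{U}\to\mathbb{R}$ is a Borel measurable cost function and $E^{Q,\gamma}_P$ denotes the expectation with initial state probability measure given by $P$, under policy $\gamma$, and given channel $Q$.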

The comparison question is the following: when can one compare two measurement channels $Q_1,Q_2$ such that

$$\inf_{\gamma\in\Gamma}J(P,Q_1,\gamma)\le\inf_{\gamma\in\Gamma}J(P,Q_2,\gamma),$$

for a large class of single-player decision problems in (1)?
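For finite spaces the optimal cost in (1) can be computed directly, since the best action can be chosen separately for each measurement. The following is a minimal sketch under that finiteness assumption; the function name `optimal_cost` and the three example channels are illustrative choices and are not taken from the paper.

```python
def optimal_cost(prior, channel, cost):
    """Return the infimum over policies gamma: Y -> U of E[c(x, gamma(y))].

    prior[x]      = P(x)
    channel[x][y] = Q(y | x)
    cost[x][u]    = c(x, u)
    All spaces are finite and indexed by 0, 1, 2, ...
    """
    n_x, n_y, n_u = len(prior), len(channel[0]), len(cost[0])
    total = 0.0
    for y in range(n_y):
        # For each measurement, pick the action with the smallest
        # (unnormalised) posterior expected cost.
        total += min(
            sum(prior[x] * channel[x][y] * cost[x][u] for x in range(n_x))
            for u in range(n_u)
        )
    return total


prior = [0.5, 0.5]
zero_one_cost = [[0.0, 1.0], [1.0, 0.0]]  # c(x, u) = 1 if u != x, else 0
perfect = [[1.0, 0.0], [0.0, 1.0]]        # y reveals x
noisy = [[0.8, 0.2], [0.2, 0.8]]          # y equals x with probability 0.8
useless = [[0.5, 0.5], [0.5, 0.5]]        # y is independent of x

values = [optimal_cost(prior, q, zero_one_cost) for q in (perfect, noisy, useless)]
```

In this example the optimal costs are ordered as the channels are: the perfect channel attains the lowest cost, the noisy channel an intermediate one, and the uninformative channel the highest.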

We now recall the notion of garbling. We note that garbling is sometimes defined to be equivalent to physical degradedness of communication channels (as opposed to stochastic degradedness) [11]; in this paper, however, we will take stochastic degradedness and garbling to be equivalent.

###### Definition 2.1

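An information structure induced by some channel $Q_2$ is garbled (or stochastically degraded) with respect to another one, $Q_1$, if there exists a channel $Q'$ on $\mathbb{Y}\times\mathbb{Y}$ such that

$$Q_2(B|x)=\int_{\mathbb{Y}}Q'(B|y)Q_1(dy|x),\quad B\in\mathcal{B}(\mathbb{Y}),\quad P\ \text{a.s.}\ x\in\mathbb{X}.$$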
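In the finite case, the garbling relation in Definition 2.1 is a product of row-stochastic matrices. The sketch below assumes finite spaces and uses illustrative names (`garble`, `forward_blackwell_holds`). It builds a garbled channel and then spot-checks on randomly drawn cost functions that the original channel never yields a higher optimal cost. This is a numerical sanity check of the forward direction only, not a proof.

```python
import random


def garble(q1, q_prime):
    """Q2(b | x) = sum_y Q'(b | y) Q1(y | x): channel Q1 followed by Q'."""
    n_b = len(q_prime[0])
    return [
        [sum(row[y] * q_prime[y][b] for y in range(len(row))) for b in range(n_b)]
        for row in q1
    ]


def optimal_cost(prior, channel, cost):
    """Infimum over deterministic policies of the expected cost (finite case)."""
    return sum(
        min(
            sum(prior[x] * channel[x][y] * cost[x][u] for x in range(len(prior)))
            for u in range(len(cost[0]))
        )
        for y in range(len(channel[0]))
    )


def forward_blackwell_holds(prior, q1, q2, n_u=3, trials=200, seed=0):
    """Check inf J(P, Q1) <= inf J(P, Q2) on randomly drawn cost matrices."""
    rng = random.Random(seed)
    for _ in range(trials):
        cost = [[rng.uniform(-1.0, 1.0) for _ in range(n_u)] for _ in prior]
        if optimal_cost(prior, q1, cost) > optimal_cost(prior, q2, cost) + 1e-12:
            return False
    return True


prior = [0.3, 0.7]
q1 = [[0.9, 0.1], [0.2, 0.8]]
q_prime = [[0.7, 0.3], [0.4, 0.6]]   # hypothetical garbling kernel Q'
q2 = garble(q1, q_prime)
```

Each row of `q2` is again a probability vector, and `forward_blackwell_holds(prior, q1, q2)` should return `True` for any choice of `q_prime`.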

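We also define the following notion:

###### Definition 2.2

An information structure $\mu$ is more informative than another information structure $\Phi$ if

$$\inf_{\gamma\in\Gamma}E^{\Phi,\gamma}_P[c(x,u)]\ge\inf_{\gamma\in\Gamma}E^{\mu,\gamma}_P[c(x,u)]$$

for all single player decision problems $(c,\mathbb{U})$.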

In view of Proposition 2.1, we have the following.

We emphasize that in Definition 2.2, $\mathbb{U}$ is also a design variable for the decision problem. For instance, if $\mathbb{U}$ were a singleton, then the comparison of information structures would be meaningless. With this in mind, we state Blackwell’s classical result in the following.

###### Theorem 2.1

[Blackwell [6]] Let $\mathbb{X},\mathbb{Y}$ be finite spaces. The following are equivalent:

• (i) $Q_2$ is weakly stochastically degraded with respect to $Q_1$ (that is, $Q_2$ is a garbling of $Q_1$).

• (ii) The information structure induced by channel $Q_1$ is more informative than the one induced by channel $Q_2$ for all single player decision problems with finite $\mathbb{U}$.

That (i) implies (ii) follows from the next result, which is an immediate finding in statistical decision theory, and Jensen’s inequality.

###### Proposition 2.1

The function

$$V(P):=\inf_{u\in\mathbb{U}}\int c(x,u)P(dx),$$

is concave in $P$, under the assumption that $c$ is measurable and bounded.

For a proof of this proposition see [40, Theorem 4.3.1].
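The way concavity and Jensen’s inequality combine to give (i) implies (ii) can be sketched as follows. This is an informal outline rather than the argument of [40]: it assumes that the infimum over policies can be taken measurement by measurement (immediate in the finite case), and it writes $\tilde{y}$ for the garbled measurement obtained by passing $y$ through the kernel $Q'$, which is notation introduced here for illustration.

```latex
% Concavity: V is a pointwise infimum of functions that are linear in P.
\begin{align*}
V(\theta P + (1-\theta)P')
  &= \inf_{u \in \mathbb{U}} \Big[ \theta \int c(x,u)\,P(dx)
     + (1-\theta) \int c(x,u)\,P'(dx) \Big] \\
  &\ge \theta V(P) + (1-\theta) V(P'), \qquad \theta \in [0,1].
\end{align*}

% Garbling: x -> y -> \tilde{y} is a Markov chain, so the posterior given
% \tilde{y} is a mixture of the posteriors given y.
\begin{align*}
P(dx \mid \tilde{y}) &= \int P(dx \mid y)\, P(dy \mid \tilde{y}).
\end{align*}

% Jensen's inequality for the concave function V, then the tower property.
\begin{align*}
\inf_{\gamma \in \Gamma} J(P,Q_2,\gamma)
  &= E\big[ V\big(P(\cdot \mid \tilde{y})\big) \big] \\
  &\ge E\Big[ E\big[ V\big(P(\cdot \mid y)\big) \,\big|\, \tilde{y} \big] \Big] \\
  &= E\big[ V\big(P(\cdot \mid y)\big) \big]
   = \inf_{\gamma \in \Gamma} J(P,Q_1,\gamma).
\end{align*}
```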

The converse, (ii) implies (i), is significantly more challenging. For the case with general spaces, related results are attributed to [7], [28], and [35], which relate an ordering of information structures in terms of dilations to comparisons under concave functions defined on conditional probability measures. A very concise yet informative review is in [10, p. 130-131] and a more comprehensive review is in [36]. We will present a direct proof that will be utilized in our main result of the paper and present a comparative discussion.

### 2.2 Comparison of information structures in zero-sum game problems

Now, consider a zero-sum game generalization of the problem above, with two decision makers.

Consider a two-agent setup as follows.

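$$y^i=g^i(x,v^i),\quad i=1,2,$$

where the noise variables $v^1$ and $v^2$ are independent. Suppose that $g^i$ induces a channel $Q^i$ for $i=1,2$ as described earlier and DM $i$ has only access to $y^i$. Let $\gamma^i:\mathbb{Y}^i\to\mathbb{U}^i$, $i=1,2$, denote the measurable policies of the agents.

Given fixed $\mathbb{X}$, $\mathbb{Y}^1$, $\mathbb{Y}^2$, and a distribution $P$ such that $x\sim P$, a game $G$ is a triple $(c,\mathbb{U}^1,\mathbb{U}^2)$ of a measurable and bounded cost function $c:\mathbb{X}\times\mathbb{U}^1\times\mathbb{U}^2\to\mathbb{R}$ and action spaces $\mathbb{U}^i$ for each player $i=1,2$.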

We will impose one of the following conditions on the information structures. We note that Assumption 2.2 implies Assumption 2.1, but Assumption 2.2 often allows for a simpler interpretation. That this implication holds is a consequence of the independent measurements reduction formulation to be explained in detail later in the paper (see Theorem 3.1). The results will be presented under the more general Assumption 2.1.

###### Assumption 2.1

The information structure is absolutely continuous with respect to a product measure:

$$P(dy^1,dy^2,dx)\ll\bar{Q}^1(dy^1)\bar{Q}^2(dy^2)P(dx),$$

for reference probability measures $\bar{Q}^i$, $i=1,2$. That is, there exists an integrable $f$ which satisfies for every Borel $A,B,C$

$$P(y^1\in B,y^2\in C,x\in A)=\int_{A\times B\times C}f(x,y^1,y^2)P(dx)\bar{Q}^1(dy^1)\bar{Q}^2(dy^2)$$

###### Assumption 2.2

The following conditional independence (or Markov) condition holds:

$$P(dy^1,dy^2,dx)=Q^1(dy^1|x)Q^2(dy^2|x)P(dx)$$

where the measurements of agents are absolutely continuous so that for $i=1,2$, there exists a non-negative function $f^i$ and a reference probability measure $\bar{Q}^i$ such that for all Borel $S$:

$$Q^i(y^i\in S|x)=\int_S f^i(y^i,x)\bar{Q}^i(dy^i)$$

Let the joint measure $P(dy^1,dy^2,dx)$ define the information structure for the game and let us denote this with $\mu$. For a zero-sum game with the conditional independence assumption in Assumption 2.2, an information structure $\mu$ consists of private information structures $\mu^1$ and $\mu^2$: define $\mu^i$ as the joint probability measure induced on $\mathbb{X}\times\mathbb{Y}^i$ by measurement channel $Q^i$ with input distribution $P$. For our analysis we will also allow policies to be randomized with independent randomness. Under conditional independence, let us define the following cost functional for a single-stage setup:

$$J(P,\mu^1,\gamma^1,\mu^2,\gamma^2)=E^{Q^1,Q^2,\underline{\gamma}}_P[c(x,u^1,u^2)]=\int_{\mathbb{X}\times\mathbb{Y}^1\times\mathbb{Y}^2}c(x,\gamma^1(y^1),\gamma^2(y^2))Q^1(dy^1|x)Q^2(dy^2|x)P(dx)$$

Suppose that DM 1 (the minimizer) wishes to minimize the cost and DM 2 (the maximizer) wishes to maximize the cost. Let $\Gamma^1$ and $\Gamma^2$ be defined as earlier for each decision maker.

###### Definition 2.3

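Given an information structure $\mu$, we say that $(\gamma^{1,*},\gamma^{2,*})$ is an equilibrium for the zero-sum game if

$$\inf_{\gamma^1\in\Gamma^1}J(P,\mu^1,\gamma^1,\mu^2,\gamma^{2,*})=J(P,\mu^1,\gamma^{1,*},\mu^2,\gamma^{2,*})=\sup_{\gamma^2\in\Gamma^2}J(P,\mu^1,\gamma^{1,*},\mu^2,\gamma^2).$$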

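Let $V^\mu_G(\gamma^1,\gamma^2)$ be the expected value of the cost function for the maximizer, for some game $G$, given information structure $\mu$ and strategies $\gamma^1,\gamma^2$ for the minimizer and maximizer, respectively:

$$V^\mu_G(\gamma^1,\gamma^2)=\int c(x,\gamma^1(y^1),\gamma^2(y^2))Q^1(dy^1|x)Q^2(dy^2|x)P(dx)$$

Let $V^*(G,\mu)$ be $V^\mu_G(\gamma^{1,*},\gamma^{2,*})$ where $\gamma^{1,*},\gamma^{2,*}$ are chosen to be the equilibrium strategies for the players.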
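For finite spaces the expected cost above is a finite sum, and the deterministic policies can be enumerated. The sketch below is an illustration under that assumption, with hypothetical names (`expected_cost`, `pure_values`, `guess_cost`). It computes the min-max and max-min values over deterministic policies only, so it finds the equilibrium value only when these two numbers coincide; in general randomized policies are needed.

```python
from itertools import product


def expected_cost(prior, q1, q2, cost, g1, g2):
    """Sum over x, y1, y2 of P(x) Q1(y1|x) Q2(y2|x) c(x, g1[y1], g2[y2])."""
    return sum(
        prior[x] * q1[x][y1] * q2[x][y2] * cost(x, g1[y1], g2[y2])
        for x in range(len(prior))
        for y1 in range(len(q1[0]))
        for y2 in range(len(q2[0]))
    )


def pure_values(prior, q1, q2, cost, n_u1, n_u2):
    """Upper (min-max) and lower (max-min) values over deterministic policies."""
    pols1 = list(product(range(n_u1), repeat=len(q1[0])))  # minimizer policies
    pols2 = list(product(range(n_u2), repeat=len(q2[0])))  # maximizer policies
    table = [
        [expected_cost(prior, q1, q2, cost, g1, g2) for g2 in pols2]
        for g1 in pols1
    ]
    upper = min(max(row) for row in table)
    lower = max(
        min(table[i][j] for i in range(len(pols1))) for j in range(len(pols2))
    )
    return upper, lower


def guess_cost(x, u1, u2):
    """Illustrative cost: +1 if the maximizer guesses x, -1 if the minimizer does."""
    return float(u2 == x) - float(u1 == x)


prior = [0.5, 0.5]
blind = [[1.0], [1.0]]               # a single measurement value: no information
perfect = [[1.0, 0.0], [0.0, 1.0]]   # the measurement reveals x

informed_max = pure_values(prior, blind, perfect, guess_cost, 2, 2)
both_blind = pure_values(prior, blind, blind, guess_cost, 2, 2)
```

In this example the upper and lower values coincide in both cases, and giving the maximizer a perfect measurement raises the value from 0 to 0.5.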

###### Definition 2.4

For fixed $\mathbb{X}$ and $P$ such that $x\sim P$, we say that an information structure $\mu$ is better for the maximizer than information structure $\Phi$ over all games in a class of games $\mathcal{G}$ if and only if for all games $G$ in $\mathcal{G}$:

$$V^*(G,\mu)\ge V^*(G,\Phi)$$

###### Definition 2.5

We denote by $\kappa^i\mu$ the information structure in which player $i$’s information from $\mu$ is garbled by a stochastic kernel $\kappa^i$. Explicitly, this means the information structure becomes:

$$(\kappa^i\mu)(B,dy^{-i},dx)=\int_{\mathbb{Y}^i}\kappa^i(B|y^i)\mu(dy^i,dy^{-i},dx),\quad B\in\mathcal{B}(\mathbb{Y}^i)$$

We use $\mathcal{K}^i$ to denote the space of all such stochastic kernels for player $i$.
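For a finite information structure, the garbling in Definition 2.5 amounts to redistributing the mass of each atom along one player’s coordinate. The following minimal sketch assumes the joint measure is stored as a dictionary keyed by `(y1, y2, x)`; the function name `garble_player` and the example numbers are illustrative.

```python
from collections import defaultdict


def garble_player(mu, kappa, player):
    """Apply kernel kappa to one player's measurement in a joint measure.

    mu[(y1, y2, x)] = probability of the atom (y1, y2, x)
    kappa[y][b]     = kappa(b | y)
    player          = 1 or 2, the player whose measurement is garbled
    """
    out = defaultdict(float)
    for (y1, y2, x), p in mu.items():
        own = y1 if player == 1 else y2
        for b, k in kappa[own].items():
            key = (b, y2, x) if player == 1 else (y1, b, x)
            out[key] += p * k
    return dict(out)


mu = {(0, 0, 0): 0.4, (0, 1, 0): 0.1, (1, 0, 1): 0.1, (1, 1, 1): 0.4}
erase = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: 0.5}}  # replaces y by a coin flip
garbled = garble_player(mu, erase, player=1)


def marginal_y2_x(measure):
    """Marginal on the other player's measurement and the state."""
    out = defaultdict(float)
    for (y1, y2, x), p in measure.items():
        out[(y2, x)] += p
    return dict(out)
```

The garbling leaves the joint law of the other player’s measurement and the state unchanged, which `marginal_y2_x` can be used to confirm.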

###### Theorem 2.2 (Pęski [29])

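Let $\mathbb{X},\mathbb{Y}^1,\mathbb{Y}^2$ be finite. For any two information structures $\mu$ and $\Phi$, $\mu$ is better for the maximizer than $\Phi$ over all games with finite action spaces if and only if there exist kernels $\kappa_{\min}\in\mathcal{K}^1$, $\kappa_{\max}\in\mathcal{K}^2$ such that

$$\kappa_{\min}\Phi=\kappa_{\max}\mu.$$

In particular, under Assumption 2.2, we have the more explicit characterization with

$$\kappa_{\min}Q^1_\Phi=Q^1_\mu\quad\text{and}\quad Q^2_\Phi=\kappa_{\max}Q^2_\mu,$$

where $Q^i_\mu$ and $Q^i_\Phi$ are the measurement channels for player $i$ under information structures $\mu$ and $\Phi$, respectively.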
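Under Assumption 2.2 and with finite spaces, the explicit characterization can be verified for a given pair of candidate kernels by two matrix products. The sketch below only checks kernels that are supplied to it; searching for such kernels would be a linear feasibility problem, which is not attempted here. All names and numbers are illustrative.

```python
def compose(channel, kernel):
    """(kernel after channel)(b | x) = sum_y channel[x][y] * kernel[y][b]."""
    return [
        [sum(row[y] * kernel[y][b] for y in range(len(row)))
         for b in range(len(kernel[0]))]
        for row in channel
    ]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices of equal shape."""
    return all(
        abs(u - v) <= tol for ra, rb in zip(a, b) for u, v in zip(ra, rb)
    )


def peski_condition_holds(q1_phi, q1_mu, q2_phi, q2_mu, kappa_min, kappa_max):
    """Check kappa_min Q1_Phi = Q1_mu and Q2_Phi = kappa_max Q2_mu."""
    return (close(compose(q1_phi, kappa_min), q1_mu)
            and close(q2_phi, compose(q2_mu, kappa_max)))


identity = [[1.0, 0.0], [0.0, 1.0]]
noisy = [[0.8, 0.2], [0.2, 0.8]]
blurred = [[0.6, 0.4], [0.3, 0.7]]

# Under Phi the minimizer sees x and the maximizer sees a blurred signal;
# under mu the minimizer sees a noisy signal and the maximizer sees x.
holds = peski_condition_holds(identity, noisy, blurred, identity, noisy, blurred)
reversed_roles = peski_condition_holds(noisy, identity, identity, blurred, noisy, blurred)
```

Here `holds` is `True`: moving from $\Phi$ to $\mu$ degrades the minimizer’s channel and improves the maximizer’s. With the roles of the two structures exchanged, the same kernels fail the check.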

In this paper we will obtain a standard Borel generalization of this result.

### 2.3 Team Theoretic Setup

For completeness, we also discuss the team theoretic setup in our review. It follows from [22] and [40, Chapter 4] that Blackwell’s ordering, with a measurement-independent randomization (which can, however, be common across the decision makers), leads to an ordering of information structures for static stochastic team problems.

### 2.4 Contributions

In this paper, we will derive a standard Borel counterpart of Theorem 2.2. While obtaining our results, we will also derive conditions for the existence of saddle points in Bayesian zero-sum games in standard Borel spaces, as well as a converse theorem to Blackwell’s ordering of information structures in the infinite setup.

The contributions of this paper are as follows:

• We will derive a standard Borel counterpart of Theorem 2.2 characterizing an ordering of information structures for zero-sum games (Theorem 4.1).

• We present two supporting results: (a) As a minor technical contribution, we present sufficient conditions for the existence of saddle points in Bayesian zero-sum games with incomplete information in standard Borel spaces (Theorem 3.1). This will build on placing an appropriate topology on the space of policies adopted by the decision makers. Our analysis generalizes existing results in the literature, notably [26] and [23], though as we note in the paper our generalization is rather technical and the conditions in [26] are nearly equivalent to ours. (b) As a further supporting theorem, we will present a partial converse to Blackwell’s ordering theorem for standard Borel spaces, using a separating hyperplane argument and properties of locally convex spaces (Theorem 3.2). This presents an explicit, self-sufficient derivation for a converse theorem to be utilized in our main theorem, though related comprehensive results have been reported in the literature, as we note in the paper.

## 3 Supporting Results on Existence of Saddle-Points and Comparison for Zero-Sum Games with Standard Borel Spaces

### 3.1 On Existence of Saddle-Points and Equilibria

Before focusing on the ordering of information structures, we present a supporting result regarding when equilibrium solutions to zero-sum games exist. In the finite case, equilibrium solutions always exist [37] (through e.g. [3, Theorem 4.4]), but this does not hold true in general [27]. Theorem 3.1 below gives sufficient conditions for equilibrium solutions to exist for games with incomplete information.

The existence of a value for games with incomplete information has been studied rather extensively. For the reader’s convenience, and as a direct proof, we present the result below. To our knowledge, our statement and conditions have not been stated in the prior literature, though results nearly equivalent to ours have been noted rather indirectly. Most notably, [26, Theorem 1] presents an existence result for more general games under conditions whose generality is difficult to interpret: a careful look at condition R1 in [26, p. 625] leads to the conclusion that the authors have nearly (but not exactly) the same condition as condition (ii) below, namely that continuity of the cost function in the actions for every fixed hidden state variable is sufficient. The statements given in [26], however, impose conditions that are not conclusive on this; we attribute this to the fact that the authors utilize [26, Prop. 1(c)] without establishing its relation to item (ii) below (due to the measurability requirement in the statement of [26, Prop. 1(c)]). Our analysis affords simplicity and generality in the condition, since we build on the $w$-$s$ topology, rather than on the weak topology combined directly with Lusin’s theorem [12] as in [26] (we also note that the relation between the weak and $w$-$s$ topologies on probabilities defined on product spaces with a fixed marginal can in fact be established using Lusin’s theorem). Hence, in a strict sense, our conditions are more direct and general as stated.

The comprehensive book [23, Proposition III.4.2.] imposes continuity in all the variables (unlike what is presented below). Furthermore, [23, Proposition III.4.2.] builds on a topology construction on policies which is different from what we present here (we note that in control theory a related construction has been utilized in [8]; see also [32]); the construction in [23] is rather abstract and we would like to caution that in the absence of absolute continuity conditions on the information structure, this construction may lead to a lack of closedness on the sets of admissible policies (or strategic measures) as the counterexample [42, Theorem 2.7] reveals: in this counterexample, which would reduce to the setup studied here with , a sequence of policies is constructed so that for each element of the sequence the action variables of the two decision makers are conditionally independent given their measurements, but the setwise (and hence, weak) limit of the sequence is not conditionally, or otherwise, independent; and thus the limit measure does not belong to the original information structure.

###### Theorem 3.1 (Existence of Equilibria)

For a given game, assume that Assumption 2.1 holds. Further, let the following hold.

(i) The action spaces of players, $\mathbb{U}^1,\mathbb{U}^2$, are compact.

(ii) The cost function $c$ is bounded and continuous in players’ actions, for every state of nature $x$.

Then an equilibrium exists under possibly randomized policies, and so there exists a value of the zero-sum game.

Proof. Step (1): By Assumption 2.1, we can reformulate the problem in a new probability space in which the measurements are independent from the unknown variable $x$. This reformulation, called an independent-measurements reduction, is essentially due to Witsenhausen [38], with a detailed discussion in [39, Section 2.2.], see Figure 1.

The main benefit of this approach is to define a compact/convex policy space for the players (e.g. see [42, Section 2.2]). To complete this reformulation, we note the following holds for some function $f$ and reference probability measures $\bar{Q}^1,\bar{Q}^2$:

$$P(dx,dy^1,dy^2,du^1,du^2)=\zeta(dx)f(x,y^1,y^2)\bar{Q}^1(dy^1)1_{\{\gamma^1(y^1)\in du^1\}}\bar{Q}^2(dy^2)1_{\{\gamma^2(y^2)\in du^2\}}$$

where $1_{\{\cdot\}}$ is the indicator function. Thus, the value function for the game can be written as:

$$V^\mu_G(\gamma^1,\gamma^2)=\int f(x,y^1,y^2)c(x,u^1,u^2)\bar{Q}^1(dy^1)\bar{Q}^2(dy^2)\zeta(dx)$$

We then create a new cost function $c(x,u^1,u^2,y^1,y^2):=f(x,y^1,y^2)c(x,u^1,u^2)$.
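The change of measure in Step (1) can be checked numerically in the finite case. The sketch below assumes the conditionally independent structure of Assumption 2.2 and uniform reference measures, so that the density is $f(x,y^1,y^2)=Q^1(y^1|x)Q^2(y^2|x)/(\bar{Q}^1(y^1)\bar{Q}^2(y^2))$. It compares the expected cost under the original measure with the expected reweighted cost under the product measure. The names and the example cost are illustrative.

```python
def direct_cost(prior, q1, q2, cost, g1, g2):
    """Expected cost under the original measure P(x) Q1(y1|x) Q2(y2|x)."""
    return sum(
        prior[x] * q1[x][y1] * q2[x][y2] * cost(x, g1[y1], g2[y2])
        for x in range(len(prior))
        for y1 in range(len(q1[0]))
        for y2 in range(len(q2[0]))
    )


def density(q1, q2, ref1, ref2, x, y1, y2):
    """f(x, y1, y2): density of the channels w.r.t. the reference measures."""
    return q1[x][y1] * q2[x][y2] / (ref1[y1] * ref2[y2])


def reduced_cost(prior, q1, q2, ref1, ref2, cost, g1, g2):
    """Expected value of f * c when y1, y2 are drawn independently of x."""
    return sum(
        prior[x] * ref1[y1] * ref2[y2]
        * density(q1, q2, ref1, ref2, x, y1, y2)
        * cost(x, g1[y1], g2[y2])
        for x in range(len(prior))
        for y1 in range(len(ref1))
        for y2 in range(len(ref2))
    )


def example_cost(x, u1, u2):
    """Hypothetical bounded cost used only for this check."""
    return (u1 - x) ** 2 - (u2 - x) ** 2


prior = [0.4, 0.6]
q1 = [[0.7, 0.3], [0.2, 0.8]]
q2 = [[0.6, 0.4], [0.1, 0.9]]
ref1 = ref2 = [0.5, 0.5]      # reference measures, assumed uniform here
g1, g2 = [0, 1], [1, 0]       # deterministic policies as lookup tables

original = direct_cost(prior, q1, q2, example_cost, g1, g2)
reduced = reduced_cost(prior, q1, q2, ref1, ref2, example_cost, g1, g2)
```

The two quantities agree up to floating-point error, which is the content of the reduction: the dependence of the measurements on $x$ is moved from the measure into the cost.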

Step (2): Let $x$ be the random state of nature. Let $\gamma^1,\gamma^2$ be the policies for the players, and $u^1,u^2$ be the resulting actions chosen by the players. We allow for policies where $u^i$ is chosen in a random way, i.e. $u^i=\gamma^i(y^i,r^i)$, where $r^i$ is some $[0,1]$-valued independent random variable (we note that any randomized policy, defined as a stochastic kernel from $\mathbb{Y}^i$ to $\mathbb{U}^i$, admits such a stochastic realization; see [15, Lemma 1.2], or [9, Lemma 3.1]).

Step (3): Let $c(x,u^1,u^2,y^1,y^2)$ be the reformulated cost function of this game; under the new product probability measure, we have:

$$V^\mu_G(\gamma^1,\gamma^2)=\int c(x,u^1,u^2,y^1,y^2)(\bar{Q}^1\gamma^1)(dy^1,du^1)(\bar{Q}^2\gamma^2)(dy^2,du^2)\zeta(dx)$$

Here, $\bar{Q}^1\gamma^1$ and $\bar{Q}^2\gamma^2$ are the probability measures induced on the measurement and the action variables. By independence due to the reduction, we can consider the expected cost as a function of the reduced-form policies: $V^\mu_G(\bar{Q}^1\gamma^1,\bar{Q}^2\gamma^2)$. Now, without loss of generality, we fix $\bar{Q}^1\gamma^1$, allowing us to express the above equation in the following form:

$$V^\mu_G(\bar{Q}^1\gamma^1,\bar{Q}^2\gamma^2)=\int(\bar{Q}^2\gamma^2)(dy^2,du^2)\int c(x,u^1,u^2,y^1,y^2)(\bar{Q}^1\gamma^1)(dy^1,du^1)\zeta(dx)$$

Let $\bar{c}(x,u^2,y^2)$ be defined as $\int c(x,u^1,u^2,y^1,y^2)(\bar{Q}^1\gamma^1)(dy^1,du^1)$.

Now that we have an independent-measurements reduction, we will (similar to the analysis from [26, 9, 42]) identify, almost surely, every admissible policy with a probability measure on the product space: we adopt the view that, given game $G$, $\bar{Q}^i\gamma^i$ is a probability measure on $\mathbb{Y}^i\times\mathbb{U}^i$ with fixed marginal $\bar{Q}^i$ on $\mathbb{Y}^i$. We take the set of all such measures as the policy space, since every $\gamma^i$ can be identified with an element of this set almost surely. The pairing of an information structure and a policy induces a probability measure on the five-tuple $(x,y^1,y^2,u^1,u^2)$, with

$$P(dx,dy^1,dy^2,du^1,du^2)=\gamma^1(du^1|y^1)\gamma^2(du^2|y^2)Q^1(dy^1|x)Q^2(dy^2|x)\zeta(dx).$$

This construction allows us to obtain a proper topology to work with for spaces of policies with desirable convexity and compactness properties.

We now recall the $w$-$s$ topology [33] on the set of probability measures $\mathcal{P}(\mathbb{Y}^i\times\mathbb{U}^i)$; this is the coarsest topology under which $\int f(y^i,u^i)\,\nu(dy^i,du^i)$ is continuous in $\nu$ for every measurable and bounded $f$ which is continuous in $u^i$ for every $y^i$ (but unlike the weak topology, $f$ does not need to be continuous in $y^i$). Now, since the exogenous variables are fixed, weak convergence in this setting is equivalent to $w$-$s$ convergence (see [39]), and continuity in the exogenous variable is not needed here. Consider a sequence of reduced-form policies $\bar{Q}^2\gamma^2_n$ which converges to $\bar{Q}^2\gamma^2$ weakly. We have that $\bar{c}(x,u^2,y^2)$ is continuous in $u^2$. Since $\bar{Q}^2$ is fixed, the marginals on $\mathbb{Y}^2$ are fixed. Therefore, by [33, Theorem 3.10] (or [4, Theorem 2.5]), we can use the $w$-$s$ topology on the set of probability measures $\mathcal{P}(\mathbb{Y}^2\times\mathbb{U}^2)$. And so we have continuity of $V^\mu_G$ in $\bar{Q}^2\gamma^2$ in the $w$-$s$ topology and, by the equivalence in this setting, the weak topology.

This also holds for continuity in $\bar{Q}^1\gamma^1$ in the reverse case where we fix $\bar{Q}^2\gamma^2$. Therefore, in general, we have that $V^\mu_G$ is continuous in $\bar{Q}^i\gamma^i$ when $\bar{Q}^{-i}\gamma^{-i}$ is fixed.

Step (4): Let the set of measures $\bar{Q}^i\gamma^i$ be our reduced policy space, where $\bar{Q}^i$ is the fixed marginal of the measure on $\mathbb{Y}^i$. Following from [42, Section 2.1], the space of all such measures is compact under weak convergence.

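Step (5): We observe that $V^\mu_G$ is linear and hence is both concave and convex in each entry. For completeness, we establish this linearity result. Take $\theta\in[0,1]$. Then, without loss of generality, we fix $\bar{Q}^1\gamma^1$ and obtain the following:

$$\begin{aligned}
V^\mu_G(\bar{Q}^1\gamma^1,\theta\bar{Q}^2\gamma^2+(1-\theta)\tilde{Q}^2\tilde{\gamma}^2)
&=\int(\theta\bar{Q}^2\gamma^2+(1-\theta)\tilde{Q}^2\tilde{\gamma}^2)(dy^2,du^2)\int_{y^2,u^2}\bar{c}(x,u^2,y^2)\\
&=\int(\theta\bar{Q}^2\gamma^2)(dy^2,du^2)\int_{y^2,u^2}\bar{c}(x,u^2,y^2)+\int(1-\theta)(\tilde{Q}^2\tilde{\gamma}^2)(dy^2,du^2)\int_{y^2,u^2}\bar{c}(x,u^2,y^2)\\
&=\theta V^\mu_G(\bar{Q}^1\gamma^1,\bar{Q}^2\gamma^2)+(1-\theta)V^\mu_G(\bar{Q}^1\gamma^1,\tilde{Q}^2\tilde{\gamma}^2)
\end{aligned}$$

Lastly, we recall that, under the weak topology, the space of probability measures is a metric space, and thus our spaces are Hausdorff spaces.

Since $V^\mu_G$ is continuous, and convex/concave on the compact Hausdorff policy spaces, we have the following equality [14, Theorem 1]:

$$\min_{Q^1\gamma^1}\max_{Q^2\gamma^2}V^\mu_G(Q^1\gamma^1,Q^2\gamma^2)=\max_{Q^2\gamma^2}\min_{Q^1\gamma^1}V^\mu_G(Q^1\gamma^1,Q^2\gamma^2)$$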

This establishes a (saddle-point) equilibrium for the game.

Thus, we have obtained an existence result for the value of the games considered, and also provided an approach to topologize and convexify/compactify the policy spaces.

### 3.2 On a Partial Converse to Blackwell Ordering in the Standard Borel Setup

In addition to requiring conditions for the existence of equilibrium solutions in the infinite case, we need to address the extension of Blackwell’s ordering of information structures to the infinite case, as this will form a key aspect of the proof of the main result of this paper, Theorem 4.1.

Here, we present a partial converse to Blackwell’s theorem.

The forward direction to Blackwell’s theorem holds in the infinite case (see [40, Theorem 4.3.2]), i.e. when $\mathbb{X},\mathbb{Y}$ are standard Borel spaces for a single-player setup, one information structure being a weakly stochastically degraded version of another implies that the latter is more informative than the former over all single-player decision problems with standard Borel action spaces and bounded cost functions that are continuous in the player’s action for every state of nature.

As noted earlier, related results were presented by C. Boll in 1955 in an unpublished thesis paper [7]. Le Cam presents a summary of these results in [10], with a detailed review reported in [36]. The approach in the literature often builds on the construction of dilations of conditional probability measures, which is related to Blackwell’s comparison of experiments theorem through what is known as the Blackwell-Sherman-Stein theorem. A detailed comparative analysis is provided further below. Our main contribution here is an explicit converse compatible with the conditions on existence results presented in the previous section and a comparison to be presented in the next section. This result serves as a supporting step with a direct proof; the results reported in the literature are often very technical and the explicit implication for our setup is not evident a priori as we discuss in the next subsection.

###### Theorem 3.2

Let us consider a single player whose goal is to minimize the value of the cost function $c$ for a set of single-player decision problems. We assume the measurement is absolutely continuous in the following sense: there exists a function $f$ and a reference probability measure $\bar{Q}$ such that for all Borel $S$:

$$P(y\in S|x)=\int_S f(y,x)\bar{Q}(dy)$$

If $\mathbb{Y}$ is compact and an information structure $\mu$ is more informative than another information structure $\Phi$ over all single-player decision problems with compact standard Borel action spaces and bounded cost functions that are continuous in $u$ for every $x$, then $\Phi$ must be a garbling of $\mu$ in the sense of Definition 2.1.

Proof. We note that under the conditions of the theorem, an optimal policy (which is also deterministic) exists for every information structure (see Theorem 3.1 in [41]).

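Step (1): Let the probability distribution on $\mathbb{X}$ be fixed for any given decision problem in our set. Take information structures $\mu,\Phi$, where $\mu$ is more informative than $\Phi$ in Blackwell’s sense (i.e. over all games with bounded cost functions that are continuous in $u$).

Take the set $\{\kappa\mu:\kappa\in\mathcal{K}\}$, a subset of $\mathcal{P}(\mathbb{X}\times\mathbb{Y})$, to be the space of all possible garblings of $\mu$, where the garblings $\kappa\in\mathcal{K}$ are stochastic kernels from $\mathbb{Y}$ to $\mathbb{Y}$.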

Step (2): We now establish the weak compactness of the space of all garbled information structures. First, observe that the set of all induced garblings on the product space (involving all of $x,y,\tilde{y}$), inducing probability measures of the form

$$P_K(dx,dy,d\tilde{y}):=\mu(dx,dy)K(d\tilde{y}|y),$$

leads to a weakly pre-compact space in the space of probability measures on $\mathbb{X}\times\mathbb{Y}\times\mathbb{Y}$. If closedness can also be established, this would lead to a weakly compact space. This follows from the proof of [39, Theorem 5.6]: since the marginals on $\mathbb{X}\times\mathbb{Y}$ are fixed, any limit of a weakly converging sequence will also satisfy the property that the limit is a garbling of the original information structure. For completeness, we present the following: Let $K_n$ be a sequence of stochastic kernels and consider a weakly converging sequence $P_{K_n}$. We will show that the weak limit also admits such a garbled structure. Let $P_{K_n}$ converge weakly to $P$. Then, for every continuous and bounded $g$,

$$\int g(x,y,\tilde{y})P_{K_n}(dx,dy,d\tilde{y})=\int\left(\int g(x,y,\tilde{y})\mu(dx|y)\right)P_{K_n}(dy,d\tilde{y})$$

Since the marginal on $\mathbb{Y}$ is fixed, even though the function $\int g(x,y,\tilde{y})\mu(dx|y)$ is only measurable and bounded in $y$ and is continuous in $\tilde{y}$, $w$-$s$ convergence is equivalent to the weak convergence of $P_{K_n}(dy,d\tilde{y})$ and as a result we have that

$$\int\left(\int g(x,y,\tilde{y})\mu(dx|y)\right)P_{K_n}(dy,d\tilde{y})\to\int\left(\int g(x,y,\tilde{y})\mu(dx|y)\right)P(dy,d\tilde{y})$$

As a result, $P$ decomposes as $\mu(dx,dy)K(d\tilde{y}|y)$ for some $K$. This establishes the weak compactness of the garbled information structure in the product space $\mathbb{X}\times\mathbb{Y}\times\mathbb{Y}$.

Now, take the projection of this space onto the measures on the first and the third coordinate; as the continuous image of a weakly compact set, the projected set is also compact, and it gives us our space of garblings of $\mu$.

Finally, the space of garblings is convex, since the space of stochastic kernels is convex. As a result, the space of all possible garblings of $\mu$ is a convex and compact subset of $\mathcal{P}(\mathbb{X}\times\mathbb{Y})$ under the weak convergence topology.
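The convexity claim can be illustrated in the finite case, where stochastic kernels are row-stochastic matrices and, for a fixed prior, garbling acts on the measurement channel by matrix multiplication. The sketch below uses illustrative names and numbers. It checks that a convex combination of two kernels is again a kernel, and that garbling by the combination equals the same combination of the two garblings.

```python
def compose(channel, kernel):
    """Garbled channel: sum_y channel[x][y] * kernel[y][b]."""
    return [
        [sum(row[y] * kernel[y][b] for y in range(len(row)))
         for b in range(len(kernel[0]))]
        for row in channel
    ]


def mix(k1, k2, theta):
    """Entrywise convex combination theta * k1 + (1 - theta) * k2."""
    return [
        [theta * a + (1.0 - theta) * b for a, b in zip(r1, r2)]
        for r1, r2 in zip(k1, k2)
    ]


mu_channel = [[0.9, 0.1], [0.3, 0.7]]   # channel of the structure mu
k1 = [[1.0, 0.0], [0.0, 1.0]]           # identity kernel (no garbling)
k2 = [[0.5, 0.5], [0.5, 0.5]]           # kernel that erases the measurement
theta = 0.25

mixed = mix(k1, k2, theta)
garbled_by_mixture = compose(mu_channel, mixed)
mixture_of_garblings = mix(compose(mu_channel, k1), compose(mu_channel, k2), theta)
```

Since garbling is linear in the kernel, the two results coincide, so the set of garblings inherits convexity from the set of kernels.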

Now, assume there does not exist a stochastic kernel $\kappa$ such that:

$$\Phi=\kappa\mu$$

That is to say, we assume $\Phi$ is not a garbling of $\mu$ and proceed with a proof by contradiction. Then $\Phi$ does not belong to the set of garblings of $\mu$; that is, the singleton $\{\Phi\}$ and the set of garblings of $\mu$ are disjoint.

Step (3): We now use the Hahn-Banach Separation Theorem for Locally Convex Spaces by treating the space of probability measures as a locally convex space of measures (see [31, Theorem 3.4]). As such, since $\{\Phi\}$ and the set of garblings of $\mu$ are subsets of this space and are convex, closed and compact, in addition to being disjoint, we can separate them using a continuous linear map from $\mathcal{P}(\mathbb{X}\times\mathbb{Y})$ to $\mathbb{R}$.

To apply [31, Theorem 3.4], we require local convexity of , and so we define the locally convex space of probability measures with the following notion of convergence: We say that if for every measurable and bounded function which is continuous in for every . We note that our measures must still have fixed marginal on .

Since continuous and bounded functions separate probability measures (in the sense that, if the integrations of two measures with respect to continuous functions are equal, the measures must be equal), it follows from [31, Theorem 3.10] that we can represent every continuous linear map on using the form:

 $$F = \int f(x,y)\,\Phi(dx,dy),$$

for some measurable and bounded function continuous in for every . It also follows from [31, Theorem 3.10] that, given this notion of convergence, is a locally convex space.

Therefore, we have the following statement from combining [31, Theorem 3.4] and [31, Theorem 3.10]: there exists a measurable and bounded function $f$ (continuous in ) and constants $D_1, D_2$ with $D_1 < D_2$ such that:

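 $$\langle \Phi, f\rangle \le D_1, \qquad \langle \kappa\mu, f\rangle \ge D_2, \quad \forall \kappa \in K$$

where we use the following notation:

 $$\langle \Phi, f\rangle = \int_{X\times Y} f(x,y)\,\Phi(dx,dy)$$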

This gives us the following inequality: .

Step (4): Now consider the class of decision problems with bounded cost functions continuous in the actions, with compact , , where . This is clearly a subset of all decision problems considered so far in the proof. Now let be the separating function found above. Consider a game in this particular subclass where is the cost function (which is valid since and is bounded continuous in ). We note that gives the expected value of the game with cost function under information structure when the player plays the identity policy . We now observe the following:

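 $$\int_{X\times Y} f(x,y)\,\Phi(dx,dy) < \int_{X\times Y} f(x,y)\,\kappa\mu(dx,dy), \quad \forall \kappa \in K$$

and hence,

 $$\begin{aligned} \int_{X\times Y} f(x,y)\,\Phi(dx,dy) &\le \inf_{\kappa\in K}\int_{X\times Y} f(x,y)\,\kappa\mu(dx,dy) \\ &= \inf_{\kappa\in K}\int_{X\times Y}\int_{Y} f(x,y')\,\kappa(dy'|y)\,\mu(dx,dy) \\ &= \inf_{\kappa\in K}\int_{X\times Y} f(x,\kappa(y))\,\mu(dx,dy) \end{aligned}$$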

But, since we allow for randomized policies, and is the space of all stochastic maps from to , this minimization over is equivalent to finding the optimal policy for the cost function under information structure . And so we have:

 $$\int_{X\times Y} f(x,y)\,\Phi(dx,dy) = J(P,\Phi,\gamma^{\mathrm{id}})$$

Since we have found a game where, when playing its optimal policy, performs worse than does under some policy, we have contradicted the fact that is better than . Therefore, there must exist a such that , and so is a garbling of .

This result will allow us to use both directions of Blackwell’s ordering of information structures in the standard Borel-type setup we are considering for players in zero-sum games.
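The forward direction of the ordering is easy to observe numerically in a finite setting. The following is a minimal sketch, assuming binary state, measurement and action spaces and a 0-1 cost; the function names `garble` and `optimal_cost` and all numbers are hypothetical. It computes the optimal expected cost under an information structure and under one particular garbling of it. Deterministic policies suffice here because the expected cost is linear in the policy.

```python
from fractions import Fraction as F

def garble(joint, kappa):
    """Joint law of (x, z) when z is drawn from kappa(. | y)."""
    return [[sum(row[y] * kappa[y][z] for y in range(len(kappa)))
             for z in range(len(kappa[0]))] for row in joint]

def optimal_cost(joint, cost):
    """Minimal expected cost over policies; the minimization separates over
    measurements, so each measurement picks its own best action."""
    n_x, n_y, n_u = len(joint), len(joint[0]), len(cost[0])
    return sum(min(sum(joint[x][y] * cost[x][u] for x in range(n_x))
                   for u in range(n_u)) for y in range(n_y))

# Illustrative joint law mu(x, y), garbling kernel kappa(z | y), cost c(x, u).
mu = [[F(3, 10), F(1, 5)], [F(1, 10), F(2, 5)]]
kappa = [[F(3, 4), F(1, 4)], [F(1, 4), F(3, 4)]]
cost = [[F(0), F(1)], [F(1), F(0)]]   # 1 if the action misses the state

v_mu = optimal_cost(mu, cost)                       # Fraction(3, 10)
v_garbled = optimal_cost(garble(mu, kappa), cost)   # Fraction(2, 5)
```

In this instance the garbled structure yields a strictly larger optimal cost. The sketch illustrates only one cost function and one kernel; it is not a substitute for the separation argument above, which handles the converse direction.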

Dilations as Measures for Comparisons of Experiments and Strassen’s Theorem. Strassen, in [35, Theorem 2], presents a related result that is often invoked when comparison of experiments is studied in infinite-dimensional probability spaces, although its direct implication for Blackwell’s ordering (in the sense needed in our main result, presented in the next section) is not explicit, as we note in the following. Likewise, [28] relates an ordering of information structures in terms of dilations (where the hidden variable does not appear explicitly in the analysis). A very concise yet informative review is in [10, p. 130-131] and a more comprehensive review is in [36].

A detailed discussion of comparisons of information structures following the same approach can be found in the comprehensive book [36]. Both for completeness and to compare the findings, we present a discussion in this subsection.

Let be a convex compact metrizable subset of a locally convex topological vector space. For Borel probability measures and write if and only if for all

 $$\int y\,d\mu \ge \int y\,d\nu.$$
###### Theorem 3.3

[35, Theorem 2] if and only if there is a dilation P such that , where a dilation is a Markov kernel from to such that for all continuous affine functions on , .

It is not immediate whether this theorem leads to a converse to Blackwell’s theorem in the general space setup that we have considered in the previous section. Let us discuss the steps in the following: Let be the space . Let be an information structure that is more informative than another information structure in Blackwell’s sense. Let us restrict ourselves to decision problems where is compact. Let and be the measurement channels for the player under information structures and , respectively. By definition, we have for all measurable and bounded cost functions continuous in the actions:

 $$\inf_{\gamma\in\Gamma}\int P(dx)\,Q_{\mu}(dy|x)\,c(x,\gamma(y)) \le \inf_{\eta\in\Gamma}\int P(dx)\,Q_{\Phi}(dy|x)\,c(x,\eta(y))$$

We can rewrite this as:

 $$\int P_{\mu}(dy)\left(\inf_{u\in U}\int P_{\mu}(dx|y)\,c(x,u)\right) \le \int P_{\Phi}(dy)\left(\inf_{u\in U}\int P_{\Phi}(dx|y)\,c(x,u)\right)$$

where and are probability distributions on induced by the respective information structures.
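The passage from the policy form to the posterior form of the expected cost can be verified by brute force on a small example. The sketch below assumes binary state, measurement and action spaces with made-up values for the prior `P`, the channel `Q` and the cost `c`. It enumerates every deterministic policy for the left form and compares the result with the average, over measurements, of the best action against the posterior. In the finite case randomized policies cannot do better, since the cost is linear in the policy.

```python
from fractions import Fraction as F
from itertools import product

P = [F(1, 2), F(1, 2)]                         # illustrative prior P(x)
Q = [[F(3, 5), F(2, 5)], [F(1, 5), F(4, 5)]]   # illustrative channel Q(y | x)
c = [[F(0), F(2)], [F(3), F(0)]]               # illustrative cost c(x, u)
X, Y, U = range(2), range(2), range(2)

# Policy form: infimum over policies gamma: Y -> U, enumerated exhaustively.
policy_form = min(
    sum(P[x] * Q[x][y] * c[x][gamma[y]] for x in X for y in Y)
    for gamma in product(U, repeat=len(Y)))

# Posterior form: average over y of the best action against P(x | y).
marg_y = [sum(P[x] * Q[x][y] for x in X) for y in Y]
post = [[P[x] * Q[x][y] / marg_y[y] for x in X] for y in Y]
posterior_form = sum(
    marg_y[y] * min(sum(post[y][x] * c[x][u] for x in X) for u in U)
    for y in Y)
```

Both quantities equal 7/10 for these values, which reflects the interchange of the infimum over policies with the integral over measurements.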

Let . Then we can rewrite this once again as:

 $$\int P_{\mu}(d\pi)\,W^{*}(\pi) \ge \int P_{\Phi}(d\pi)\,W^{*}(\pi)$$

Since and give probability distributions on , and is a function over , we will have in Strassen’s sense if the above inequality holds for all continuous and concave functions over .

We can show that is continuous and concave in : Let weakly. Let be optimal for . Then:

 $$\left|\int c(x,u_n^{*})\,\pi_n(dx) - \int c(x,u^{*})\,\pi(dx)\right| \le \max\left(\int c(x,u_n^{*})\,(\pi_n(dx)-\pi(dx)),\ \int c(x,u^{*})\,(\pi_n(dx)-\pi(dx))\right)$$

We note that goes to weakly following an argument in [34, Theorem 3.5] or [21, Theorem 3.5], and since weakly converges to , the first term converges to zero. The second term converges to zero by the weak convergence of to .

Therefore, we have continuity of . Concavity of in the conditional measure follows from Proposition 2.1. Now, if one can show that by using all bounded continuous (and, if needed, measurable only in the actions, as studied earlier) cost functions and compact action spaces , the space of all continuous and concave functions on is spanned by the space of all functions, then a converse can be obtained through Strassen’s result with some additional work. However, this is not immediate. This discussion motivated our self-contained analysis presented in the previous section.
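The concavity property can be illustrated in a finite setting. The sketch below assumes that the function in question is the optimal expected cost against a belief, that is, the minimum over actions of the expected cost under that belief, which matches the inner infimum in the posterior form above; this reading, the name `w_star`, and the cost values are assumptions made for illustration. It checks the concavity inequality along a segment between two point-mass beliefs.

```python
from fractions import Fraction as F

c = [[F(0), F(2)], [F(3), F(0)]]  # illustrative cost c(x, u)

def w_star(pi):
    """Best expected cost against belief pi: min_u sum_x pi(x) c(x, u)."""
    return min(sum(pi[x] * c[x][u] for x in range(len(pi)))
               for u in range(len(c[0])))

pi_1, pi_2 = [F(1), F(0)], [F(0), F(1)]   # two point-mass beliefs
gaps = []
for k in range(11):
    lam = F(k, 10)
    mixed = [lam * a + (1 - lam) * b for a, b in zip(pi_1, pi_2)]
    # Concavity gap: value at the mixture minus the mixture of the values.
    gaps.append(w_star(mixed) - (lam * w_star(pi_1) + (1 - lam) * w_star(pi_2)))
```

Every entry of `gaps` is nonnegative, as expected for a pointwise minimum of functions that are linear in the belief.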

###### Remark 3.1

We note that the problem above of the span of functions has a natural interpretation in the context of games. We observe that is the expected cost for the player given that the player’s observation is . Thus, we can re-frame our above question as follows: given some function , an information structure , and state/action spaces and , does there exist some continuous and bounded cost function such that ? If this is true for any continuous and concave with arbitrary , , and , then the relationship between Strassen’s result and the converse to Blackwell becomes clear.

The answer to this general problem, however, is negative: Consider a game with , , an arbitrary distribution on , and an information structure such that the player’s measurement is given by for all , where is a uniformly distributed random variable on ; i.e., the player only measures noise. In this case, the player’s optimal strategy for any game will be independent of his measurement, since the measurement is completely irrelevant to the cost. Therefore, the expected cost will be constant for all for any cost function and action space, and thus cannot span all continuous and concave .

Thus, we conclude that Strassen’s theorem does not imply a converse to Blackwell’s ordering of information structures in standard Borel spaces.
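The noise-only counterexample can be reproduced in a discretized form. The sketch below assumes a two-point state space and replaces the uniformly distributed measurement with a uniform draw from four values; the prior, the cost and the variable names are all illustrative. Because the measurement is independent of the state, every posterior coincides with the prior, and the conditional optimal cost takes the same value for every measurement.

```python
from fractions import Fraction as F

P = [F(1, 3), F(2, 3)]             # illustrative prior on the state x
noise = [F(1, 4)] * 4              # law of the measurement y, the same for every x
c = [[F(0), F(5)], [F(1), F(0)]]   # illustrative cost c(x, u)
X, Y, U = range(2), range(4), range(2)

joint = [[P[x] * noise[y] for y in Y] for x in X]
marg_y = [sum(joint[x][y] for x in X) for y in Y]
posteriors = [[joint[x][y] / marg_y[y] for x in X] for y in Y]

# Optimal conditional expected cost for each measurement value y.
cond_costs = [min(sum(post[x] * c[x][u] for x in X) for u in U)
              for post in posteriors]
```

Changing the cost matrix changes the common value in `cond_costs` but never makes it depend on the measurement, which is the obstruction described in the remark.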

## 4 Comparison of Information Structures for Zero-Sum Standard Borel Bayesian Games

We are now prepared to order information structures in the spirit of Theorem 2.2 for this standard Borel setup. We note that the following lemmas, theorem, and corollary also hold in the general finite case studied by Pęski, as they rely solely on the existence of equilibria (which are guaranteed to exist in the finite setup by von Neumann’s min-max theorem, see [37]) and Blackwell’s ordering of information structures. Therefore, these results also serve as a strict generalization of Theorem 2.2 to standard Borel Bayesian Games. We note here that the required absolute continuity conditions always hold for finite or countable spaces (in that one can always find a reference measure with respect to which all probability measures on a countable space are absolutely continuous).

###### Definition 4.1

For fixed with , and fixed and compact , we define a class of games to be all games for which the players have compact action spaces and the cost function is bounded and continuous in players’ actions for every state .

###### Lemma 4.1

Given fixed and fixed and compact , for any information structure which satisfies Assumption 2.1 and any kernels :

 $$\kappa_{\max}\mu \lesssim \mu \quad \text{and} \quad \mu \lesssim \kappa_{\min}\mu$$

over all games in .

Proof. Let us consider the first relation.

Take an arbitrary zero-sum game with cost function and action spaces and . Let be the Bayesian Nash equilibrium policies for the players under information structure and be the Bayesian Nash equilibrium policies under information structure . By our assumption on , these policies exist [Theorem 3.1]. Let be the measurement channel for player under information structure .

The expected value of the cost for the maximizer under the first information structure is:

 $$V^{\kappa_{\max}\mu}_{G}(\gamma^1,\gamma^2) = \int_{X\times Y^1\times Y^2} c(x,\gamma^1(y^1),\gamma^2(y^2))\,Q^1(dy^1|x)\,\kappa_{\max}Q^2(dy^2|x)\,\zeta(dx)$$

By definition, the equilibrium solution for under is given by the solution to the min-max problem:

 minθ1∈Γ1maxθ2∈Γ2Vκ