Fast Algorithms for Rank-1 Bimatrix Games

Bharat Adsul et al.

The rank of a bimatrix game is the matrix rank of the sum of the two payoff matrices. For a game of rank k, the set of its Nash equilibria is the intersection of a generically one-dimensional set of equilibria of parameterized games of rank k-1 with a hyperplane. We comprehensively analyze games of rank one. They are economically more interesting than zero-sum games (which have rank zero), but are nearly as easy to solve. One equilibrium of a rank-1 game can be found in polynomial time. All equilibria of a rank-1 game can be found by path-following, which for a general bimatrix game finds only one equilibrium. The number of equilibria of a rank-1 game may be exponential, but is polynomial in expectation when payoffs are slightly perturbed. We also present a new rank-preserving homeomorphism between bimatrix games and their equilibrium correspondence.


1 Introduction

Non-cooperative games are basic economic models, with Nash equilibrium as the central solution concept. In order to be useful as a solution, a Nash equilibrium must be found by some method (including any adjustment process). For larger games this requires computer algorithms. We consider bimatrix games (two-player games in strategic form), for which one equilibrium can be found by the algorithm of Lemke and Howson (1964). Finding all equilibria is feasible only for a few dozen strategies per player, due to the exponential number of possible supports of mixed equilibrium strategies (Avis et al., 2010).

This paper is an in-depth study of rank-1 games, introduced in Kannan and Theobald (2010), where the sum of the two payoff matrices has matrix rank one. They generalize zero-sum games, where that matrix rank is zero. Rank-1 games are economically more interesting than zero-sum games, by allowing a "multiplicative" interaction in addition to an arbitrary zero-sum component (discussed further in Section 9). Like general bimatrix games, they can have many disjoint equilibria. On the other hand, as we show, they are computationally tractable: one equilibrium of a rank-1 game can be found quickly (in polynomial time), and finding all equilibria takes time comparable to finding a single equilibrium of a general bimatrix game. Large rank-1 games are therefore attractive as detailed models of interaction, on a similar scale to, but more general than, zero-sum games or linear programs. Rank-1 bimatrix games and their computational analysis should thus become a new tool in economic modeling.

The computational complexity (required running time) of computing a Nash equilibrium of a game has received substantial interest in the last two decades. A computational problem is considered tractable if it can be solved in polynomial time. Savani and von Stengel (2006) showed that the algorithm by Lemke and Howson (1964) may have exponential running time. Such examples may be rare, comparable to the exponential worst-case running time of the simplex algorithm for linear programming (Klee and Minty, 1972), which nevertheless works well in practice. The path-following Lemke–Howson algorithm implies that finding an equilibrium of a bimatrix game belongs to the complexity class PPAD defined by Papadimitriou (1994). PPAD describes certain computational problems where the existence of a solution is known, and the problem is to find one explicit solution. (In contrast, the better-known complexity class NP applies to "decision problems" which have a "yes" or "no" answer.) Other problems in PPAD include the computation of an approximate Brouwer fixed point, and related problems in economics such as market equilibria (Vazirani and Yannakakis, 2011), including the computation of an approximate Nash equilibrium of a game with many players. (In games with three or more players, unlike in two-player games, the mixed strategy probabilities in a Nash equilibrium may be irrational numbers. A suitable concept for such games is therefore approximate Nash equilibrium; finding an exact Nash equilibrium is an even harder computational problem; see Etessami and Yannakakis, 2010.)

A celebrated result is that all problems in PPAD can be reduced to finding a Nash equilibrium in a bimatrix game, which makes this problem "PPAD-complete" (Chen and Deng, 2006; Chen et al., 2009; Daskalakis et al., 2009). No polynomial-time algorithm for finding a Nash equilibrium of a general bimatrix game is known.

Kannan and Theobald (2010) introduced a hierarchy of bimatrix games based on their rank, defined as the matrix rank of the sum of the two payoff matrices, and described ways to find approximate Nash equilibria of games of fixed rank (with running time exponential in the number of digits of the desired accuracy). In the present paper, we prove that an exact Nash equilibrium of a rank-1 game can be found in polynomial time, even though (another new result) a rank-1 game may have exponentially many equilibria. Moreover, as has been proved separately (Mehta, 2014; Chen and Paparas, 2017), games of rank 3 or higher are PPAD-hard and thus as computationally difficult as general bimatrix games. In the context of the "rank" hierarchy, rank-1 games are therefore the most complex type of games currently known to be computationally tractable.

Section 2 states the notation and preliminary results used in this paper. Its main observation (Theorem 6) shows that the set of equilibria of a game of rank k is the intersection of a hyperplane with a set of equilibria of parameterized games of rank k-1. When k = 1, these are parameterized zero-sum games whose equilibria are the solutions to a parameterized linear program (LP), for which we recall relevant results from Adler and Monteiro (1992) in Section 3. The intersection with the hyperplane gives rise to the polynomial-time binary search for one equilibrium of a rank-1 game, explained in Section 4. In Section 5, we describe completely the set of all Nash equilibria of a rank-1 game. (The enumeration method in Theobald (2009) also considers a parameterized LP, but does not provide or exploit our structural insights.) Section 6 describes an example (which may be useful to consult along the way) that illustrates our main results, and a second example that shows that binary search fails in general for games of rank 2 or higher. A construction of rank-1 games with exponentially many equilibria is shown in Section 7. In Section 8, we describe a variant of the structure theorem of Kohlberg and Mertens (1986) with a new homeomorphism between the space of bimatrix games and its equilibrium correspondence that preserves the sum of the payoff matrices, and hence the rank of the games. In the concluding Section 9, we present a tentative example of an economic model based on rank-1 games, and note some open questions.

A preliminary version of our work appeared in Adsul et al. (2011), and the result of Section 7 in von Stengel (2012). The mathematical development in the present paper is almost entirely new in all parts.

2 Bimatrix games and rank reduction

In this section we state our notation for bimatrix games and recall the "complementarity" characterization of Nash equilibria. For games of rank k (see Definition 3), our central observation is Theorem 6, which states that their set of Nash equilibria is the intersection of a set of equilibria of parameterized games of rank k-1 with a suitable hyperplane. In subsequent sections, we show how to exploit this property algorithmically when k = 1.

We use the following notation. Let 0 and 1 denote vectors with all components equal to 0 and 1, respectively, their dimension depending on the context. The transpose of a matrix B is written B^T. All vectors are column vectors, so if x is a vector with m components, then x is an m × 1 matrix and x^T is the corresponding row vector. In matrix products, scalars are treated like 1 × 1 matrices. Inequalities like x ≥ 0 between vectors hold for all components. The components of a vector x are x_1, x_2, ....

For a vector a and a scalar b, a hyperplane is a set of the form {z | a^T z = b}, and a halfspace a set of the form {z | a^T z ≤ b}. A polyhedron is an intersection of finitely many halfspaces, and is called a polytope if it is bounded. A face of a polyhedron P is a set of the form {z ∈ P | a^T z = b} where a^T z ≤ b holds for all z ∈ P. It can be shown that any face of P can be obtained by turning some of the inequalities that define P into equalities (Schrijver, 1986, Section 8.3). If a face of P consists of a single point, it is called a vertex of P. If X is a subset of a Cartesian product S × T, then the set {s | (s, t) ∈ X for some t ∈ T} is called the projection of X on S.

A bimatrix game (A, B) is a pair of m × n matrices A and B, where player 1 chooses a row i and player 2 simultaneously chooses a column j, with the corresponding entry a_ij of A as payoff to player 1 and b_ij of B as payoff to player 2. The sets X and Y of mixed strategies of player 1 and player 2 are given by

X = {x ∈ ℝ^m | x ≥ 0, 1^T x = 1},   Y = {y ∈ ℝ^n | y ≥ 0, 1^T y = 1}.   (1)

For the mixed strategy pair (x, y), the expected payoffs to the two players are x^T A y and x^T B y, respectively. A best response of player 1 against y maximizes his expected payoff x^T A y, and a best response of player 2 against x maximizes her expected payoff x^T B y. A Nash equilibrium (NE) is a pair (x, y) of mutual best responses. The following well-known characterization (Nash, 1951) states that x is a best response to y if and only if every pure strategy of player 1 that is played with positive probability gives maximum expected payoff to player 1, and similarly for player 2.

Lemma 1.

Let (A, B) be an m × n bimatrix game. Consider the polyhedra

P = {(x, v) ∈ ℝ^m × ℝ | x ∈ X, B^T x ≤ 1v},   Q = {(y, u) ∈ ℝ^n × ℝ | y ∈ Y, Ay ≤ 1u}.   (2)

Let x ∈ X and y ∈ Y. Then x is a best response to y if and only if (y, u) ∈ Q for some u ∈ ℝ and for all rows i

x_i > 0  ⇒  (Ay)_i = u,   (3)

and y is a best response to x if and only if (x, v) ∈ P for some v ∈ ℝ and for all columns j

y_j > 0  ⇒  (B^T x)_j = v.   (4)

If both conditions hold, then u and v are the unique payoffs to player 1 and 2 in the Nash equilibrium (x, y).
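
As a concrete illustration of Lemma 1, the following small Python sketch tests the support/best-response conditions numerically for given mixed strategies. The function name, the tolerance, and the example game are our own choices, not part of the paper.

    import numpy as np

    def is_nash_equilibrium(A, B, x, y, tol=1e-9):
        """Numerical test of the best-response conditions (3) and (4) of Lemma 1."""
        A, B, x, y = map(np.asarray, (A, B, x, y))
        u = float(np.max(A @ y))    # largest payoff player 1 can get against y
        v = float(np.max(B.T @ x))  # largest payoff player 2 can get against x
        # every pure strategy in the support must attain the maximum payoff
        best_for_1 = np.all((x > tol) <= (A @ y >= u - tol))
        best_for_2 = np.all((y > tol) <= (B.T @ x >= v - tol))
        return bool(best_for_1 and best_for_2)

    # Example: matching pennies, a zero-sum (hence rank-0) game.
    A = np.array([[1.0, -1.0], [-1.0, 1.0]])
    B = -A
    print(is_nash_equilibrium(A, B, [0.5, 0.5], [0.5, 0.5]))  # True
    print(is_nash_equilibrium(A, B, [1.0, 0.0], [0.5, 0.5]))  # False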

The following lemma states the well-known fact that the equilibria of a bimatrix game are unchanged when subtracting a separate constant from each column  of the row player’s payoff matrix.

Lemma 2.

Call two games strategically equivalent if they have the same Nash equilibria. If γ ∈ ℝ^n, then the game (A, B) is strategically equivalent to the game (A − 1γ^T, B).

Proof.

This holds by Lemma 1, because the equilibrium payoff u to player 1 in the game (A, B) changes to u − γ^T y in (A − 1γ^T, B): clearly, (Ay)_i = u is equivalent to ((A − 1γ^T)y)_i = u − γ^T y, and Ay ≤ 1u is equivalent to (A − 1γ^T)y ≤ 1(u − γ^T y).
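
A quick numerical check of Lemma 2 (a sketch that assumes the is_nash_equilibrium helper from the previous example): shifting every entry of column j of player 1's matrix by the same constant gamma_j leaves the best-response conditions, and hence the equilibria, unchanged.

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.integers(-3, 4, size=(3, 4)).astype(float)
    B = rng.integers(-3, 4, size=(3, 4)).astype(float)
    gamma = rng.normal(size=4)
    A_shifted = A + np.ones((3, 1)) @ gamma.reshape(1, 4)   # A + 1*gamma^T

    x = np.array([0.2, 0.5, 0.3])        # arbitrary mixed strategies,
    y = np.array([0.1, 0.4, 0.3, 0.2])   # not necessarily an equilibrium
    print(is_nash_equilibrium(A, B, x, y) ==
          is_nash_equilibrium(A_shifted, B, x, y))   # True for any (x, y)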

The support of a mixed strategy is the set of pure strategies that are played with positive probability. A bimatrix game is degenerate if there is a mixed strategy that has more pure best responses than the size of its support (von Stengel, 2002). Among all games, degenerate games form a set of measure zero, so a “generic” game is nondegenerate. Most of our results hold for general games that may be degenerate.

The objects of study in this paper are bimatrix games of fixed rank, introduced by Kannan and Theobald (2010), which generalize zero-sum games, the games of rank zero.

Definition 3.

The rank of a bimatrix game (A, B) is the matrix rank of A + B.

Lemma 4.

An m × n bimatrix game of positive rank k can be written as (A, C + cd^T) for suitable c ∈ ℝ^m, d ∈ ℝ^n, and a game (A, C) of rank k − 1.

Proof.

This is due to the well-known result that an m × n matrix is of rank at most k if and only if it can be written as the sum of k rank-1 matrices, that is, as c_1 d_1^T + ... + c_k d_k^T for suitable c_i ∈ ℝ^m and d_i ∈ ℝ^n for 1 ≤ i ≤ k. This is easily seen by writing the jth column of the matrix as a linear combination of k vectors c_1, ..., c_k that span its column space, and letting the coefficients of that combination be the jth components of d_1, ..., d_k (see also Wardlaw (2005)).
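
In code, the rank of a game (Definition 3) and a rank-1 decomposition A + B = c d^T as in Lemma 4 can be obtained from the singular value decomposition. This is a small sketch with names of our choosing, not an interface from the paper.

    import numpy as np

    def game_rank(A, B, tol=1e-9):
        """Rank of the bimatrix game (A, B) in the sense of Definition 3."""
        return int(np.linalg.matrix_rank(np.asarray(A) + np.asarray(B), tol=tol))

    def rank1_decomposition(A, B, tol=1e-9):
        """For a rank-1 game, return (c, d) with A + B = c d^T, via the SVD."""
        C = np.asarray(A) + np.asarray(B)
        U, s, Vt = np.linalg.svd(C)
        if np.sum(s > tol) > 1:
            raise ValueError("the game has rank greater than 1")
        return U[:, 0] * s[0], Vt[0, :]

    # Example: a 2x2 rank-1 game built from a known decomposition.
    A = np.array([[2.0, 0.0], [1.0, 3.0]])
    c0, d0 = np.array([1.0, 2.0]), np.array([1.0, 3.0])
    B = -A + np.outer(c0, d0)
    print(game_rank(A, B))                        # 1
    c, d = rank1_decomposition(A, B)
    print(np.allclose(np.outer(c, d), A + B))     # True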

The following is a simple but central observation.

Lemma 5.

Let A and B be m × n matrices, c ∈ ℝ^m, d ∈ ℝ^n, x ∈ X, y ∈ Y, and λ ∈ ℝ. The following are equivalent:

(a) (x, y) is an equilibrium of (A, B + cd^T),

(b) (x, y) is an equilibrium of (A, B + 1λd^T) and λ = c^T x,

(c) (x, y) is an equilibrium of (A − 1λd^T, B + 1λd^T) and λ = c^T x.

Proof.

The equivalence of (a) and (b) holds because the players get the same expected payoffs for their pure strategies in both games: this is immediate for player 1, and if λ = c^T x, then the column payoffs to player 2 are given by

(B + cd^T)^T x = B^T x + d(c^T x) = B^T x + dλ = (B + 1λd^T)^T x.   (5)

The games in (b) and (c) are strategically equivalent by Lemma 2.
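
The identity (5) behind the equivalence of (a) and (b) is easy to verify numerically. The following sketch assumes the parameterization as reconstructed above (in particular the value λ = c^T x); all names are ours.

    import numpy as np

    rng = np.random.default_rng(1)
    m, n = 3, 4
    A, B = rng.normal(size=(m, n)), rng.normal(size=(m, n))
    c, d = rng.normal(size=m), rng.normal(size=n)

    x = rng.random(m)
    x /= x.sum()                              # some mixed strategy of player 1
    lam = float(c @ x)                        # lambda = c^T x
    cols_a = (B + np.outer(c, d)).T @ x                    # column payoffs, game (a)
    cols_b = (B + lam * np.outer(np.ones(m), d)).T @ x     # column payoffs, game (b)
    print(np.allclose(cols_a, cols_b))        # True: player 2 sees the same payoffs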

Consider a game (A, B) of positive rank k, where B = C + cd^T so that (A, C) is a game of rank k − 1 according to Lemma 4. Then the game (A − 1λd^T, C + 1λd^T) in Lemma 5(c) has the same sum A + C of its payoff matrices and hence also rank k − 1, for any choice of the parameter λ. Let E be the set of Nash equilibria (x, y), together with λ, of these parameterized games,

E = {(x, y, λ) ∈ X × Y × ℝ | (x, y) is an equilibrium of (A − 1λd^T, C + 1λd^T)},   (6)

where by Lemma 5(b)

E = {(x, y, λ) ∈ X × Y × ℝ | (x, y) is an equilibrium of (A, C + 1λd^T)}.   (7)

These considerations imply the following main result of this section.

Theorem 6.

Given a bimatrix game (A, C + cd^T), its set of Nash equilibria is exactly the projection on X × Y of the intersection of E and the hyperplane defined by

λ = c^T x.   (8)

Theorem 6 asserts that for any rank-k game of the form (A, C + cd^T), every Nash equilibrium of the game is captured by the set E in (6) of equilibria of games of rank k − 1, parameterized by λ, intersected with the hyperplane in (8). Can this rank reduction be leveraged to obtain an efficient algorithm for finding a Nash equilibrium of a game of arbitrary constant rank? As will be discussed in Section 6, this does not work in general. However, it does work for rank-1 games.
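
For a rank-1 game the rank reduction produces zero-sum games: with the parameterized games written as in our reconstruction of Lemma 5(c), the sum of the two payoff matrices vanishes for every λ. A minimal sketch (names ours):

    import numpy as np

    rng = np.random.default_rng(2)
    m, n = 3, 3
    A = rng.normal(size=(m, n))
    c, d = rng.normal(size=m), rng.normal(size=n)
    B = -A + np.outer(c, d)                       # a rank-1 game (A, B)

    for lam in (-1.0, 0.0, 2.5):
        A_lam = A - lam * np.outer(np.ones(m), d)
        B_lam = -A + lam * np.outer(np.ones(m), d)
        # the parameterized game has rank 0, i.e. it is zero-sum
        print(lam, np.linalg.matrix_rank(A_lam + B_lam))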

3 Parameterized linear programs

Our aim is to describe the equilibria of rank-1 games using the rank reduction of the previous section. For this, we consider the set E in (7) for C = −A,

E = {(x, y, λ) ∈ X × Y × ℝ | (x, y) is an equilibrium of (A, −A + 1λd^T)}.   (9)

By Lemma 2, this is the set of equilibria of the zero-sum games (A − 1λd^T, −A + 1λd^T), parameterized by λ. These correspond to the solutions of a parameterized linear program (LP). In this section, we review the structure of such parameterized LPs with a particular view towards nongeneric cases and polynomial-time algorithms, as studied by Adler and Monteiro (1992). In essence, such a parameterized LP has finitely many special values of the parameter λ, called breakpoints. These separate the set E into a connected sequence of polyhedral segments (which generically are line segments). They are described in Theorem 16 in the next section, where we will present a polynomial-time algorithm for finding one equilibrium of a rank-1 game. In the subsequent section we present another algorithm for finding all equilibria.

We assume familiarity with notions of linear programming such as LP duality and complementary slackness; see, for example, Schrijver (1986). The following well-known lemma (Dantzig, 1963, p. 286) states that the equilibria of a zero-sum game are the primal and dual solutions to an LP.

Lemma 7.

Consider an m × n zero-sum game (A, −A). In any equilibrium (x, y) of this game, y is a minmax strategy of player 2, which is a solution to the LP with variables y in ℝ^n and u in ℝ:

minimize u subject to Ay ≤ 1u, y ≥ 0, 1^T y = 1,   (10)

and x is a maxmin strategy of player 1, which is a solution to the dual LP of (10). For the optimal value u of (10), the maxmin payoff to player 1 and minmax cost to player 2, and hence the value of the game, is u.

Proof.

The dual LP to (10) has variables x ∈ ℝ^m and v ∈ ℝ and states

maximize v subject to A^T x ≥ 1v, x ≥ 0, 1^T x = 1.   (11)

Both LPs are feasible (with sufficiently small v and sufficiently large u). Let (y, u) be an optimal solution to (10) and (x, v) an optimal solution to (11). Then u = v by LP duality, and (10) and (11) state Ay ≤ 1u, that is, player 2 pays no more than u in any row, and A^T x ≥ 1v, that is, player 1 gets at least v in every column, where u = v, which is therefore the value of the game, and (x, y) is a Nash equilibrium.
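
The LPs (10) and (11) can be solved with any LP solver. Below is a sketch using scipy.optimize.linprog; the function name and the layout of variables are our own, and we simply solve the primal and the dual LP separately rather than extracting dual values from the solver.

    import numpy as np
    from scipy.optimize import linprog

    def solve_zero_sum(A):
        """Equilibrium (x, y) and value u of the zero-sum game (A, -A) via
        Lemma 7: y solves min u s.t. Ay <= 1u, y in Y, and x solves the dual
        maxmin LP max v s.t. A^T x >= 1v, x in X."""
        A = np.asarray(A, dtype=float)
        m, n = A.shape
        # minmax LP for player 2, variables (y, u)
        res_y = linprog(c=np.r_[np.zeros(n), 1.0],
                        A_ub=np.c_[A, -np.ones(m)], b_ub=np.zeros(m),
                        A_eq=np.r_[np.ones(n), 0.0].reshape(1, -1), b_eq=[1.0],
                        bounds=[(0, None)] * n + [(None, None)])
        # maxmin LP for player 1, variables (x, v); linprog minimizes, so use -v
        res_x = linprog(c=np.r_[np.zeros(m), -1.0],
                        A_ub=np.c_[-A.T, np.ones(n)], b_ub=np.zeros(n),
                        A_eq=np.r_[np.ones(m), 0.0].reshape(1, -1), b_eq=[1.0],
                        bounds=[(0, None)] * m + [(None, None)])
        y, u = res_y.x[:n], res_y.x[n]
        x = res_x.x[:m]
        return x, y, float(u)

    A = np.array([[1.0, -1.0], [-1.0, 1.0]])             # matching pennies
    x, y, u = solve_zero_sum(A)
    print(np.round(x, 3), np.round(y, 3), round(u, 3))   # [0.5 0.5] [0.5 0.5] 0.0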

Applied to the zero-sum game (A − 1λd^T, −A + 1λd^T), the LP (10) says:

minimize u subject to (A − 1λd^T)y ≤ 1u, y ≥ 0, 1^T y = 1.   (12)

The substitution gives the equivalent LP

(13)

Throughout, we stay close to the notation in Adler and Monteiro (1992). We write the LP (13) as

(14)

with the polyhedron

(15)

The LP has a parameterized objective function over the fixed polyhedron . It is the dual of the following LP which has a fixed objective function with a parameterized right hand side, where we use slack variables to express the inequality as an equality:

(16)

For optimal solutions to these two LPs, the optimal objective values coincide by LP duality. The next lemma states how the optimal solutions yield the players' payoffs in the games of Lemma 5(a) and (b).

Lemma 8.

Let . Then is an equilibrium of the game if and only if is an optimal solution to in for some which is uniquely determined by , and is an optimal solution to in for some and which are uniquely determined by and . The equilibrium payoffs are to player 1 and to player 2. If , these are also the payoffs in the game , and is an equilibrium of that game.

Proof.

By Lemma 5 with , the game has the same equilibria and, by (5), payoffs as the game if . Consider any optimal solutions to and to . Then states for each row  of the inequality . Complementary slackness, equivalent to LP optimality, states that whenever . This is the equilibrium condition in (3) that states that is a best response to . Because it holds for at least one , it uniquely determines , which is the equilibrium payoff to player 1 in the above games.

Similarly, the constraint in (16) means that is determined by , and states for all , or equivalently . Complementary slackness, equivalent to LP optimality, states that this inequality is tight whenever . This is the condition (4) that states that is a best response to in the game , and it uniquely determines as the equilibrium payoff to player 2.     
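
In the spirit of Lemma 8, the following sketch computes, for a given λ, an equilibrium of the parameterized game and the two players' payoffs. It assumes the solve_zero_sum helper from the sketch after Lemma 7 and the parameterization reconstructed in Section 2; it is not the LP formulation used in the paper.

    import numpy as np

    def equilibrium_at_parameter(A, d, lam):
        """Equilibrium of the parameterized game at lam, solved as the
        zero-sum game (A - 1*lam*d^T, -(A - 1*lam*d^T))."""
        m = A.shape[0]
        A_lam = A - lam * np.outer(np.ones(m), d)
        x, y, _ = solve_zero_sum(A_lam)     # helper from the Lemma 7 sketch
        u = float(x @ A @ y)                # payoff to player 1 in the game of Lemma 5(b)
        w = float(x @ (-A + lam * np.outer(np.ones(m), d)) @ y)   # payoff to player 2
        return x, y, u, w

    A = np.array([[2.0, 0.0], [1.0, 3.0]])
    d = np.array([1.0, 3.0])
    print(equilibrium_at_parameter(A, d, 0.5))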

Primal-dual pairs of LPs with a parameter have been studied since Gass and Saaty (1955). The next result is well known; we prove it following Jansen et al. (1997).

Lemma 9.

For , let be the optimum value of  and hence of . Then is the pointwise maximum of a finite number of affine functions on and therefore piecewise linear and convex.

Proof.

The optimum of exists for any and is taken at a vertex of the polyhedron  in (15). Let be the set of vertices of , which is finite. Hence

(17)

where for each of the finitely many in the function is affine. Hence is the pointwise maximum of a finite number of affine functions as claimed. The epigraph of given by is the intersection of the convex epigraphs of these affine functions, so is convex and is a convex function.     

By (17), the optimal-value function is the "upper envelope" of the affine functions defined by the vertices of the feasible polyhedron. A breakpoint is any parameter value at which this function has different left and right derivatives, defined as the limits when the parameter approaches that value from below or from above.
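
The piecewise linear behaviour described in Lemma 9 is easy to observe numerically by sampling the value of the parameterized zero-sum game over a grid of parameter values. This sketch assumes the solve_zero_sum helper from the Lemma 7 example; whether the sampled function appears convex or concave depends on whether the parametric LP is written as a minimization or a maximization, which the extraction above does not preserve.

    import numpy as np

    A = np.array([[2.0, 0.0], [1.0, 3.0]])
    d = np.array([1.0, 3.0])
    grid = np.linspace(-2.0, 2.0, 9)
    vals = []
    for lam in grid:
        A_lam = A - lam * np.outer(np.ones(2), d)
        _, _, u = solve_zero_sum(A_lam)       # value of the game at this lam
        vals.append(u)
    print(np.round(vals, 3))    # kinks in this sequence indicate breakpoints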

For any LP , say, let be the face of the domain of where its optimum is attained. For any we denote by , that is,

(18)

Then the left and right derivatives of at are characterized as follows (obvious from (17), also Prop. 2.4 of Adler and Monteiro (1992)):

(19)

which are the optima of the two LPs

(20)

That is, is a breakpoint if and only if . Clearly, in that case there are at least two vertices and of  that define two different affine functions and that meet at to define the maximum in (17). These are also vertices of , which is then a higher-dimensional face (such as an edge) of . The following central observation shows that the breakpoints give all the information about the optimal faces of for any  between these breakpoints.

Theorem 10.

(Adler and Monteiro, 1992, Theorem 4.1)  Let be the breakpoints, in increasing order, for the parameterized LPs and , and let and . For , consider any . Then for , and for .

The nondegenerate case is straightforward, where is a vertex of unless is a breakpoint, in which case is an edge of . Then these vertices uniquely describe the pieces of the piecewise linear function . An example is shown in the right diagram of Figure 2 below with the constraints (42) for in , with the additional constraints to represent , and objective function given by . The three linear parts of are

(21)

which correspond to the optimal vertices of given by , , and . The two breakpoints are and which correspond to the two edges of .

In the degenerate case, one typically does not get polynomial-time algorithms by considering vertices and corresponding basic solutions to the LP . Instead of partitioning the variables of into basic and nonbasic variables, Adler and Monteiro (1992) consider “optimal partitions”; we use here only the partition part that replaces the nonbasic variables, which we denote by in Definition 12 below (called in Adler and Monteiro (1992)). This is the set of variables of the dual LP that may be strictly positive in an optimal solution, which represent the “true inequalities” of .

Definition 11.

For some suppose that the constraints in

(22)

are feasible. Then any row of so that for some feasible is called a true inequality of .

If there are solutions and to (22) so that and then both inequalities are true for , so there is a unique largest set of true inequalities with some feasible solution where all these strict inequalities hold simultaneously.

Definition 12.

Let and . Let be the set of true inequalities of the optimal face of in , that is,

(23)

Any non-true inequality of is always tight, that is, if and if . It can be shown that for such and there are optimal solutions to where and , so these are the true inequalities of . This is also known as “strict complementary slackness” (Schrijver, 1986, Section 7.9). Consider the polyhedron

(24)

The following lemma considers the face of defined by the equations for and for , which are necessary and sufficient for a feasible solution to to be optimal. This is immediate from the standard complementary slackness condition.

Lemma 13.

Let and . For and , with and , define

(25)

Then any feasible solution to is optimal if and only if .

Crucially, according to Theorem 10, for any in an open interval (for ) the optimal face is constant in . Hence, the true inequalities of are also equal to some fixed for all . With the LPs

(26)

the following holds.

Lemma 14.

Consider and for as in Theorem 10. Let and (which do not depend on the choice of ). Then for ,

(a) the breakpoint is the optimum of the LP and of the LP ;

(b) if then  .

Proof.

See Adler and Monteiro (1992), p. 171 for (a), and Theorem 3.1(a) and Lemma 3.1(b) for (b).     

Lemma 14(a) implies that for any in the open interval , for , the endpoints of the closed interval are given by the minimum and maximum of for where and . Lemma 14(b) and Lemma 13 imply that if is itself a breakpoint, then .

As we will describe in detail in the next section, Theorem 10 and Lemma 14 lead to a description of the set of optimal solutions to and for all with the help of the breakpoints in the form of polyhedral segments (which are lines in the nondegenerate case). Any optimal solution to belongs to , which is a face of , and any optimal solution to belongs to , which is a face of . For between two breakpoints, these faces do not change (but typically varies with ), and their Cartesian product defines of the segments. If is equal to a breakpoint, the set is a subset of the two adjoining faces for near , whereas is a superset of the adjoining faces , as described in Theorem 10. This defines the other segments. Using this we will give a precise description of the set in Theorem 16 below.

Adler and Monteiro (1992) describe how to generate the breakpoints of in polynomial time per breakpoint, with a polynomial-time algorithm applied to the LPs (14), (20), (26), which we will adapt to our purpose. (However, the number of breakpoints may be exponential, see Murty (1980).) The true inequalities in Definition 11 can also be found with an LP, according to the following lemma (Prop. 4.1 of Adler and Monteiro (1992)), due to Freund et al. (1985); for an alternative polynomial-time algorithm see Mehrotra and Ye (1993).

Lemma 15.

For and the constraints consider the LP

(27)

Then is feasible if and only if is feasible and bounded, and any optimal solution to satisfies (and otherwise) if and only if is a true inequality of . For such a solution to , is a solution to where for all true inequalities .

Proof.

If the LP is feasible then it is also bounded because . Let be the set of true inequalities of . Choose so that for all . Then for . Hence, and defined by if , and otherwise, give a feasible solution to the LP (27). This solution is also optimal because any solution to (27) where would give a solution to (22) with and thus , so for any feasible solution to (27) we have whenever . This proves the claim.     
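
The notion of a "true inequality" can also be checked directly, one inequality at a time, by maximizing its slack over the feasible set. This is a simple (and less efficient) substitute for the single LP (27) of Freund et al. referenced in Lemma 15, whose exact form is not preserved above; all names in the sketch are ours.

    import numpy as np
    from scipy.optimize import linprog

    def true_inequalities(G, g, A_eq=None, b_eq=None, tol=1e-9):
        """Indices i such that row i of G z <= g can hold strictly somewhere on
        the set {z : G z <= g, A_eq z = b_eq}; one LP per row."""
        G, g = np.asarray(G, float), np.asarray(g, float)
        rows = []
        for i in range(G.shape[0]):
            # maximizing the slack g_i - G_i z means minimizing G_i z
            res = linprog(c=G[i], A_ub=G, b_ub=g, A_eq=A_eq, b_eq=b_eq,
                          bounds=[(None, None)] * G.shape[1])
            if res.status == 3 or (res.status == 0 and g[i] - res.fun > tol):
                rows.append(i)      # status 3: slack unbounded, hence positive
        return rows

    # Example: the unit square with the added equality z_1 + z_2 = 2; the
    # feasible set is the single point (1, 1), where z_i <= 1 is tight
    # but -z_i <= 0 holds strictly.
    G = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
    g = np.array([1.0, 1.0, 0.0, 0.0])
    print(true_inequalities(G, g, A_eq=[[1.0, 1.0]], b_eq=[2.0]))   # [2, 3]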

4 Finding one equilibrium of a rank-1 game by binary search

We use the results of the previous section to present a polynomial-time algorithm for finding one equilibrium of a rank-1 game (A, −A + cd^T), using binary search for a suitable value of the parameter λ in Theorem 6. The search maintains a pair of successively closer parameter values, and corresponding equilibria of the parameterized games, that are on opposite sides of the hyperplane in (8). Generically, the set E in (9) is a piecewise linear path which has to intersect that hyperplane between these two parameter values. In general, the segments of that "path" are products of certain faces of the polyhedra in (14) and in (24), described in Theorem 10 and Lemma 14 using the breakpoints of the parameterized LPs.
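
To convey the idea, here is a purely numerical sketch of the binary search, assuming the solve_zero_sum helper from the Lemma 7 example and the parameterization and hyperplane λ = c^T x reconstructed in Section 2. The paper's actual algorithm works with exact arithmetic over the breakpoint structure and handles degenerate cases; this floating-point bisection is only an illustration and can fail in exactly those cases.

    import numpy as np

    def rank1_equilibrium_by_bisection(A, c, d, lam_lo=-1e3, lam_hi=1e3,
                                       iters=100, tol=1e-9):
        """Bisection on lambda for the rank-1 game (A, -A + c d^T): at each
        lambda solve the zero-sum game (A - 1*lambda*d^T, ...) and test on
        which side of the hyperplane lambda = c^T x its equilibrium lies."""
        m = A.shape[0]

        def side(lam):
            A_lam = A - lam * np.outer(np.ones(m), d)
            x, y, _ = solve_zero_sum(A_lam)    # helper from the Lemma 7 sketch
            return float(c @ x) - lam, x, y

        g_lo, x, y = side(lam_lo)
        g_hi, _, _ = side(lam_hi)
        assert g_lo * g_hi <= 0, "initial interval must bracket the hyperplane"
        for _ in range(iters):
            lam = 0.5 * (lam_lo + lam_hi)
            g, x, y = side(lam)
            if abs(g) <= tol:
                break
            if g * g_lo > 0:
                lam_lo, g_lo = lam, g
            else:
                lam_hi = lam
        return x, y

    # Example: a small rank-1 game (A, B) with A + B = c d^T.
    A = np.array([[2.0, 0.0], [1.0, 3.0]])
    c = np.array([1.0, 2.0])
    d = np.array([1.0, 3.0])
    x, y = rank1_equilibrium_by_bisection(A, c, d)
    print(np.round(x, 3), np.round(y, 3))    # an equilibrium of (A, -A + c d^T)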

We give a complete description of in terms of these faces of and , which we project to (for the possible values of ) and . Namely, consider and for as in Theorem 10. For , define

(28)

Note that for any (for any ) the components and are uniquely determined by by Lemma 8. Similarly, let

(29)

where again in is uniquely determined by . Recall that the choice of does not matter for the definitions of and . The polyhedra for (which for and are infinite, otherwise bounded) represent of the segments that constitute between any two breakpoints and . They are successively connected by further segments, which are polytopes that correspond to the breakpoints themselves. These are for defined by

(30)

and

(31)
Theorem 16.

The set in is given by

(32)

where for we have

(33)

and

(34)
Proof.

This follows from Lemma 8, Lemma 13, and Theorem 10. By Theorem 10, is the optimal face of which is a subset of . Hence , and similarly , which implies (33). In addition, we have and and thus because of the additional tight constraints in . Similarly, . This shows (34).     

The preceding characterization of is used in the following lemma.

Lemma 17.

Let and and so that for in

(35)

Then for some with .

Proof.

Consider the largest so that and there are with and , which exists since fulfills this property and is closed by Theorem 16.

If then both and belong to the same set or which is convex, where since and we have for a suitable convex combination of and , and , as claimed.

Hence we can assume . Suppose is a breakpoint , so that . Consider and where by maximality of . By (34), we have and hence . Because and , a suitable convex combination of and belongs to and fulfills as claimed (in fact, does by maximality of ). If is not a breakpoint, we directly have