# Exponential Convergence of Gradient Methods in Concave Network Zero-sum Games

Motivated by Generative Adversarial Networks, we study the computation of Nash equilibrium in concave network zero-sum games (NZSGs), a multiplayer generalization of two-player zero-sum games first proposed with linear payoffs. Extending previous results, we show that various game theoretic properties of convex-concave two-player zero-sum games are preserved in this generalization. We then generalize last iterate convergence results obtained previously in two-player zero-sum games. We analyze convergence rates when players update their strategies using Gradient Ascent, and its variant, Optimistic Gradient Ascent, showing last iterate convergence in three settings – when the payoffs of players are linear, strongly concave and Lipschitz, and strongly concave and smooth. We provide experimental results that support these theoretical findings.

## 1 Introduction

Connections between game theory and learning had long been known before interest resurged recently in the machine learning community, largely due to the success of Generative Adversarial Networks (GANs), a novel framework for learning generative models [16]. A GAN is formulated as a two-player zero-sum game between two neural networks, a generator and a discriminator. The generator attempts to fool the discriminator by mapping random noise to images that look similar to samples from a target distribution, while the discriminator learns to distinguish the generator’s output from real samples from the target distribution. Theoretically, at the equilibrium of this game, the generator outputs the target distribution. In practice, GANs produce promising results on a number of tasks including image generation, semantic segmentation, and text-to-image synthesis [15].

Among many theoretical questions opened up by GANs, that of last iterate convergence has attracted much attention and seen exciting progress. Classical results show that, when players use no-regret online learning algorithms to play a two-player zero-sum game, the time average of their strategies converges to a Nash equilibrium, a point where neither player can gain by unilaterally deviating from her current strategy. In GANs, strategies correspond to parameters of neural networks; averaging strategies makes little sense. It is therefore desirable that the players’ strategies themselves, from iteration to iteration, converge to an equilibrium. This is known as last iterate convergence, which is not implied by classical results. A number of simple algorithms have been shown to give rise to such convergence in two-player zero-sum games, with exponential convergence rates in various settings [14, 18, 21] (see Section 1.1 for more details).

On the other hand, many recently proposed extensions of GANs go beyond the two-player zero-sum framework, either to address challenges faced by the original GAN, or to make it more versatile. In particular, many models introduce more agents (neural networks) to the game. For example, Hoang et al. [17] proposed using an ensemble of generators to address mode collapse, a common problem of the classical GAN, where the generator captures only one or few modes of the data distribution. Other architectures incorporate a third classifying network, which is in direct competition with either the generator [22] or the discriminator [8]; such architectures are often built for semi-supervised learning. Lastly, some architectures incorporate an additional encoding network which, like the generator, competes with the discriminator, and allows for sampling from a latent distribution that encodes additional information about the data distribution [4, 7, 12]. Results on two-player zero-sum games do not apply to these architectures with more than two agents. It is also well known that two-player zero-sum games have many properties not extensible to games with more players or non zero-sum payoffs.

We observe that the extensions above all give rise to network zero-sum games (NZSGs), a class of games first proposed and studied by Cai et al. [5]. An NZSG is structured by a graph, where each node corresponds to a player, and along each edge a game is played between its two node players. A player chooses one strategy to be used in all the games in which she is engaged; the sum of all players’ payoffs is always zero. Since players cannot choose different strategies for different games, an NZSG is not a simple parallelization of multiple two-player zero-sum games. However, Cai et al. [5] showed that NZSGs with linear payoffs preserve certain properties from two-player zero-sum games. In particular, a Nash in an NZSG can be computed via a linear program.

We first generalize results of [5] on the tractability of equilibrium for NZSGs (Section 2); we show that in an NZSG with concave payoffs, a Nash can be computed via no-regret learning. Then, as our main result, we show last iterate convergence results for NZSGs with several classes of payoffs (Section 3), when players adopt simple learning rules used in practice, such as Gradient Ascent (GA) and Optimistic Gradient Ascent (OGA). GA is the most ubiquitous optimization algorithm. It may be seen as a smoothed best response, and so it may not be surprising that it produces dynamics that diverge from the equilibrium in two-player zero-sum games with linear payoffs [11, 18]. We show that this phenomenon persists in NZSGs with linear payoffs. OGA, on the other hand, incorporates some minimal memory, and uses information from one step before. This small tweak has been shown to induce last iterate convergence in two-player zero-sum games with either linear payoffs or strongly concave payoffs that are smooth in various senses [9, 18, 21]. We extend these to NZSGs, showing comparable convergence performance. For two-player zero-sum games with strongly concave payoffs, GA is known to induce last iterate convergence; we generalize this as well.

We use two sets of tools. Our main tool for NZSGs with linear payoffs is dynamical systems. Strategies played in a repeated game give rise to a dynamical system; techniques for analyzing such systems naturally can be used to analyze various update algorithms [10, 11, 18]. Our results on both the divergence of GA and convergence of OGA dynamics are built on linear algebraic techniques used to analyze the corresponding dynamical systems. Crucial to the arguments is an algebraic property we show for NZSGs; namely, that a Hessian matrix associated with the payoff functions is antisymmetric everywhere.

We use Lyapunov-style convergence proofs to show results in NZSGs with strongly concave and smooth payoffs. Apart from existing arguments for two-player zero-sum games, our proof exploits a structural lemma (Lemma 1), which may be of independent interest.

In Section 4, we provide experiments that validate our theoretical findings.

### 1.1 Related Work

Cai et al. [5] introduced the class of network zero-sum games, and showed that a Nash equilibrium of an NZSG can be computed by linear programming when each player’s strategy is a distribution over a finite number of actions.

A few papers study convergence in $n$-player games. The most closely related work to ours is Azizian et al. [1]. They show that various gradient-based algorithms, including OGA, converge at an exponential rate to the Nash in a class of smooth and “monotone” $n$-player games. With slight modification explained in the technical sections, our results on the OGA dynamics in NZSGs with strongly concave and smooth payoffs or with linear payoffs can be obtained by showing these games to be smooth and monotone. Our proofs in these settings may be viewed as alternative approaches to showing these results. An advantage of our approach is that it is readily modified to apply to games with Lipschitz payoffs, as we demonstrate in Section 3.3.

Balduzzi et al. [2] study two classes of $n$-player games, Hamiltonian games and potential games, both of which are specific instances of NZSGs. They show that, when players use a continuous-time version of GA to update their strategies in a Hamiltonian game, the dynamics circle perpetually around the Nash of the game. They propose Symplectic Gradient Adjustment (SGA) and show it to converge in last iterate for both Hamiltonian and potential games. Balduzzi et al. [3] study another class of games called Smooth Market Games, which consist of payoffs that are pairwise zero-sum. They show that a continuous-time version of GA converges in last iterate to the Nash of a game when payoffs are strictly concave in players’ strategies.

A number of papers study last iterate convergence in concave two-player zero-sum games. Liang and Stokes [18] use tools from dynamical systems to show exponential convergence of the last iterate in bilinear games when players use OGA. They also show exponential convergence of the last iterate in games with smooth and strongly concave payoffs when players use GA. Mokhtari et al. [21] show exponential convergence of the last iterate in games with bilinear, or smooth and strongly concave, payoffs when the players use OGA, by interpreting OGA as an approximation of the Proximal Point Method. Gidel et al. [14] use a variational inequality perspective to show exponential convergence of a variant of OGA in constrained two-player zero-sum games with smooth and strongly concave payoffs. Mertikopoulos et al. [20] use similar tools to show last iterate, but not exponential, convergence for Mirror Descent and Optimistic Mirror Descent when payoffs are strongly concave, and for Optimistic Mirror Descent when payoffs are linear.

### 1.2 Notations and Mathematical Conventions

Vectors in are denoted by boldface, and scalars by lowercase. Time indices are denoted by superscripts, while players are identified by subscripts. For a square matrix

we denote the set of its eigenvalues by

. denotes the identity matrix.

###### Definition 1

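Given a convex set $U \subseteq \mathbb{R}^d$ and a concave function $f: U \to \mathbb{R}$, a vector $q$ is a supergradient of $f$ at $u \in U$ if, for all $u' \in U$, $f(u') \le f(u) + \langle q, u' - u \rangle$. The set of supergradients of $f$ at a point $u$ is denoted by $\partial f(u)$.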

###### Definition 2

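For $\alpha > 0$, a function $f$ is $\alpha$-strongly concave if, for all $u, u'$ in its domain and every supergradient $q$ of $f$ at $u$,

$$f(u') \le f(u) + \langle q, u' - u \rangle - \frac{\alpha}{2}\|u - u'\|^2.$$

A function $f(u, v)$ is $\alpha$-strongly concave in $u$ if, for any $v$, $f(\cdot, v)$ is $\alpha$-strongly concave.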

## 2 Network Zero-sum Games Basics

In this section we extend network zero-sum games as defined by Cai et al. [5] to allow continuous action spaces. We then show that in games with concave payoff functions, an equilibrium can be efficiently computed with no-regret dynamics.

###### Definition 3

A network game consists of the following:

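• a finite set $V$ of players, and a set $E$ of edges, which are unordered pairs of players $\{i, j\}$;

• for each player $i \in V$, a convex set $X_i$, the strategy set for player $i$;

• for each edge $\{i, j\} \in E$, a two-person game $(p_{ij}, p_{ji})$, where $p_{ij}: X_i \times X_j \to \mathbb{R}$ and $p_{ji}: X_j \times X_i \to \mathbb{R}$.

Given a strategy profile $x = (x_i)_{i \in V}$, player $i$’s payoff is $p_i(x) = \sum_{j : \{i, j\} \in E} p_{ij}(x_i, x_j)$.

A network game is a network zero-sum game (NZSG) if, for all strategy profiles $x$, $\sum_{i \in V} p_i(x) = 0$.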

We let denote , and . Two-player zero-sum games are special cases of NZSGs, where has two nodes, connected by one edge.

In a concave NZSG, each is concave in . An NZSG is linear if each is linear in both and .
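To make the definition concrete, the following is a minimal sketch of a linear NZSG on a complete graph. It assumes, purely for illustration, that each player's strategy is a single real number and that each edge payoff is $c_{ij} x_i x_j$ with $c_{ji} = -c_{ij}$; the names `make_linear_nzsg` and `payoff` are hypothetical and do not come from the paper.

```python
import random

def make_linear_nzsg(n, seed=0):
    """Return an n x n coefficient table c with c[j][i] == -c[i][j].

    Toy game (an assumption for illustration): scalar strategies and
    edge payoffs p_ij(x_i, x_j) = c[i][j] * x_i * x_j.
    """
    rng = random.Random(seed)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            c[i][j] = rng.uniform(-1.0, 1.0)
            c[j][i] = -c[i][j]          # each edge game is zero-sum
    return c

def payoff(c, x, i):
    """Player i's total payoff: the sum of her edge payoffs."""
    return sum(c[i][j] * x[i] * x[j] for j in range(len(x)) if j != i)

c = make_linear_nzsg(4)
x = [0.3, -1.2, 0.7, 2.0]               # one strategy per player, used on every edge
total = sum(payoff(c, x, i) for i in range(4))
print(total)                             # zero up to floating-point rounding
```

Each player uses the same strategy on all of her edges, which is what distinguishes an NZSG from a collection of independent two-player games.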

Let $x_{-i}$ denote the strategy profile without player $i$’s strategy, i.e., $x_{-i} = (x_j)_{j \neq i}$.

###### Definition 4

A strategy profile $x^*$ is a Nash equilibrium for an NZSG if, for each player $i$ and any strategy $x_i \in X_i$, $p_i(x_i^*, x_{-i}^*) \ge p_i(x_i, x_{-i}^*)$.

It can be shown via a fixed point argument that, in a concave NZSG where each player’s strategy space is convex and compact, a Nash equilibrium always exists [19]. Cai et al. [5] showed that for linear NZSGs where each player’s strategy set is a simplex, a Nash can be computed efficiently by a linear program.

As a warm-up, we show that another classical technique for computing equilibrium in two-player zero-sum games, namely, no-regret learning algorithms, can be used to find an approximate Nash in general concave NZSGs.

Given an NZSG with compact strategy sets, consider the players playing it repeatedly. Let $x_i^t$ and $x_{-i}^t$ denote, respectively, player $i$’s and the other players’ strategies at time step $t$ of the game. Each player should only respond to the past strategies of her opponents; i.e., $x_i^t$ may depend only on $x_{-i}^1, \ldots, x_{-i}^{t-1}$.

###### Definition 5

In a repeated game, player $i$’s regret at time $t$, $r_i(t)$, is

$$r_i(t) = \max_{x_i \in X_i} \sum_{s=1}^{t} \left[ p_i(x_i, x_{-i}^s) - p_i(x_i^s, x_{-i}^s) \right].$$

A player ’s strategy has no-regret if for all , for some as . An algorithm that produces no-regret strategies is a no-regret algorithm.
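As a small worked example of the regret definition, the sketch below computes the regret of one player in a toy setting. The assumptions are not from the paper: the payoff is $p(x, y) = c\,x\,y$, the player's strategy set is the interval $[-1, 1]$, and `regret_bilinear` is a hypothetical helper name. Because the payoff is linear in the player's own strategy, the best fixed strategy in hindsight sits at an endpoint of the interval.

```python
def regret_bilinear(c, own_plays, opp_plays):
    """Regret of a player with payoff p(x, y) = c * x * y and x in [-1, 1].

    The benchmark is the best fixed strategy in hindsight.
    """
    realized = sum(c * x * y for x, y in zip(own_plays, opp_plays))
    slope = c * sum(opp_plays)      # a fixed x earns slope * x in total
    best_fixed = abs(slope)         # maximum of slope * x over x in [-1, 1]
    return best_fixed - realized

# The opponent alternates; the player stays at 0 and so earns nothing,
# while the fixed strategy x = 1 would have earned 1 in total.
r = regret_bilinear(1.0, own_plays=[0.0, 0.0, 0.0], opp_plays=[1.0, -1.0, 1.0])
print(r)  # 1.0
```

Regret can be negative when a time-varying sequence of plays outperforms every fixed strategy, as happens if the player matches the opponent's sign in every round.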

It is well known that efficient no-regret algorithms exist [6], and that in a two-player zero-sum game, if players use no-regret dynamics, the time average of their strategies converges to a Nash equilibrium [6]. We show this phenomenon generalizes to NZSGs with concave payoffs.

###### Proposition 1

In a concave NZSG with compact strategy sets, if each player uses strategies that have no-regret, then the strategy profile where each player plays her time-average strategy converges to a Nash equilibrium.

A key step in the proof of Proposition 1 is the following property of NZSGs. We will make repeated use of this property later in the paper.

###### Lemma 1

In an NZSG, for any two strategy profiles $x$ and $x^*$, we have

$$\sum_i p_i(x_i, x_{-i}^*) = -\sum_i p_i(x_i^*, x_{-i}).$$
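The identity in Lemma 1 can be checked numerically on a small example. The sketch below assumes a hypothetical game with scalar strategies and edge payoffs $p_{ij}(x_i, x_j) = -\tfrac{1}{2}x_i^2 + c_{ij} x_i x_j + \tfrac{1}{2}x_j^2$ with $c_{ji} = -c_{ij}$, so that every edge game is zero-sum; it is a sanity check under these assumptions, not a proof.

```python
import random

def edge_payoff(c_ij, xi, xj):
    # Hypothetical edge payoff, concave in the player's own strategy xi.
    return -0.5 * xi * xi + c_ij * xi * xj + 0.5 * xj * xj

def payoff(c, i, xi, others):
    """Payoff of player i playing xi against `others` (entry i of `others` is ignored)."""
    return sum(edge_payoff(c[i][j], xi, others[j])
               for j in range(len(others)) if j != i)

rng = random.Random(1)
n = 5
c = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        c[i][j] = rng.uniform(-1, 1)
        c[j][i] = -c[i][j]

x = [rng.uniform(-2, 2) for _ in range(n)]
x_star = [rng.uniform(-2, 2) for _ in range(n)]

# Left side: each player i plays x_i against the others' x*_{-i}.
lhs = sum(payoff(c, i, x[i], x_star) for i in range(n))
# Right side: each player i plays x*_i against the others' x_{-i}.
rhs = -sum(payoff(c, i, x_star[i], x) for i in range(n))
print(lhs, rhs)  # equal up to floating-point rounding
```

Note that neither profile needs to be an equilibrium for the identity to hold.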

As we discussed in the Introduction, Proposition 1 is not adequate for applications where strategies are parameters of neural networks, since taking averages over strategies makes little sense in such settings. Following much recent literature, we shift the focus to last iterate convergence.

## 3 Last Iterate Convergence in NZSGs

In this section we present our main results on last-iterate convergence in NZSGs when players use gradient-style updates. Throughout, we assume that the strategy spaces are unconstrained, i.e., each player’s strategy set is an entire Euclidean space.

We first formally define the two update rules we focus on. Recall that we use $x_i^t$ to denote player $i$’s strategy at time $t$. A player using Gradient Ascent (GA) modifies her strategy by

$$x_i^{t+1} = x_i^t + \eta \nabla_{x_i} p_i(x^t), \tag{GA}$$

where $\eta > 0$ is a fixed step size. A player using Optimistic Gradient Ascent (OGA) updates her strategy by

$$x_i^{t+1} = x_i^t + 2\eta \nabla_{x_i} p_i(x^t) - \eta \nabla_{x_i} p_i(x^{t-1}), \tag{OGA}$$

where again $\eta > 0$ is a fixed step size.
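The two update rules can be sketched in a few lines. The example below assumes a toy bilinear game with scalar strategies, $p_{ij} = c_{ij} x_i x_j$, and takes $x^{-1} = x^0$ to start OGA; the step size, horizon, and function names are illustrative choices rather than the paper's settings.

```python
import math

def grad(c, x):
    """Each player's payoff gradient in the toy game p_ij = c[i][j] * x_i * x_j."""
    n = len(x)
    return [sum(c[i][j] * x[j] for j in range(n) if j != i) for i in range(n)]

def ga_step(c, x, eta):
    g = grad(c, x)
    return [xi + eta * gi for xi, gi in zip(x, g)]

def oga_step(c, x, x_prev, eta):
    g, g_prev = grad(c, x), grad(c, x_prev)
    return [xi + 2 * eta * gi - eta * gpi for xi, gi, gpi in zip(x, g, g_prev)]

def norm(x):
    return math.sqrt(sum(v * v for v in x))

c = [[0.0, 1.0], [-1.0, 0.0]]      # two players, p_1 = x_1 * x_2 = -p_2
eta, steps = 0.25, 200             # hypothetical step size and horizon

x_ga = [1.0, 1.0]
x_oga, x_oga_prev = [1.0, 1.0], [1.0, 1.0]   # assumption: x^{-1} = x^0
for _ in range(steps):
    x_ga = ga_step(c, x_ga, eta)
    x_oga, x_oga_prev = oga_step(c, x_oga, x_oga_prev, eta), x_oga

# Distance to the Nash at the origin: GA has moved away, OGA has moved in.
print(norm(x_ga), norm(x_oga))
```

In this run the GA iterate ends far from the origin while the OGA iterate ends close to it, which is the qualitative behavior studied in Section 3.1.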

### 3.1 Linear NZSGs

Even in a two-player zero-sum bilinear game, i.e., , where is a matrix, if each player uses GA, over time the players’ strategies diverge from the set of Nash [18]. If, instead, players use OGA, their strategies converge to a Nash of the game [9, 18, 20]. We show that these phenomena continue to hold for linear NZSGs.

To state the rates of convergence and divergence, we need to introduce a matrix $H$ for a linear NZSG, which we motivate later. Given an NZSG and a strategy profile $x$, the Hessian $H(x)$ is a block matrix with the $(i, j)$ block given by

$$H_{ij}(x) = \nabla^2_{x_j, x_i} p_i(x).$$

Denote the smallest nonzero modulus of an eigenvalue of $H$ by $\omega(H)$, and denote the largest modulus of an eigenvalue of $H$ by $\rho(H)$. Denote the distance from a point $x$ to a set $S$ by $d(x, S)$.

###### Theorem 3.1

Consider an unconstrained, linear NZSG. Let denote the set of Nash of the game. Assume each player uses GA to update her strategy at each time step. Assume , for some . Then at each time step ,

$$d(x^t, X^*)^2 \ge \left(1 + \eta^2 \omega(H)^2\right)^t R^2.$$

###### Theorem 3.2

Consider an unconstrained, linear NZSG. Assume that each player uses OGA as her update rule. Let denote the set of Nash of the game. If is diagonalizable for all , and if for some . Then setting , at each time step ,

$$d(x^{t+1}, X^*)^2 \le \left( \frac{1}{2} + \frac{1}{2} \left( 1 - \left( \frac{\omega(H)}{\rho(H)} \right)^2 \right)^{\frac{1}{2}} \right)^t r^2.$$

We sketch the proof ideas and relegate details to the supplementary file. We formulate the behavior of GA and OGA as trajectories of dynamical systems; this view has been taken in several previous works, which also analyze the behaviors of updating algorithms using tools from dynamical systems [10, 11, 18].

###### Definition 6

A relation of the form , also written as , is a discrete time dynamical system with update rule . A point is a fixed point of if .

If players use GA in a linear NZSG, the strategies evolve according to the dynamical system $x^{t+1} = (I + \eta H) x^t$, where $H$ is the Hessian matrix defined above and $I$ is the identity matrix. It is not hard to show that the set of Nash equilibria is precisely the set of fixed points of this dynamical system. Note that, when the update rule is a linear function, as is the case for the GA dynamics, a point is its fixed point if and only if it is in the eigenspace of the corresponding matrix for eigenvalue $1$.

For a dynamical system with update rule $g$, the Jacobian $J$ is the matrix with its $(i, j)$ entry $\partial g_i / \partial x_j$. The eigenvalues of the Jacobian at a fixed point $x^*$ describe the behavior of the dynamics around $x^*$. Roughly speaking, if all eigenvalues of $J(x^*)$ have modulus greater than $1$, then in a neighborhood around $x^*$, the dynamics diverges from $x^*$; conversely, if all eigenvalues of $J(x^*)$ have modulus smaller than $1$, in a neighborhood of $x^*$ the dynamics converges to $x^*$. When $g$ is linear, this characterization of convergence/divergence extends to the entire space (beyond neighborhoods around $x^*$), and allows some eigenvalues to be $1$.

###### Proposition 2

Let denote the set of fixed points of a dynamical system with linear update rule: , where J is diagonalizable. Let denote .

1. If either or , then letting denote the largest modulus of any eigenvalue of not equal to 1, , .

2. If either or , then letting denote the smallest modulus of any eigenvalue of not equal to 1, .

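To show Theorem 3.1, therefore, it suffices to analyze the eigenvalues of the matrix $I + \eta H$. The crucial observation is that, for NZSGs, the Hessian $H$ is an antisymmetric matrix of the form

$$H = \begin{bmatrix} 0 & C_{12} & \dots & C_{1n} \\ -C_{12}^\top & 0 & \dots & C_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ -C_{1n}^\top & -C_{2n}^\top & \dots & 0 \end{bmatrix}.$$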

This is a consequence of the following lemma on NZSGs in general:

###### Lemma 2

In an NZSG, if each $p_{ij}$ has continuous second partial derivatives, then

$$\nabla^2_{x_i, x_j} p_{ji}(x) = -\left( \nabla^2_{x_j, x_i} p_{ij}(x) \right)^\top.$$

As a result, for the GA dynamics in a linear NZSG, all eigenvalues of $H$ are imaginary, and therefore all the eigenvalues of $I + \eta H$ are of the form $1 + i \eta \lambda$ for some $\lambda \in \mathbb{R}$. The second part of Proposition 2 then indicates a diverging dynamics.
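A complementary way to see the divergence, offered here as a sketch rather than as the paper's proof, is that an antisymmetric $H$ satisfies $x^\top H x = 0$, so one GA step gives $\|(I + \eta H)x\|^2 = \|x\|^2 + \eta^2 \|Hx\|^2 \ge \|x\|^2$. The code below checks this identity on a random antisymmetric matrix; the dimension and step size are arbitrary illustrative values.

```python
import random

rng = random.Random(0)
n, eta = 5, 0.1

# Random antisymmetric matrix: h[j][i] == -h[i][j], zero diagonal.
h = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        h[i][j] = rng.uniform(-1, 1)
        h[j][i] = -h[i][j]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def sq_norm(v):
    return sum(a * a for a in v)

x = [rng.uniform(-1, 1) for _ in range(n)]
hx = matvec(h, x)
x_next = [xi + eta * hxi for xi, hxi in zip(x, hx)]   # one GA step: (I + eta * H) x

inner = sum(xi * hxi for xi, hxi in zip(x, hx))        # x^T H x, zero by antisymmetry
print(inner, sq_norm(x_next) - sq_norm(x) - eta ** 2 * sq_norm(hx))
```

Both printed quantities are zero up to rounding, so the squared norm can only grow under a GA step in this setting.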

The antisymmetry of the Hessian  is also a crucial step in the proof of Theorem 3.2. We first need to augment the state space to allow the memory from a previous step to be passed as part of the state. Following Daskalakis and Panageas [11], we consider a dynamical system with the following update rule , defining :

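$$\begin{aligned} g(x, x') &= \left( g_1(x, x'), g_2(x, x') \right), \\ g_{1,i}(x, x') &= x_i + 2\eta \nabla_{x_i} \hat{p}_i(x, x') - \eta \nabla_{x_i'} \hat{p}_i(x, x'), \\ g_{2,i}(x, x') &= x_i. \end{aligned} \tag{1}$$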

More explicitly, for the OGA update rule, we have the relation . We make use of a connection established by [11] between the GA dynamics and the OGA dynamics (Proposition 3). Besides another application of the antisymmetry of , we also use an expression for the determinant of a block matrix (Lemma 3).

###### Proposition 3 ([11])

Let be a fixed point of the GA dynamics. Then, is a fixed point of the OGA dynamics, and for each we have two eigenvalues in that are the roots of the quadratic equation

$$\lambda^2 - (2\mu - 1)\lambda + (\mu - 1) = 0.$$

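The quadratic above can be explored numerically. The sketch below assumes that the GA eigenvalue has the form $\mu = 1 + i\eta\omega$ with $\omega$ real and $2\eta|\omega| \le 1$, which is the shape produced by an antisymmetric Hessian; under that assumption a short calculation gives the larger squared modulus of the two roots as $\tfrac{1}{2} + \tfrac{1}{2}\sqrt{1 - 4\eta^2\omega^2}$, and the code checks this. The values of `eta` and `omega` are illustrative.

```python
import cmath
import math

def oga_eigenvalues(mu):
    """Roots of lambda^2 - (2*mu - 1)*lambda + (mu - 1) = 0."""
    b, c = -(2 * mu - 1), mu - 1
    disc = cmath.sqrt(b * b - 4 * c)
    return (-b + disc) / 2, (-b - disc) / 2

eta, omega = 0.25, 1.0             # hypothetical step size and eigenvalue modulus of H
mu = 1 + 1j * eta * omega          # assumed GA eigenvalue for an antisymmetric H
lam1, lam2 = oga_eigenvalues(mu)

largest_sq = max(abs(lam1), abs(lam2)) ** 2
closed_form = 0.5 + 0.5 * math.sqrt(1 - (2 * eta * omega) ** 2)
print(largest_sq, closed_form)     # agree up to rounding, and both are below 1
```

The closed form has the same shape as the base of the bound in Theorem 3.2 when $2\eta\omega$ is replaced by $\omega(H)/\rho(H)$; this is an observation about the quadratic only.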
###### Lemma 3 ([13])

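Let $A$ be a block matrix of the following form

$$A = \begin{bmatrix} M_1 & M_2 \\ M_3 & M_4 \end{bmatrix},$$

where each $M_k$ is a square matrix, and $M_4$ is invertible. Then the determinant of $A$ equals the determinant of the Schur complement of $M_4$ multiplied by the determinant of $M_4$:

$$\det(A) = \det\left( M_1 - M_2 M_4^{-1} M_3 \right) \det(M_4).$$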

In order to be able to apply Proposition 2, we make an additional diagonalizability assumption on . This is not a restrictive assumption; for any linear function, there is an arbitrarily small perturbation that makes its Hessian diagonalizable; in fact, the set of nondiagonalizable matrices over has Lebesgue measure 0. In comparison, Azizian et al. [1] show exponential convergence of OGA in linear games with the assumption that the Hessian is invertible.

### 3.2 Smooth and Strongly Concave Payoffs

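A payoff function $p_{ij}$ is said to be $\beta$-smooth, for $\beta > 0$, if for all $x_i, x_i' \in X_i$ and $x_j, x_j' \in X_j$,

$$\begin{aligned} \|\nabla_{x_i} p_{ij}(x_i, x_j) - \nabla_{x_i} p_{ij}(x_i', x_j)\| &\le \beta \|x_i - x_i'\|; \\ \|\nabla_{x_i} p_{ij}(x_i, x_j) - \nabla_{x_i} p_{ij}(x_i, x_j')\| &\le \beta \|x_j - x_j'\|. \end{aligned} \tag{2}$$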

An NZSG is said to have $\beta$-smooth payoffs if each payoff function $p_{ij}$ is $\beta$-smooth. The game is said to have $\alpha$-strongly concave payoffs if each payoff function is $\alpha$-strongly concave in the player’s own strategy. In this section, we show that when players use GA or OGA to update their strategies in a game with payoffs that are $\alpha$-strongly concave and $\beta$-smooth, their strategies converge to a Nash at an exponential rate. Throughout this section, we assume that for each player $i$, $p_i$ is twice continuously differentiable. Since each $p_i$ is differentiable, its unique supergradient with respect to $x_i$ at a point $x$ is $\nabla_{x_i} p_i(x)$.

Before stating our main results, we remark on the existence and uniqueness of Nash. Since we consider unconstrained NZSGs, Proposition 1 does not apply. Unlike linear NZSGs in Section 3.1, where $0$ is always a Nash, in general a Nash may not exist when the strategy spaces are not compact. With $\alpha$-strong concavity, however, we do get uniqueness of Nash when one exists.

###### Lemma 4

In an NZSG with $\alpha$-strongly concave payoffs for $\alpha > 0$, if a Nash equilibrium exists, it is unique.

For applications such as GANs, where strategies are parameters of neural networks, strategy spaces are practically compact, and a Nash equilibrium is guaranteed by Proposition 1 to exist.

We now state the main results of this section.

###### Theorem 3.3

Consider an unconstrained NZSG with payoffs that are twice continuously differentiable, -strongly concave and -smooth for . Assume the existence of a Nash, . Let be such that for . If each player uses GA, with , then at each time step ,

$$\sum_i \|x_i^t - x_i^*\|^2 \le \left( 1 - \frac{\alpha^2}{4 n \beta^2} \right)^t n r^2.$$

###### Theorem 3.4

Consider an unconstrained NZSG with payoff functions that are twice continuously differentiable, -strongly concave and -smooth, for . Assume the existence of a Nash, . Let be such that for . If each player uses OGA, with , then at each time step ,

$$\sum_i \|x_i^{t+1} - x_i^*\|^2 \le \left( 1 - \frac{\alpha}{4 n \beta} \right)^t (n+1)^2 r^2.$$

In order to show convergence for GA, we use a Lyapunov-style convergence argument. For two-player zero-sum games with strongly concave and smooth payoffs, Liang and Stokes [18] show that, when players use GA to update their strategies, the strategies converge to the Nash of the game at an exponential rate. The key that allows us to extend the result to NZSGs is Lemma 1, which causes terms that are introduced by the strong concavity condition to vanish.

For the OGA update rule, we rewrite OGA as a two-step update, so that the second step is a GA-style update,

$$\begin{aligned} w_i^t &= w_i^{t-1} + \eta \nabla_{x_i} p_i(x^t), \\ x_i^{t+1} &= w_i^t + \eta \nabla_{x_i} p_i(x^t). \end{aligned} \tag{OGA$'$}$$

Eliminating the auxiliary variable $w_i$ from these two equations recovers the original OGA update.
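The equivalence between the two-step form and the direct OGA update can be verified numerically. The sketch below assumes a toy two-player bilinear game with scalar strategies, takes $x^{-1} = x^0$ for the direct form, and initializes the auxiliary variable as $w^{-1} = x^0 - \eta \nabla p(x^0)$ so that both forms start from the same state; these initializations are assumptions made for the illustration.

```python
def grad(x):
    # Toy two-player bilinear game p_1 = x_1 * x_2 = -p_2 (illustrative assumption).
    return [x[1], -x[0]]

eta, steps = 0.1, 50
x0 = [1.0, -0.5]

# Direct OGA, taking x^{-1} = x^0.
x, x_prev = x0[:], x0[:]
direct = [x0[:]]
for _ in range(steps):
    g, g_prev = grad(x), grad(x_prev)
    x, x_prev = [x[i] + 2 * eta * g[i] - eta * g_prev[i] for i in range(2)], x
    direct.append(x)

# Two-step form, with w^{-1} chosen so that x^0 = w^{-1} + eta * grad(x^{-1}).
x = x0[:]
w = [x0[i] - eta * grad(x0)[i] for i in range(2)]
two_step = [x0[:]]
for _ in range(steps):
    g = grad(x)
    w = [w[i] + eta * g[i] for i in range(2)]      # w^t = w^{t-1} + eta * grad(x^t)
    x = [w[i] + eta * g[i] for i in range(2)]      # x^{t+1} = w^t + eta * grad(x^t)
    two_step.append(x)

max_gap = max(abs(a - b) for u, v in zip(direct, two_step) for a, b in zip(u, v))
print(max_gap)  # zero up to floating-point rounding
```

The two trajectories coincide because $w^{t-1} = x^t - \eta \nabla p(x^{t-1})$ holds at every step once it holds at the start.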

Mokhtari et al. [21] show that in a two-player zero-sum game with smooth and strongly concave payoffs, if each player uses the OGA update, the strategies converge to a Nash exponentially fast. Lemma 1 again plays a key role in our extension of the result to network zero-sum games.

Azizian et al. [1] show exponential convergence to a Nash when players use the OGA update strategy in a game with smooth payoffs and “strongly monotone” dynamics. We show in the supplementary file that NZSGs with strongly concave payoffs are in fact strongly monotone; this constitutes an alternative derivation of exponential convergence of the OGA dynamics.

### 3.3 Lipschitz and Strongly Concave Payoffs

In this section, we show that if players use GA or OGA to update their strategies in an NZSG where payoffs are $\alpha$-strongly concave and $L$-Lipschitz, for $\alpha, L > 0$, then, given appropriate step sizes, their strategies converge to the unique Nash of the game. We assume that for each player $i$, $p_i$ is continuously differentiable. If each $p_i$ is $L$-Lipschitz, then for each player $i$,

$$\forall x \in X, \quad \|\nabla_{x_i} p_i(x)\| \le L. \tag{3}$$

###### Theorem 3.5

Consider an unconstrained NZSG that is played for rounds. Assume each is -strongly concave in and -Lipschitz for . Assume the existence of a Nash, . Let be such that, for each player , for . If each player uses GA with variable step size at each time step , then at each time step ,

$$\sum_i \|x_i^t - x_i^*\|^2 \le L^2 n \sum_{s=1}^{t} \eta_s^2 + n r^2 \prod_{s=1}^{t} (1 - \eta_s \alpha).$$

In particular, if for , then

$$\lim_{T \to \infty} \sum_i \|x_i^T - x_i^*\|^2 = 0.$$

###### Theorem 3.6

Consider an unconstrained NZSG that is played for rounds. Assume each is -strongly concave in and -Lipschitz for . Assume the existence of a Nash, . Let be such that for each player , for . Then if each player uses OGA with nonincreasing step size ,

$$\sum_i \|x_i^t - x_i^*\|^2 \le 4 n L^2 \sum_{s=1}^{t} \eta_s \eta_{s-1} + n r^2 \prod_{s=1}^{t} \left( 1 - (\eta_s + \eta_{s-1}) \alpha \right).$$

In particular, if for , then

$$\lim_{T \to \infty} \sum_i \|x_i^T - x_i^*\|^2 = 0.$$

Our proofs for these theorems resemble those from Section 3.2, with Lemma 1 facilitating the generalization to NZSGs. We note that the proof fails to achieve exponential convergence, due to the lack of smoothness in the game. Furthermore, the algorithm designer needs to know in advance the time horizon $T$, the number of time steps the game is to be played, in order to choose a learning schedule that allows for guaranteed last-iterate convergence.

## 4 Experiments

In this section, we provide examples validating our results. We first show convergence in the simplest setting — a game with three players where a zero-sum game is played between each pair. We provide experiments showing convergence in a game with linear payoffs, and a game with smooth and strongly concave payoffs. We then provide an experiment showing the effect that increasing the number of players has on convergence rate. For each experiment, we show the performance of both GA and OGA.

### 4.1 Three Player Game with Linear Payoffs

We provide experiments validating our theoretical results for a three player game with linear payoffs. The payoff of the players can be expressed as

To track convergence, it is convenient if the game has a unique Nash. The linear game will have a unique Nash at $0$ as long as the set of fixed points of the dynamics is the singleton $\{0\}$. This will occur if the Hessian of payoffs, $H$, has no eigenvalues equal to 0. Since $H$ is antisymmetric, its nonzero eigenvalues come in complex conjugate pairs, and if $H$ is of even dimension, it will have an eigenvalue equal to 0 if and only if its determinant is 0. If we sample the entries of the $C_{ij}$’s i.i.d. from the uniform distribution, this will happen with probability 0.

We let , and initialize the entries by sampling i.i.d. from the uniform distribution on . We initialize the coordinates of i.i.d from the uniform distribution on . For GA we set , to allow us to visualize the convergence of the average iterate and the divergence of the last iterate on the same plot. For OGA we let for fastest convergence. We plot the trajectory of a single representative game simulation.

We demonstrate the performance of GA and OGA by plotting the trajectories of players’ strategies; to show this in , we take the -norm of each player’s strategies to form a three dimensional vector. We also plot the sum of the squares of the distance of player strategies from the origin on a log scale. This is shown in Figure 1. From our results, it can be seen that GA diverges from the unique Nash of the game, while OGA converges to the unique Nash in last iterate. Notice that although the last iterate of OGA converges, it does so at a slower speed than the average iterate. The convergence in last iterate of OGA is not quite linear on the log scale, but is upper bounded by a linear function, and hence does not contradict our theory.

### 4.2 Three Player Game with Smooth and Strongly Concave Payoffs

We next provide experiments showing convergence in a three player game with smooth and strongly concave payoffs. We set the payoffs for each player as follows:

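$$\begin{aligned} p_{ij}(x) &= -\frac{1}{2}\|x_i\|^2 + x_i^\top C_{ij} x_j + \frac{1}{2}\|x_j\|^2, \\ p_i(x) &= \sum_{j \in V \setminus \{i\}} p_{ij}(x). \end{aligned} \tag{4}$$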

Like in the game with linear payoffs, this game has a unique Nash at if and only if the determinant of is nonzero, which we can guarantee by sampling entries of the ’s uniformly at random.

As in the linear game, we initialize the entries of by sampling i.i.d. from the uniform distribution on , and the coordinates of i.i.d from the uniform distribution on . We set for both GA and OGA. The results are shown in Figure 2.
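A scaled-down version of this experiment can be sketched as follows. The sketch is not the paper's experimental code: it assumes scalar strategies, so each $C_{ij}$ is a single number with $C_{ji} = -C_{ij}$, and the step size, horizon, and sampling ranges are hypothetical values. It runs GA on the payoffs of Equation (4) and records the squared distance to the Nash at the origin.

```python
import random

rng = random.Random(0)
n, eta, steps = 3, 0.1, 50         # hypothetical values, not the paper's settings

c = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        c[i][j] = rng.uniform(-1, 1)
        c[j][i] = -c[i][j]

def grad(x):
    """d p_i / d x_i for p_ij = -x_i^2/2 + c_ij * x_i * x_j + x_j^2/2, summed over j != i."""
    return [-(n - 1) * x[i] + sum(c[i][j] * x[j] for j in range(n) if j != i)
            for i in range(n)]

x = [rng.uniform(-1, 1) for _ in range(n)]
history = [sum(v * v for v in x)]  # squared distance to the Nash at 0
for _ in range(steps):
    g = grad(x)
    x = [x[i] + eta * g[i] for i in range(n)]
    history.append(sum(v * v for v in x))

# Per-step contraction factors of the squared distance.
ratios = [b / a for a, b in zip(history, history[1:])]
print(history[0], history[-1], max(ratios))
```

Every per-step ratio stays below 1 in this run, so the squared distance shrinks geometrically, which corresponds to a straight-line trend on a log scale.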

From our plots, it can be seen that the last iterates of both GA and OGA converge for the smooth and strongly concave game. Although OGA converges for both the game with linear payoffs and the game with smooth and strongly concave payoffs, the trajectories of GA and OGA take a more direct path to the Nash in the smooth and strongly concave game, as can be seen in the 3D trajectory. Furthermore, the last iterates of both GA and OGA follow a linear trend on the log scale, as predicted by our theory.

### 4.3 Effect of Number of Players on Convergence

In this section, we provide experimental results showing the effect of varying $n$ in an NZSG of $n$ players. For the smooth and strongly concave game, our theoretical upper bound has a linear dependence on the number of players in the game for both GA and OGA, and thus we test the dependence on players only in this setting.

We study a game with smooth and strongly concave payoffs, using the same payoffs as in Section 4.2 (see Equation (4)). We perform the same initializations as in Section 4.2: initializing the entries of by sampling i.i.d. from the uniform distribution on , and the coordinates of i.i.d. from the uniform distribution on . We set for both GA and OGA. We let the number of players range from 3 to 100, plotting convergence for each setting of players. We track convergence by plotting the number of iterates it takes for the sum of the squares of the distance of player strategies from the origin to dip below . For each fixed number of players, we run ten trials to convergence and plot the average. The results are shown in Figure 3.

From this plot, we can see that the number of players affects the convergence rate of GA. However, for OGA, the effect of players on convergence disappears after enough players are introduced into the game. This suggests that the convergence rate for OGA in the smooth and strongly concave case may not be tight. This is an open question for future research.

## 5 Conclusion

In this paper, we studied the convergence of player strategies to equilibria in Network Zero-sum Games, a class of games that generalizes two-player zero-sum games and arises naturally in learning architectures that extend GANs. We show that many results in two-player zero-sum games on the convergence and divergence of these algorithms extend to NZSGs. We believe these results may guide practitioners working on extensions of GANs that involve more than two agents. Our results also shed some light on why existing extensions of GANs that employ more than two agents are successful in achieving convergent behaviour. Future research may search for models with more relaxed game theoretic assumptions where convergence can still be shown for reasonable algorithms. For example, the zero-sum assumption is absent from certain successful architectures, e.g. Wasserstein-GAN with Gradient Penalty [23].

## References

• [1] Azizian, W., Scieur, D., Mitliagkas, I., Lacoste-Julien, S., Gidel, G.: A Tight and Unified Analysis of Gradient-Based Methods for a Whole Spectrum of Differentiable Games. The 23rd International Conference on Artificial Intelligence and Statistics (2020)
• [2] Balduzzi, D., Racanière, S., Martens, J., Foerster, J.N., Tuyls, K., Graepel, T.: The Mechanics of n-Player Differentiable Games. 35th International Conference on Machine Learning, 363–372 (2018)
• [3] Balduzzi, D., Czarnecki, W.M., Anthony, T., Gemp, I., Hughes, E., Leibo, J., Piliouras, G., Graepel, T.: Smooth markets: A basic mechanism for organizing gradient-based learners. International Conference on Learning Representations (2020)
• [4] Brock, A., Lim, T., Ritchie, J., Weston, N.: Neural Photo Editing With Introspective Adversarial Networks. International Conference on Learning Representations, 1–15 (2017)
• [5] Cai, Y., Candogan, O., Daskalakis, C., Papadimitriou, C.: Zero-sum polymatrix games: A generalization of minmax. Mathematics of Operations Research 41(2), 648–655 (2016)
• [6] Cesa-Bianchi, N., Lugosi, G.: Prediction, learning, and games. Cambridge university press (2006)
• [7] Che, T., Li, Y., Jacob, A.P., Bengio, Y., Li, W.: Mode Regularized Generative Adversarial Networks. International Conference on Learning Representations (2017)
• [8] Chongxuan, L., Xu, T., Zhu, J., Zhang, B.: Triple generative adversarial nets. Advances in Neural Information Processing Systems, 4088–4098 (2017)
• [9] Daskalakis, C., Ilyas, A., Syrgkanis, V., Zeng, H.: Training GANs with Optimism. International Conference on Learning Representations (2018)
• [10] Daskalakis, C., Panageas, I.: Last-Iterate Convergence: Zero-Sum Games and Constrained Min-Max Optimization. 10th Innovations in Theoretical Computer Science Conference (ITCS 2019) 124, 27:1–27:18 (2018)
• [11] Daskalakis, C., Panageas, I.: The limit points of (optimistic) gradient descent in min-max optimization. Advances in Neural Information Processing Systems, 9236–9246 (2018)
• [12] Donahue, J., Krähenbühl, P., Darrell, T.: Adversarial Feature Learning. International Conference on Learning Representations (2017)
• [13] Gallier, J.: The Schur complement and symmetric positive semidefinite (and definite) matrices. Penn Engineering, 1–12 (2010)
• [14] Gidel, G., Berard, H., Vignoud, G., Vincent, P., Lacoste-Julien, S.: A Variational Inequality Perspective on Generative Adversarial Networks. International Conference on Learning Representations (2019)
• [15] Goodfellow, I.: NIPS 2016 tutorial: Generative adversarial networks. arXiv preprint arXiv:1701.00160 (2016)
• [16] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. Advances in neural information processing systems, 2672–2680 (2014)
• [17] Hoang, Q., Nguyen, T.D., Le, T., Phung, D.Q.: MGAN: Training Generative Adversarial Nets with Multiple Generators. 6th International Conference on Learning Representations (2018)
• [18] Liang, T., Stokes, J.: Interaction Matters: A Note on Non-asymptotic Local Convergence of Generative Adversarial Networks. The 22nd International Conference on Artificial Intelligence and Statistics, 907–915 (2019)
• [19] Menache, I., Ozdaglar, A.: Network games: Theory, models, and dynamics. Synthesis Lectures on Communication Networks 4(1), 1–159 (2011)
• [20] Mertikopoulos, P., Lecouat, B., Zenati, H., Foo, C.S., Chandrasekhar, V., Piliouras, G.: Optimistic mirror descent in saddle-point problems: Going the extra(-gradient) mile. International Conference on Learning Representations (2019)
• [21] Mokhtari, A., Ozdaglar, A., Pattathil, S.: A unified analysis of extra-gradient and optimistic gradient methods for saddle point problems: Proximal point approach. The 23rd International Conference on Artificial Intelligence and Statistics (2020)
• [22] Vandenhende, S., De Brabandere, B., Neven, D., Van Gool, L.: A three-player GAN: generating hard samples to improve classification networks. 16th International Conference on Machine Vision Applications (MVA), 1–6 (2019)
• [23] Wei, X., Liu, Z., Wang, L., Gong, B.: Improving the Improved Training of Wasserstein GANs. International Conference on Learning Representations (2018)