Distributed Nash Equilibrium Seeking over Time-Varying Directed Communication Networks

01/07/2022
by   Duong Thuy Anh Nguyen, et al.
We study distributed algorithms for finding a Nash equilibrium (NE) in a class of non-cooperative convex games under partial information. Specifically, each agent has access only to its own smooth local cost function and can receive information from its neighbors in a time-varying directed communication network. To this end, we propose a distributed gradient play algorithm to compute a NE by utilizing local information exchange among the players. In this algorithm, every agent performs a gradient step to minimize its own cost function while sharing and retrieving information locally among its neighbors. Existing methods impose strong assumptions such as balancedness of the mixing matrices and global knowledge of the network communication structure, including the Perron-Frobenius eigenvector of the adjacency matrix and other graph connectivity constants. In contrast, our approach relies only on a reasonable and widely used assumption of row-stochasticity of the mixing matrices. We analyze the algorithm for time-varying directed graphs and prove its convergence to the NE when the agents' cost functions are strongly convex and have Lipschitz continuous gradients. Numerical simulations are performed for a Nash-Cournot game to illustrate the efficacy of the proposed algorithm.

I Introduction

Game theory provides a framework to understand decision making in strategic situations where multiple agents aim to optimize their individual, yet interdependent, objective functions. The notion of Nash equilibrium (NE) in non-cooperative games characterizes desirable and stable solutions to the games, which subsequently can be used to predict the agents’ individual strategies and payoffs. A NE is a joint action from which no agent has an incentive to unilaterally deviate. Indeed, non-cooperative games have been extensively studied to address various engineering problems in different areas, such as communication networks, electricity markets, power systems, flow control, and crowdsourcing [zhan12, dyan16, Alpcan2005, BasharSG, Scutaricdma]. Hence, developing efficient NE seeking algorithms has drawn increasing attention in recent years. In this paper, based on the distributed gradient play approach, we develop a discrete-time algorithm to find a NE in a non-cooperative game played over time-varying directed communication networks.

In classical non-cooperative complete information game theory, the payoff of each agent is determined by its own actions and the observations of the other agents’ actions. Thus, a large body of existing work, using best-response or gradient-based schemes, requires each agent to know the competitors’ actions to search for a NE [Yu2017, Belgioioso2018, Yi2019]. However, this full-decision information assumption is impractical in many engineering systems [Salehisadaghiani2014], for example, the Nash-Cournot competition [Bimpikis2019]. Recently, there has been extensive research conducted on fully distributed algorithms, which rely on local information only (i.e., the partial-decision information setting [Belgioioso2019]), to compute NE. However, most of the proposed algorithms are built upon the (projected) gradient and consensus dynamics approaches, in both continuous time [Ye2017, Gadjov2019] and discrete time [Koshal2012, Tatiana2020]. Also, they are based on the information available to the agents and need certain properties of the agents’ cost functions, such as convexity, strong monotonicity, and Lipschitz continuity.

In [Salehisadaghiani2014], the authors propose a gradient-based gossip algorithm for distributed NE seeking in general non-cooperative games. For a diminishing stepsize, this algorithm converges almost surely to a NE under strict convexity, Lipschitz continuity, and bounded gradient assumptions. With the further assumption of strong convexity, a constant stepsize guarantees convergence to a neighborhood of the NE. In [SALEHISADAGHIANI201927], an algorithm within the framework of the inexact-ADMM is developed and its convergence rate is established for a fixed stepsize under the co-coercivity assumption on the game mapping. Reference [Tatiana2018] provides an accelerated version of the gradient play algorithm (Acc-GRANE) for solving variational inequalities. The analysis is based on strong monotonicity of a so-called augmented mapping, which takes into account both the gradients of the cost functions and the communication settings. However, this algorithm is applicable only to a subclass of games characterized by a restrictive connection between the agents, Lipschitz continuity, and strong monotonicity constants. By assuming the restricted strong monotonicity of the augmented game mapping, the authors of [Tatarenko2021] show that this algorithm can be applied to a broader class of games and demonstrate its geometric convergence to a NE. However, both types of procedures mentioned above require a careful selection of both the stepsize and the augmented mapping. Alternatively, by leveraging contractivity properties of doubly stochastic matrices, the authors of [Tatiana2020] develop a distributed gradient-play based scheme whose convergence properties do not depend on the augmented mapping. Nevertheless, all the methods cited above are designed for time-invariant undirected networks.

There is a growing interest in studying NE seeking for communication networks with switching topologies. The early works [Koshal2012, Koshal2016] consider aggregative games over time-varying, jointly connected, and undirected graphs. This result is extended in [Grammatico2021] to games with coupling constraints. In [Farzad2019], an asynchronous gossip algorithm to find a NE over a directed graph is developed under the assumption that every agent is able to update all the estimates of the agents who interfere with its cost function. In [Bianchi_2021], a projected pseudo-gradient based algorithm is proposed that works for time-varying, weight-balanced, and directed graphs. The balancedness assumption is relaxed in follow-up work [Bianchi2020NashES], where a modified algorithm is proposed which requires global knowledge of the communication graph structure, including the Perron-Frobenius eigenvector of the adjacency matrix and a constant related to the graph connectivity. However, constructing weight-balanced matrices even in a directed static graph is non-trivial and computationally expensive [Gharesifard2012], making it impractical for time-varying directed graphs. Also, the knowledge of the global communication structure is a demanding assumption since the computation of the Perron-Frobenius eigenvector in every iteration imposes a significant computational burden.

Contributions. Motivated by the penetration of the game theoretic approaches into cooperative control and distributed optimization problems in engineering systems where full communication is not available [Belgioioso2019, Tatiana2018, Bianchi_2021, Tatiana2020], this paper addresses NE seeking under the so-called partial-decision information scenario. Agents only have access to their own cost functions and local action sets, and engage in nonstrategic information exchange with their neighbors in a network. Our contributions are summarized as follows:

  • We propose a fully-distributed algorithm to compute a NE over time-varying directed communication networks. While previous works assumed balancedness, or the knowledge of some global communication network parameters, our approach only requires the usual row-stochasticity assumption on the weights. The algorithm is simple to implement in a distributed fashion as each agent can locally decide on the weights for the information received from its neighbors.

  • The proposed algorithm does not depend on any parameter related to the network structure, such as the Perron-Frobenius eigenvector of the adjacency matrix or any other constant related to the graph connectivity. Moreover, the convergence analysis of our approach does not rely on the augmented mapping used in [Tatiana2020, Tatarenko2021]; instead, convergence hinges on the choice of the stepsize values.

  • We prove that the algorithm is guaranteed to converge to a NE under mild assumptions of convexity, strong monotonicity, and Lipschitz continuity of the game mapping.

II Notations and Terminologies

Throughout this paper, all vectors are viewed as column vectors unless stated otherwise. We consider a real normed space $E$, which is either the space of real vectors $E = \mathbb{R}^n$ or the space of real matrices $E = \mathbb{R}^{n \times n}$. For every vector $u \in \mathbb{R}^n$, $u'$ is the transpose of $u$. We use $\langle \cdot, \cdot \rangle$ to denote the inner product, and $\|\cdot\|$ to denote the standard Euclidean norm. We write $\mathbf{0}$ and $\mathbf{1}$ to denote the vector with all entries equal to 0 and 1, respectively. The dimensions of the vectors $\mathbf{0}$ and $\mathbf{1}$ are to be understood from the context.

In this paper, we consider a discrete time model where the time index is denoted by $k$. The $i$-th entry of a vector $u$ is denoted by $u_i$, while it is denoted by $[u_k]_i$ for a time-varying vector $u_k$. Given a vector $v$, we use $\min(v)$ and $\max(v)$ to denote the smallest and the largest entry of $v$, respectively, i.e., $\min(v) = \min_i v_i$ and $\max(v) = \max_i v_i$. We write $v > \mathbf{0}$ to indicate that the vector $v$ has positive entries. A vector is said to be a stochastic vector if its entries are nonnegative and sum to 1. For a set $S$ with finitely many elements, we use $|S|$ to denote its cardinality.

To denote the $ij$-th entry of a matrix $A$, we write $A_{ij}$, and we write $[A_k]_{ij}$ when the matrix is time-dependent. For any two matrices $A$ and $B$ of the same dimension, we write $A \le B$ to denote that $A_{ij} \le B_{ij}$ for all $i$ and $j$; in other words, the inequality is to be interpreted component-wise. A matrix is said to be nonnegative if all its entries are nonnegative. For a nonnegative matrix $A$, we use $\min(A^+)$ to denote the smallest positive entry of $A$, i.e., $\min(A^+) = \min_{\{ij : A_{ij} > 0\}} A_{ij}$. A nonnegative matrix is said to be row-stochastic if the entries of each row sum to 1, and it is said to be column-stochastic if the entries of each column sum to 1. In particular, if $A$ is row-stochastic and $B$ is column-stochastic, then $A\mathbf{1} = \mathbf{1}$ and $\mathbf{1}'B = \mathbf{1}'$.

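We call a matrix consensual if it has equal row vectors. The largest and smallest eigenvalues in modulus of a matrix $A$ are denoted as $\lambda_{\max}(A)$ and $\lambda_{\min}(A)$, respectively. For any matrix $A \in \mathbb{R}^{n \times n}$, we use $\mathrm{diag}(A)$ to denote its diagonal vector, i.e., $\mathrm{diag}(A) = (A_{11}, \ldots, A_{nn})'$. For any vector $u$, we use $\mathrm{Diag}(u)$ to denote the diagonal matrix with the vector $u$ on its diagonal.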

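Given a vector $\pi \in \mathbb{R}^m$ with positive entries $\pi_1, \ldots, \pi_m$, the $\pi$-weighted inner product and $\pi$-weighted norm are defined, respectively, as follows:

$$\langle u, v \rangle_\pi = \sum_{i=1}^m \pi_i \langle u_i, v_i \rangle \qquad \text{and} \qquad \|u\|_\pi = \sqrt{\sum_{i=1}^m \pi_i \|u_i\|^2},$$

where $u = [u_1', \ldots, u_m']'$, $v = [v_1', \ldots, v_m']'$, and $u_i, v_i \in \mathbb{R}^n$.
When $\pi = \mathbf{1}$, we simply write $\langle u, v \rangle$ and $\|u\|$, for which we have: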

(1)

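Furthermore, using the Cauchy–Schwarz inequality, we obtain:

$$\langle u, v \rangle_\pi \le \sum_{i=1}^m \pi_i \|u_i\| \|v_i\| \le \sqrt{\sum_{i=1}^m \pi_i \|u_i\|^2} \sqrt{\sum_{i=1}^m \pi_i \|v_i\|^2} = \|u\|_\pi \|v\|_\pi. \tag{2}$$

Thus, the Cauchy–Schwarz inequality holds for the $\pi$-weighted inner product and the $\pi$-weighted norm.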

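A mapping $g: E \to E$ is said to be strongly monotone on a set $Q \subseteq E$ with the constant $\mu > 0$, if $\langle g(u) - g(v), u - v \rangle \ge \mu \|u - v\|^2$ for any $u, v \in Q$, where $\|\cdot\|$ is the corresponding norm in $E$. A mapping $g: E \to E$ is said to be Lipschitz continuous on a set $Q \subseteq E$ with the constant $L > 0$, if $\|g(u) - g(v)\| \le L \|u - v\|$ for any $u, v \in Q$.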

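We let $[m]$ denote the set $\{1, \ldots, m\}$ for an integer $m \ge 1$. A directed graph $G = ([m], \mathcal{E})$ is specified by the set of edges $\mathcal{E} \subseteq [m] \times [m]$ of ordered pairs of nodes. Given two distinct nodes $j, \ell \in [m]$ ($j \ne \ell$), a directed path from node $j$ to node $\ell$ in the graph $G$ is a finite (ordered) sequence of edges $\{(i_0, i_1), \ldots, (i_{t-1}, i_t)\}$ passing through distinct nodes, where $i_0 = j$, $i_t = \ell$, and $(i_{s-1}, i_s) \in \mathcal{E}$ for all $s = 1, \ldots, t$.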

Definition 1 (Graph Connectivity).

A directed graph is strongly connected if there is a directed path from any node to all the other nodes in the graph.

Given a directed path, the length of the path is the number of edges in the path.

Definition 2 (Graph Diameter).

The diameter of a strongly connected directed graph $G$ is the length of the longest path in the collection of all shortest directed paths connecting all ordered pairs of distinct nodes in $G$.

We denote the diameter of the graph $G$ by $D(G)$.

In what follows, we consider special collections of shortest directed paths, each of which we refer to as a shortest-path covering of the graph. Let $p_{j\ell}$ denote a shortest directed path from node $j$ to node $\ell$, where $j \ne \ell$.

Definition 3 (Shortest-Path Graph Covering).

A collection of directed paths in is a shortest-path graph covering if and for any two nodes , .

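Denote by $\mathcal{S}(G)$ the collection of all possible shortest-path coverings of the graph $G$.

Given a shortest-path covering $\mathcal{P} \in \mathcal{S}(G)$ and an edge $(j, \ell) \in \mathcal{E}$, the utility of the edge $(j, \ell)$ with respect to the covering $\mathcal{P}$ is the number of shortest paths in $\mathcal{P}$ that pass through the edge $(j, \ell)$. Define $K(\mathcal{P})$ as the maximum edge-utility in $\mathcal{P}$ taken over all edges in the graph, i.e.,

$$K(\mathcal{P}) = \max_{(j,\ell) \in \mathcal{E}} \sum_{p \in \mathcal{P}} \chi_{\{(j,\ell) \in p\}},$$

where $\chi_{\{(j,\ell) \in p\}}$ is the indicator function taking value 1 when $(j, \ell) \in p$ and, otherwise, taking value 0.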

Definition 4 (Maximal Edge-Utility).

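Let $G = ([m], \mathcal{E})$ be a strongly connected directed graph. The maximal edge-utility in the graph $G$ is the maximum value of $K(\mathcal{P})$ taken over all possible shortest-path coverings $\mathcal{P} \in \mathcal{S}(G)$, i.e.,

$$K(G) = \max_{\mathcal{P} \in \mathcal{S}(G)} K(\mathcal{P}).$$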

As an example, consider a directed-cycle graph of the nodes . Then, .
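The two graph quantities defined above can be computed directly on small graphs. The following is a minimal sketch, not part of the paper: the function names are hypothetical, nodes are indexed from 0, and `adj[u]` lists the nodes `v` with an edge `(u, v)`. It builds one shortest-path covering by breadth-first search and reports the diameter together with the maximum edge-utility of that covering. When shortest paths are unique, as in a directed cycle, this covering is the only one, so the value equals the maximal edge-utility of the graph; in general it is only a lower bound on it.

```python
from collections import deque


def shortest_paths_from(src, adj):
    """BFS from src; returns one shortest path (list of nodes) per reachable node."""
    paths = {src: [src]}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in paths:
                paths[v] = paths[u] + [v]
                queue.append(v)
    return paths


def diameter_and_edge_utility(adj):
    """Return (diameter, max edge-utility) for the covering built by BFS.

    Assumes a strongly connected graph with at least two nodes.
    """
    nodes = list(adj)
    diameter = 0
    usage = {}  # edge -> number of covering paths that traverse it
    for j in nodes:
        paths = shortest_paths_from(j, adj)
        if len(paths) != len(nodes):
            raise ValueError("graph is not strongly connected")
        for target, path in paths.items():
            if target == j:
                continue
            diameter = max(diameter, len(path) - 1)
            for edge in zip(path, path[1:]):
                usage[edge] = usage.get(edge, 0) + 1
    return diameter, max(usage.values())


def directed_cycle(m):
    """Directed cycle 0 -> 1 -> ... -> m-1 -> 0."""
    return {i: [(i + 1) % m] for i in range(m)}


print(diameter_and_edge_utility(directed_cycle(4)))  # prints (3, 6)
```

In the 4-node cycle, each edge is traversed by one path of length 1, two paths of length 2, and three paths of length 3, which gives the value 6 reported above.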

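Given a directed graph $G = ([m], \mathcal{E})$, we define the in-neighbor and out-neighbor set for every agent $i$, as follows:

$$\mathcal{N}_i^{\rm in} = \{j \in [m] \mid (j, i) \in \mathcal{E}\}, \qquad \mathcal{N}_i^{\rm out} = \{\ell \in [m] \mid (i, \ell) \in \mathcal{E}\}.$$

When the graph varies over time, we use a subscript to indicate the time instance. For example, $\mathcal{E}_k$ will denote the edge-set of a graph $G_k$, and $\mathcal{N}_{ik}^{\rm in}$ and $\mathcal{N}_{ik}^{\rm out}$ denote the in-neighbors and the out-neighbors of a node $i$, respectively. In our setting here, the agents will be the nodes in the graph, so we will use the terms "node" and "agent" interchangeably.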

III Problem Formulation

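We consider a non-cooperative game between $m$ agents. For each agent $i \in [m]$, let $J_i(\cdot)$ and $X_i \subseteq \mathbb{R}^{n_i}$ be the cost function and the action set of the agent. Let $n = \sum_{i=1}^m n_i$ be the size of the joint action vector of the agents. Each function $J_i(x_i, x_{-i})$ depends on $x_i$ and $x_{-i}$, where $x_i \in X_i$ is the action of the agent $i$ and $x_{-i} \in X_{-i} = X_1 \times \cdots \times X_{i-1} \times X_{i+1} \times \cdots \times X_m$ denotes the joint action of all agents except agent $i$.

Denote the game by $\Gamma = ([m], \{J_i\}, \{X_i\})$. A solution to the game $\Gamma$ is a Nash equilibrium (NE) $x^* \in X_1 \times \cdots \times X_m$ such that for every agent $i \in [m]$, we have:

$$J_i(x_i^*, x_{-i}^*) \le J_i(x_i, x_{-i}^*), \qquad \forall x_i \in X_i. \tag{3}$$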

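When, for every agent $i \in [m]$, the action set $X_i$ is closed and convex, and the cost function $J_i(x_i, x_{-i})$ is also convex and differentiable in $x_i$ for each $x_{-i} \in X_{-i}$, a NE of the game can be alternatively characterized by using the first-order optimality conditions. Specifically, $x^* \in X_1 \times \cdots \times X_m$ is a NE of the game if and only if for all $i \in [m]$, we have:

$$\langle \nabla_i J_i(x_i^*, x_{-i}^*), x_i - x_i^* \rangle \ge 0, \qquad \forall x_i \in X_i. \tag{4}$$

Using the Euclidean projection property, it can be seen that the preceding relation is equivalent to:

$$x_i^* = \Pi_{X_i}\left[x_i^* - \alpha_i \nabla_i J_i(x_i^*, x_{-i}^*)\right], \qquad \forall i \in [m], \tag{5}$$

where $\alpha_i > 0$ is an arbitrary scalar and $\Pi_{X_i}[\cdot]$ denotes the Euclidean projection onto $X_i$. By stacking the relations in (5), we can rewrite them in a compact form. In this way, $x^*$ is a NE for the game $\Gamma$ if and only if:

$$x^* = \Pi_X\left[x^* - F_\alpha(x^*)\right], \tag{6}$$

where $X = X_1 \times \cdots \times X_m$ is the agents' joint action set and $F_\alpha(\cdot)$ is the scaled gradient mapping of the game, defined by

$$F_\alpha(x) = \left[\alpha_1 \nabla_1 J_1(x_1, x_{-1}); \ldots; \alpha_m \nabla_m J_m(x_m, x_{-m})\right], \tag{7}$$

where $\nabla_i J_i(x_i, x_{-i}) = \nabla_{x_i} J_i(x_i, x_{-i})$ for all $i \in [m]$.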

In the absence of constraints on the agents' access to each others' actions, an NE point can be computed by implementing a simple iterative algorithm (see [FacchineiPang]). In particular, starting with some initial point $x_i^0 \in X_i$, each agent $i$ updates its decision at time $k$ as follows:

$$x_i^{k+1} = \Pi_{X_i}\left[x_i^k - \alpha_i \nabla_i J_i(x_i^k, x_{-i}^k)\right]. \tag{8}$$

This algorithm is guaranteed to converge to a NE under suitable conditions. However, it requires that every agent $i$ has access to all other agents' decisions $x_{-i}^k$ at every time $k$.
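To make the full-information iteration concrete, here is a minimal sketch on a toy Nash-Cournot-type game. Everything in it is an illustrative assumption rather than data from the paper: three agents with scalar actions, cost $J_i(x) = c_i x_i - x_i (p_0 - b \sum_j x_j)$, action sets $[0, 5]$, and a common stepsize. For these numbers the equilibrium can be computed by hand as $(3, 2, 1)$, which lies in the interior of the action sets.

```python
def grad_i(i, x, b=1.0, p0=10.0, costs=(1.0, 2.0, 3.0)):
    """Partial derivative w.r.t. x_i of J_i(x) = costs[i]*x_i - x_i*(p0 - b*sum(x))."""
    return 2.0 * b * x[i] + b * (sum(x) - x[i]) - (p0 - costs[i])


def project(value, lo=0.0, hi=5.0):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return min(max(value, lo), hi)


def gradient_play(x0, alpha=0.1, iters=500):
    """Projected gradient play: every agent sees the full action vector x^k."""
    x = list(x0)
    for _ in range(iters):
        # Simultaneous update: all gradients are evaluated at the same x^k.
        x = [project(x[i] - alpha * grad_i(i, x)) for i in range(len(x))]
    return x


x_star = gradient_play([0.0, 0.0, 0.0])
print([round(v, 6) for v in x_star])  # prints [3.0, 2.0, 1.0]
```

Note that each call to `grad_i` reads the whole vector `x`, which is exactly the full-decision information requirement that the partial-information setting removes.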

III-A Graph-constrained Agents’ Interactions

In this paper, we focus on the setting where the agents' interactions over time are constrained by a sequence of directed time-varying communication graphs. When the agents interact at time $k$, their interactions are constrained by a directed graph $G_k = ([m], \mathcal{E}_k)$, where the set of nodes is the agent set $[m]$ and $\mathcal{E}_k$ is the set of directed links. The directed link $(j, i) \in \mathcal{E}_k$ indicates that agent $i$ can receive information from agent $j$.

Given that our game has constraints on agents' access to actions of other agents, we consider an adaptation of the basic algorithm (8) that will obey the information access as dictated by the graph $G_k$ at time $k$. In the absence of access to $x_{-i}^k$, agent $i$ will use an estimate instead, which leads to the following update rule for each agent $i \in [m]$:

$$x_i^{k+1} = \Pi_{X_i}\left[v_i^{k+1} - \alpha_i \nabla_i J_i(v_i^{k+1}, z_{i,-i}^{k+1})\right], \tag{9}$$

where the vector $v_i^{k+1}$ is some estimate of $x_i^k$ (based on what the neighbors think the actions of agent $i$ are) and $z_{i,-i}^{k+1}$ is the vector consisting of the estimates $z_{ij}^{k+1}$ that agent $i$ has about the true decision vector $x_j^k$ for every agent $j \ne i$. Notice that we have the index $k+1$ for the estimates $v_i^{k+1}$ and $z_{i,-i}^{k+1}$ since they are constructed at time $k+1$ upon information exchange among the neighbors in the graph $G_k$.

In this situation, $v_i^{k+1}$ need not belong to the set $X_i$ at any time $k$, as the other agents may not know this set. Also, as agent $i$ does not know the action space $X_j$, the estimate $z_{ij}^{k+1}$ need not lie in the set $X_j$. Thus, the function $J_i(x_i, x_{-i})$ should be defined on the whole space $\mathbb{R}^n$. Specifically, regarding the agents' cost functions and their action sets, we use the following assumptions:

Assumption 1.

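Consider the game $\Gamma = ([m], \{J_i\}, \{X_i\})$, and assume that for all $i \in [m]$:

  • The mapping $\nabla_i J_i(x_i, \cdot)$ is Lipschitz continuous on $\mathbb{R}^{n - n_i}$ for every $x_i \in \mathbb{R}^{n_i}$ with a uniform constant $L_{-i} > 0$.

  • The mapping $\nabla_i J_i(\cdot, x_{-i})$ is Lipschitz continuous on $\mathbb{R}^{n_i}$ for every $x_{-i} \in \mathbb{R}^{n - n_i}$ with a uniform constant $L_i > 0$.

  • The mapping $\nabla_i J_i(\cdot, x_{-i})$ is strongly monotone on $\mathbb{R}^{n_i}$ for every $x_{-i} \in \mathbb{R}^{n - n_i}$ with a uniform constant $\mu_i > 0$.

  • The set $X_i$ is nonempty, convex, and closed.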

Remark 1.

Under Assumption 1, a NE point exists and it is unique (Theorem 2.3.3 of [FacchineiPang]). Moreover, it can be equivalently captured as the fixed point solution (see (6)). The differentiability of the cost functions on a domain larger than the action sets is assumed to ensure that the algorithm (9) is well defined.

IV Distributed Algorithm

We consider the distributed algorithm over a sequence $\{G_k\}$ of underlying directed communication graphs. We assume that every node has a self-loop in each graph $G_k$, so that the neighbor sets $\mathcal{N}_{ik}^{\rm in}$ and $\mathcal{N}_{ik}^{\rm out}$ contain agent $i$ at all times. Specifically, we use the following assumption.

Assumption 2.

Each graph $G_k = ([m], \mathcal{E}_k)$ is strongly connected and has a self-loop at every node $i \in [m]$.

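For each $k$, each agent $i$ has a column vector $z_i^k = (z_{i1}^k, \ldots, z_{im}^k)' \in \mathbb{R}^n$, where $z_{ij}^k$ is agent $i$'s estimate of the decision $x_j^k$ for agent $j \ne i$, while $z_{ii}^k = x_i^k$. Let $z_{i,-i}^k = (z_{i1}^k, \ldots, z_{i,i-1}^k, z_{i,i+1}^k, \ldots, z_{im}^k)'$ be the estimate of agent $i$ without the $i$-th block-component. Hence, $z_i^k$ consists of the decision $x_i^k$ of agent $i$ and the estimate $z_{i,-i}^k$ of agent $i$ for the decisions of the other agents.

At time $k$, every agent $i$ sends $z_i^k$ to its out-neighbors $\ell \in \mathcal{N}_{ik}^{\rm out}$ and receives estimates $z_j^k$ from its in-neighbors $j \in \mathcal{N}_{ik}^{\rm in}$. Once the information is exchanged, agent $i$ computes $v_i^{k+1}$, which is an estimate of $x_i^k$ based on what the in-neighbors think the action of agent $i$ is, and the estimate $z_{i,-i}^{k+1}$. Then, agent $i$ updates its own action accordingly. Intuitively, using the estimates based on the information gathered from neighbors can improve the accuracy of the estimates, including the estimate of the agent's own action, since more information is taken into account. The agents' estimates are constructed by using a row-stochastic weight matrix $W_k$ that is compliant with the connectivity structure of the graph $G_k$, in the sense that:

$$[W_k]_{ij} > 0 \ \text{ when } j \in \mathcal{N}_{ik}^{\rm in}, \qquad [W_k]_{ij} = 0 \ \text{ otherwise}. \tag{10}$$

Note that every agent $i$ controls the entries in the $i$th row of $W_k$, which does not require any coordination of the weights among the agents. In fact, balancing the weights [Bianchi_2021] would require some coordination among the agents or some side information about the structure of the graphs $G_k$, which we avoid imposing in this paper.

The estimate $v_i^{k+1}$ of $x_i^k$ is constructed based on the information that agent $i$ receives from its in-neighbors $j \in \mathcal{N}_{ik}^{\rm in}$ with the corresponding weights, as follows:

$$v_i^{k+1} = \sum_{j=1}^m [W_k]_{ij} z_{ji}^k. \tag{11}$$

Agent $i$'s estimate of the other agents' actions is computed as:

$$z_{i,-i}^{k+1} = \sum_{j=1}^m [W_k]_{ij} z_{j,-i}^k. \tag{12}$$

Finally, using these estimates, agent $i$ updates its own action according to the following formula:

$$x_i^{k+1} = \Pi_{X_i}\left[v_i^{k+1} - \alpha_i \nabla_i J_i(v_i^{k+1}, z_{i,-i}^{k+1})\right].$$

The procedure is summarized in Algorithm 1.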

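Algorithm 1: Distributed Method
Every agent $i \in [m]$ selects a stepsize $\alpha_i > 0$ and an arbitrary initial vector $z_i^0 \in \mathbb{R}^n$.
for $k = 0, 1, \ldots$, every agent $i \in [m]$ does the following:
  - Receives $z_j^k$ from in-neighbors $j \in \mathcal{N}_{ik}^{\rm in}$;
  - Sends $z_i^k$ to out-neighbors $\ell \in \mathcal{N}_{ik}^{\rm out}$;
  - Chooses the weights $[W_k]_{ij}$, $j \in [m]$;
  - Computes the estimates $v_i^{k+1}$ and $z_{i,-i}^{k+1}$ by
      $v_i^{k+1} = \sum_{j=1}^m [W_k]_{ij} z_{ji}^k$, and
      $z_{i,-i}^{k+1} = \sum_{j=1}^m [W_k]_{ij} z_{j,-i}^k$;
  - Updates action $x_i^{k+1}$ by
      $x_i^{k+1} = \Pi_{X_i}[v_i^{k+1} - \alpha_i \nabla_i J_i(v_i^{k+1}, z_{i,-i}^{k+1})]$;
  - Updates the estimate $z_{ii}^{k+1}$ by $z_{ii}^{k+1} = x_i^{k+1}$;
end for.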
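The steps of Algorithm 1 can be sketched in a few lines for scalar actions. This is an illustration under stated assumptions, not the authors' implementation: the Cournot-type cost, the bounds, the stepsize, and the two alternating communication graphs are all hypothetical choices, and nodes are indexed from 0. Row `i` of `Z` stores agent `i`'s vector (its own action in position `i`, its estimates of the others elsewhere). Each agent averages the vectors of its in-neighbors with weights it picks on its own, takes a projected gradient step on its own entry, and keeps the mixed values as its new estimates.

```python
P0, B, COSTS = 10.0, 1.0, (1.0, 2.0, 3.0)  # hypothetical Cournot-type data
LO, HI = 0.0, 10.0                          # action sets X_i = [LO, HI]


def grad_i(i, y):
    """Partial derivative of J_i(y) = COSTS[i]*y_i - y_i*(P0 - B*sum(y)) w.r.t. y_i."""
    return 2.0 * B * y[i] + B * (sum(y) - y[i]) - (P0 - COSTS[i])


def row_stochastic_weights(in_neighbors):
    """Agent i weights its in-neighbors (self-loop included) uniformly."""
    m = len(in_neighbors)
    W = [[0.0] * m for _ in range(m)]
    for i, nbrs in enumerate(in_neighbors):
        for j in nbrs:
            W[i][j] = 1.0 / len(nbrs)
    return W


def step(Z, W, alpha):
    """One round of mixing followed by a projected gradient step per agent."""
    m = len(Z)
    Z_next = []
    for i in range(m):
        # Mixing: entry i of y is v_i, the remaining entries form z_{i,-i}.
        y = [sum(W[i][j] * Z[j][c] for j in range(m)) for c in range(m)]
        x_new = min(max(y[i] - alpha[i] * grad_i(i, y), LO), HI)
        y[i] = x_new  # the agent's own entry is its actual action
        Z_next.append(y)
    return Z_next


# In-neighbor lists (each includes the agent itself) for two alternating graphs.
GRAPHS = [
    [[0, 2], [1, 0], [2, 1]],  # directed cycle 0 -> 1 -> 2 -> 0
    [[0, 1], [1, 2], [2, 0]],  # reversed cycle 0 -> 2 -> 1 -> 0
]


def run(iters=3000, alpha=(0.05, 0.05, 0.05)):
    Z = [[0.0] * 3 for _ in range(3)]  # arbitrary initial vectors z_i^0 = 0
    for k in range(iters):
        W = row_stochastic_weights(GRAPHS[k % 2])
        Z = step(Z, W, alpha)
    return Z


Z_final = run()
print([[round(v, 4) for v in row] for row in Z_final])
```

For this toy game the equilibrium is $(3, 2, 1)$, and every row of `Z_final` should end up close to it. The uniform weights on these two cycles happen to be doubly stochastic as well; the update itself never uses that property, only that each row sums to 1.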

We make the following assumption on the matrices .

Assumption 3.

For each $k \ge 0$, the weight matrix $W_k$ is row-stochastic and compatible with the graph $G_k$, i.e., it satisfies relation (10). Moreover, there exists a scalar $w > 0$ such that $\min(W_k^+) \ge w$ for all $k \ge 0$.
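As a small illustration of how such weights can be chosen without coordination, the sketch below lets each agent split its weight uniformly over its in-neighbors. The graph and the function name are hypothetical, and nodes are indexed from 0. The resulting matrix is row-stochastic and compliant with the graph, but its columns need not sum to 1, so no weight balancing is involved.

```python
def local_weights(in_neighbors):
    """Row i is chosen by agent i alone: uniform weights over its in-neighbors."""
    m = len(in_neighbors)
    W = [[0.0] * m for _ in range(m)]
    for i, nbrs in enumerate(in_neighbors):
        if i not in nbrs:
            raise ValueError("every node needs a self-loop")
        for j in nbrs:
            W[i][j] = 1.0 / len(nbrs)
    return W


# Edges 2->0, 0->1, 0->2, 1->2 plus self-loops (strongly connected, not balanced).
in_neighbors = [[0, 2], [1, 0], [2, 0, 1]]
W = local_weights(in_neighbors)

row_sums = [sum(row) for row in W]
col_sums = [sum(W[i][j] for i in range(3)) for j in range(3)]
min_positive = min(w for row in W for w in row if w > 0)

print(row_sums)      # each row sums to 1 (up to rounding)
print(col_sums)      # columns do not: roughly 4/3, 5/6, 5/6
print(min_positive)  # smallest positive entry, here 1/3
```

With uniform weights, the smallest positive entry of any such matrix is at least $1/m$, so a uniform lower bound on the positive entries holds automatically for this particular weight rule.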

V Basic Results

In this section, we provide some basic results related to norms of linear combinations of vectors, graphs, stochastic matrices, and the gradient method.

V-A Linear Combinations and Graphs

Since the mixing terms $\sum_{j=1}^m [W_k]_{ij} z_j^k$ used in Algorithm 1 are special linear combinations of the vectors $z_j^k$, we start by establishing a result for linear combinations of vectors. In particular, in the forthcoming lemma, we provide a relation for the squared norm of a linear combination of vectors, which will be used in our analysis with different identifications.

Lemma 1.

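Let $\{u_i, i \in [m]\} \subset \mathbb{R}^n$ be a collection of $m$ vectors and $\{\gamma_i, i \in [m]\}$ be a collection of $m$ scalars. Then, the following statements are valid:
(a) We have:

$$\Big\|\sum_{i=1}^m \gamma_i u_i\Big\|^2 = \Big(\sum_{j=1}^m \gamma_j\Big) \sum_{i=1}^m \gamma_i \|u_i\|^2 - \frac{1}{2} \sum_{i=1}^m \sum_{j=1}^m \gamma_i \gamma_j \|u_i - u_j\|^2.$$

(b) If $\sum_{i=1}^m \gamma_i = 1$ holds, then for all $u \in \mathbb{R}^n$ we have:

$$\Big\|\sum_{i=1}^m \gamma_i u_i - u\Big\|^2 = \sum_{i=1}^m \gamma_i \|u_i - u\|^2 - \frac{1}{2} \sum_{i=1}^m \sum_{j=1}^m \gamma_i \gamma_j \|u_i - u_j\|^2.$$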

Proof.

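(a) For $\|\sum_{i=1}^m \gamma_i u_i\|^2$, we have:

$$\Big\|\sum_{i=1}^m \gamma_i u_i\Big\|^2 = \sum_{i=1}^m \gamma_i^2 \|u_i\|^2 + \sum_{i=1}^m \sum_{j \ne i} \gamma_i \gamma_j \langle u_i, u_j \rangle.$$

Using the identity

$$2 \langle v, w \rangle = \|v\|^2 + \|w\|^2 - \|v - w\|^2,$$

which is valid for any two vectors $v$ and $w$, we obtain:

$$\Big\|\sum_{i=1}^m \gamma_i u_i\Big\|^2 = \sum_{i=1}^m \gamma_i^2 \|u_i\|^2 + \frac{1}{2} \sum_{i=1}^m \sum_{j \ne i} \gamma_i \gamma_j \left(\|u_i\|^2 + \|u_j\|^2 - \|u_i - u_j\|^2\right). \tag{13}$$

Note that, by exchanging the roles of $i$ and $j$,

$$\sum_{i=1}^m \sum_{j \ne i} \gamma_i \gamma_j \|u_j\|^2 = \sum_{i=1}^m \sum_{j \ne i} \gamma_i \gamma_j \|u_i\|^2,$$

we further obtain that

$$\frac{1}{2} \sum_{i=1}^m \sum_{j \ne i} \gamma_i \gamma_j \left(\|u_i\|^2 + \|u_j\|^2\right) = \sum_{i=1}^m \sum_{j \ne i} \gamma_i \gamma_j \|u_i\|^2.$$

Therefore, by substituting the preceding equality in relation (13) we find that:

$$\Big\|\sum_{i=1}^m \gamma_i u_i\Big\|^2 = \sum_{i=1}^m \gamma_i^2 \|u_i\|^2 + \sum_{i=1}^m \sum_{j \ne i} \gamma_i \gamma_j \|u_i\|^2 - \frac{1}{2} \sum_{i=1}^m \sum_{j \ne i} \gamma_i \gamma_j \|u_i - u_j\|^2.$$

The first two terms in the preceding relation give:

$$\sum_{i=1}^m \gamma_i \Big(\gamma_i + \sum_{j \ne i} \gamma_j\Big) \|u_i\|^2 = \Big(\sum_{j=1}^m \gamma_j\Big) \sum_{i=1}^m \gamma_i \|u_i\|^2,$$

implying that:

$$\Big\|\sum_{i=1}^m \gamma_i u_i\Big\|^2 = \Big(\sum_{j=1}^m \gamma_j\Big) \sum_{i=1}^m \gamma_i \|u_i\|^2 - \frac{1}{2} \sum_{i=1}^m \sum_{j \ne i} \gamma_i \gamma_j \|u_i - u_j\|^2. \tag{14}$$

The relation in part (a) follows by noting that the double sum $\sum_{i=1}^m \sum_{j \ne i} \gamma_i \gamma_j \|u_i - u_j\|^2$ does not change when we add the terms for $j = i$, since they are all zero.

(b) Suppose now that $\sum_{i=1}^m \gamma_i = 1$. Then, for any vector $u \in \mathbb{R}^n$, we have:

$$\sum_{i=1}^m \gamma_i u_i - u = \sum_{i=1}^m \gamma_i (u_i - u).$$

We apply the relation from part (a) where $u_i$ is replaced with $u_i - u$, and by using the fact that $\sum_{i=1}^m \gamma_i = 1$, for all $u \in \mathbb{R}^n$ we obtain:

$$\Big\|\sum_{i=1}^m \gamma_i u_i - u\Big\|^2 = \sum_{i=1}^m \gamma_i \|u_i - u\|^2 - \frac{1}{2} \sum_{i=1}^m \sum_{j=1}^m \gamma_i \gamma_j \|u_i - u_j\|^2.$$

∎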

We have the following result as an immediate consequence of Lemma 1(b).

Corollary 1.

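Choosing $u = \sum_{\ell=1}^m \gamma_\ell u_\ell$ in Lemma 1(b) yields, in particular, that

$$\frac{1}{2} \sum_{i=1}^m \sum_{j=1}^m \gamma_i \gamma_j \|u_i - u_j\|^2 = \sum_{i=1}^m \gamma_i \Big\|u_i - \sum_{\ell=1}^m \gamma_\ell u_\ell\Big\|^2. \tag{15}$$

Substituting the preceding relation back in the relation in part (b) of Lemma 1 gives, for all $u \in \mathbb{R}^n$,

$$\Big\|\sum_{i=1}^m \gamma_i u_i - u\Big\|^2 = \sum_{i=1}^m \gamma_i \|u_i - u\|^2 - \sum_{i=1}^m \gamma_i \Big\|u_i - \sum_{\ell=1}^m \gamma_\ell u_\ell\Big\|^2. \tag{16}$$

Relation (16) is valid when $\sum_{i=1}^m \gamma_i = 1$. If, additionally, the scalars $\gamma_i$ are non-negative, then relation (16) coincides with the well known relation for weighted averages of vectors.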
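The identities in Lemma 1 are easy to sanity-check numerically. The sketch below is only an illustration with arbitrarily chosen data (hypothetical helper names, fixed scalars, seeded random vectors). It evaluates both sides of part (a) for scalars that include a negative entry, and both sides of part (b) for scalars that sum to 1, and reports the absolute gaps, which should be at the level of floating-point rounding.

```python
import random


def sq_norm(u):
    return sum(c * c for c in u)


def diff(u, v):
    return [a - b for a, b in zip(u, v)]


def lin_comb(gammas, vectors):
    """Return sum_i gammas[i] * vectors[i]."""
    n = len(vectors[0])
    return [sum(g * v[c] for g, v in zip(gammas, vectors)) for c in range(n)]


def pairwise_term(gammas, vectors):
    """(1/2) * sum_i sum_j gamma_i * gamma_j * ||u_i - u_j||^2."""
    return 0.5 * sum(gi * gj * sq_norm(diff(ui, uj))
                     for gi, ui in zip(gammas, vectors)
                     for gj, uj in zip(gammas, vectors))


def lemma1a_gap(gammas, vectors):
    lhs = sq_norm(lin_comb(gammas, vectors))
    weighted = sum(g * sq_norm(u) for g, u in zip(gammas, vectors))
    rhs = sum(gammas) * weighted - pairwise_term(gammas, vectors)
    return abs(lhs - rhs)


def lemma1b_gap(gammas, vectors, u):
    """Requires sum(gammas) == 1."""
    lhs = sq_norm(diff(lin_comb(gammas, vectors), u))
    weighted = sum(g * sq_norm(diff(ui, u)) for g, ui in zip(gammas, vectors))
    rhs = weighted - pairwise_term(gammas, vectors)
    return abs(lhs - rhs)


rng = random.Random(0)
vectors = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(4)]
point = [rng.uniform(-1.0, 1.0) for _ in range(3)]

gap_a = lemma1a_gap([0.5, -1.0, 2.0, 0.3], vectors)         # arbitrary scalars
gap_b = lemma1b_gap([0.7, -0.2, 0.4, 0.1], vectors, point)  # scalars sum to 1
print(gap_a, gap_b)
```

Part (b) does not require the scalars to be non-negative, only that they sum to 1, which is why the second set of scalars includes a negative entry.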

There are certain contraction properties of the distributed method, which are inherited from the use of the mixing term $\sum_{j=1}^m [W_k]_{ij} z_j^k$, and the fact that the matrix $W_k$ is compliant with a directed strongly connected graph $G_k$. Lemma 1 provides a critical result that will help us capture these contraction properties. However, Lemma 1 alone is not sufficient since it does not make any use of the structure of the matrix $W_k$ related to the underlying graph $G_k$.

The graph structure is exploited in the forthcoming lemma for a generic graph. More specifically, the lemma provides an important lower bound on the quantity $\sum_{(j,\ell) \in \mathcal{E}} \|z_j - z_\ell\|^2$ for a given directed graph $G = ([m], \mathcal{E})$, where $z_i \in \mathbb{R}^n$ is a vector associated with a node $i$. The lower bound will be applied to the graph $G_k$ at time $k$, which provides the second critical step leading us toward the contraction properties of the iterate sequences.

Lemma 2.

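Let $G = ([m], \mathcal{E})$ be a strongly connected directed graph, where a vector $z_i \in \mathbb{R}^n$ is associated with node $i$ for all $i \in [m]$. Let $\mathcal{P}$ be a shortest path covering of the graph $G$. We then have:

$$\sum_{(j,\ell) \in \mathcal{E}} \|z_j - z_\ell\|^2 \ge \frac{2}{D(G) K(G)} \sum_{j=1}^m \sum_{\ell=j+1}^m \|z_j - z_\ell\|^2,$$

where $D(G)$ is the diameter of the graph and $K(G)$ is the maximal edge-utility in the graph (see Definitions 2 and 4).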

Proof.

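Let $\mathcal{P} = \{p_{j\ell} \mid j, \ell \in [m], j \ne \ell\}$, where $p_{j\ell}$ is a shortest path from node $j$ to node $\ell$. Let $\mathcal{E}_{\mathcal{P}}$ be the collection of all directed links that are traversed by any path in the shortest path covering $\mathcal{P}$, i.e.,

$$\mathcal{E}_{\mathcal{P}} = \{(j, \ell) \in \mathcal{E} \mid (j, \ell) \in p \text{ for some } p \in \mathcal{P}\}.$$

By the definition of the maximal edge-utility (see Definition 4), we have:

$$K(G) \ge K(\mathcal{P}),$$

where $K(\mathcal{P})$ is the maximal edge-utility with respect to the shortest path covering $\mathcal{P}$, i.e.,

$$K(\mathcal{P}) = \max_{(j,\ell) \in \mathcal{E}} \sum_{p \in \mathcal{P}} \chi_{\{(j,\ell) \in p\}}.$$

Note that $\sum_{p \in \mathcal{P}} \chi_{\{(j,\ell) \in p\}} = 0$ when a link $(j, \ell)$ is not used by any of the paths in $\mathcal{P}$. Thus, the value $K(\mathcal{P})$ is equivalently given by:

$$K(\mathcal{P}) = \max_{(j,\ell) \in \mathcal{E}_{\mathcal{P}}} \sum_{p \in \mathcal{P}} \chi_{\{(j,\ell) \in p\}}.$$

Consider the quantity $\sum_{(j,\ell) \in \mathcal{E}} \|z_j - z_\ell\|^2$. Since $\mathcal{E}_{\mathcal{P}} \subseteq \mathcal{E}$, it follows that

$$\sum_{(j,\ell) \in \mathcal{E}} \|z_j - z_\ell\|^2 \ge \sum_{(j,\ell) \in \mathcal{E}_{\mathcal{P}}} \|z_j - z_\ell\|^2.$$

Since $K(\mathcal{P})$ is the maximal edge-utility with respect to the paths in the collection $\mathcal{P}$, it follows that $\sum_{p \in \mathcal{P}} \chi_{\{(j,\ell) \in p\}} \le K(\mathcal{P})$ for any link $(j, \ell)$ which is used by a path in $\mathcal{P}$. Hence,

$$\sum_{(j,\ell) \in \mathcal{E}} \|z_j - z_\ell\|^2 \ge \frac{1}{K(\mathcal{P})} \sum_{(j,\ell) \in \mathcal{E}_{\mathcal{P}}} \Big(\sum_{p \in \mathcal{P}} \chi_{\{(j,\ell) \in p\}}\Big) \|z_j - z_\ell\|^2. \tag{17}$$

Note that the sum on the right hand side in the preceding relation is taken over all links $(j, \ell)$ in $\mathcal{E}_{\mathcal{P}}$ with the multiplicity with which every link $(j, \ell)$ is used in the shortest path covering $\mathcal{P}$. Thus, it can be written in terms of the paths in $\mathcal{P}$ which are connecting distinct nodes $j, \ell \in [m]$, as follows:

$$\sum_{(a,b) \in \mathcal{E}_{\mathcal{P}}} \Big(\sum_{p \in \mathcal{P}} \chi_{\{(a,b) \in p\}}\Big) \|z_a - z_b\|^2 = \sum_{j=1}^m \sum_{\ell \ne j} \sum_{(a,b) \in p_{j\ell}} \|z_a - z_b\|^2. \tag{18}$$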