I-A Motivation: Multi-agent decision making over networks
In multi-agent decision making over networks, the decision makers, in short agents, share their information only with a selected number of other agents. In particular, the agents’ state (or decision) is the result of a local decision making process, e.g., a constrained optimization problem, and of a distributed communication with the neighboring agents, defined by the communication network. In many problems, the goal of the agents is to reach a collective equilibrium state, where no agent can benefit from changing its state. The local interaction between the agents is exploited in opinion dynamics to model the evolution of a population’s collective opinion as an emerging phenomenon of the local interactions, see [1, 2, 3]. Another interesting consequence of the communication structure is that the agents keep their own data private, exchanging information only with selected agents. This characteristic is of particular interest in, for example, traffic and information network problems or in the charging scheduling of electric vehicles [5, 6]. This class of problems also arises in other applications, e.g., in smart grids [7, 8] and sensor networks.
I-B Literature overview: Multi-agent optimization and multi-agent network games
In this work, we study a particular instance of the problem introduced above, namely a multi-agent network game, where the communication network and the constraints between the agents are both time-varying. Multi-agent network games arise from the well established field of distributed optimization and equilibrium seeking over networks. In the past years, several results were proposed for optimization problems subject to a time-varying communication network: in  the subgradients of the cost functions are bounded and the communication is described by a strongly connected sequence of directed graphs, while in  the cost functions are assumed to be continuously differentiable and a linearly convergent algorithm is designed under the assumption of a time-varying undirected communication network. Another approach, explored in , is to construct a game whose emerging behavior solves the optimization problem. In this case, the cost functions are differentiable and the communication is ruled by an undirected time-varying graph that is connected over time.
The problem of noncooperative multi-agent games subject to coupling constraints was first studied in , under the assumptions of continuously differentiable cost functions and no network structure between the agents. In the past years, several researchers focused on this class of problems, providing many results for games over networks, e.g., in [15, 16, 5], where the communication network is always assumed undirected, while the cost functions are chosen either differentiable or continuously differentiable. Moreover, some authors also focused on the class of noncooperative games over time-varying communication networks, in particular on the unconstrained case. For example, in  differentiable and strictly convex cost functions with Lipschitz continuous gradient were considered, where the sequence of time-varying communication networks was repeatedly strongly connected, and the associated adjacency matrices doubly stochastic.
I-C Paper contribution
A complete formulation of multi-agent network games, subject to proximal type dynamics, can be found in  where the unconstrained case is studied for a time-varying strongly connected communication network, described by a doubly stochastic adjacency matrix. In [18, 19], the condition on the double stochasticity of the adjacency matrix was relaxed, in the first case by means of a dwell time. Notice that these types of games can also be rephrased as paracontractions; in this framework, the work in  provided convergence for repeatedly jointly connected digraphs. Iterative equilibrium seeking algorithms were developed for constrained multi-agent network games in [8, 19] under the assumption of a static communication network.
In this work, we address the problem of constrained multi-agent network games subject to a time-varying communication network. In particular, we first discuss the convergence of the game and motivate the technical assumption needed to ensure the existence of an equilibrium, and then we develop an equilibrium seeking algorithm that achieves global convergence for the game at hand. The main difference with the work in  is the presence of both a time-varying communication network and time-varying constraints; this generalization leads to several technical challenges, requiring a more involved convergence analysis.
II-A Basic notation
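The sets of real, positive, and non-negative numbers are denoted by $\mathbb{R}$, $\mathbb{R}_{>0}$, and $\mathbb{R}_{\geq 0}$, respectively; $\overline{\mathbb{R}} := \mathbb{R} \cup \{\infty\}$. The set of natural numbers is denoted by $\mathbb{N}$. For a square matrix $A \in \mathbb{R}^{n \times n}$, its transpose is denoted by $A^\top$, $[A]_{i}$ denotes the $i$-th row of the matrix, and $[A]_{ij}$ the element in the $i$-th row and $j$-th column. Also, $A \succ 0$ ($A \succeq 0$) stands for a symmetric and positive definite (semidefinite) matrix, while $>$ ($\geq$) describes an element wise inequality. $A \otimes B$ is the Kronecker product of the matrices $A$ and $B$. The identity matrix is denoted by $I_n \in \mathbb{R}^{n \times n}$, and $\mathbf{0}$ ($\mathbf{1}$) represents the vector/matrix with only $0$ ($1$) elements. For $x_1, \dots, x_N \in \mathbb{R}^n$ and $\mathcal{N} := \{1, \dots, N\}$, the collective vector is denoted as $x := \mathrm{col}((x_i)_{i \in \mathcal{N}}) = [x_1^\top, \dots, x_N^\top]^\top$ and $x_{-i} := \mathrm{col}((x_j)_{j \in \mathcal{N} \setminus \{i\}})$. Given the operators $A_1, \dots, A_N$, $\mathrm{diag}(A_1, \dots, A_N)$ denotes a block-diagonal operator with $A_1, \dots, A_N$ as diagonal elements. The Cartesian product of the sets $\Omega_1, \dots, \Omega_N$ is described by $\prod_{i=1}^{N} \Omega_i$. Given two vectors $x, y \in \mathbb{R}^n$ and a symmetric and positive definite matrix $Q \succ 0$, the weighted inner product and norm are denoted by $\langle x \mid y \rangle_Q := x^\top Q y$ and $\|x\|_Q := \sqrt{\langle x \mid x \rangle_Q}$, respectively; the induced matrix norm is denoted by $\|A\|_Q$. A real $n$-dimensional Hilbert space obtained by endowing $\mathbb{R}^n$ with the product $\langle \cdot \mid \cdot \rangle_Q$ is denoted by $\mathcal{H}_Q$.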
II-B Operator-theoretic notations and definitions
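The identity operator is defined by $\mathrm{Id}(\cdot)$. The indicator function $\iota_{\mathcal{C}}: \mathbb{R}^n \to [0, +\infty]$ of $\mathcal{C} \subseteq \mathbb{R}^n$ is defined as $\iota_{\mathcal{C}}(x) = 0$ if $x \in \mathcal{C}$; $+\infty$ otherwise. The set valued mapping $N_{\mathcal{C}}: \mathbb{R}^n \rightrightarrows \mathbb{R}^n$ stands for the normal cone to the set $\mathcal{C} \subseteq \mathbb{R}^n$, that is $N_{\mathcal{C}}(x) = \{u \in \mathbb{R}^n \mid \sup_{z \in \mathcal{C}} u^\top (z - x) \leq 0\}$ if $x \in \mathcal{C}$ and $\varnothing$ otherwise. The graph of a set valued mapping $\mathcal{A}: \mathcal{X} \rightrightarrows \mathcal{Y}$ is $\mathrm{gra}(\mathcal{A}) := \{(x, u) \in \mathcal{X} \times \mathcal{Y} \mid u \in \mathcal{A}(x)\}$. For a function $f: \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$, define $\mathrm{dom}(f) := \{x \in \mathbb{R}^n \mid f(x) < +\infty\}$ and its subdifferential set-valued mapping, $\partial f: \mathrm{dom}(f) \rightrightarrows \mathbb{R}^n$, $\partial f(x) := \{u \in \mathbb{R}^n \mid f(z) \geq f(x) + u^\top (z - x),\ \forall z \in \mathrm{dom}(f)\}$. The projection operator over a closed set $\mathcal{C} \subseteq \mathbb{R}^n$ is $\mathrm{proj}_{\mathcal{C}}: \mathbb{R}^n \to \mathcal{C}$ and it is defined as $\mathrm{proj}_{\mathcal{C}}(x) := \mathrm{argmin}_{y \in \mathcal{C}} \|y - x\|^2$. The proximal operator $\mathrm{prox}_f: \mathbb{R}^n \to \mathrm{dom}(f)$ is defined by $\mathrm{prox}_f(x) := \mathrm{argmin}_{y \in \mathbb{R}^n} f(y) + \tfrac{1}{2} \|x - y\|^2$. A set valued mapping $\mathcal{F}: \mathbb{R}^n \rightrightarrows \mathbb{R}^n$ is $\ell$-Lipschitz continuous with $\ell > 0$, if $\|u - v\| \leq \ell \|x - y\|$ for all $(x, u), (y, v) \in \mathrm{gra}(\mathcal{F})$; $\mathcal{F}$ is (strictly) monotone if for all $(x, u), (y, v) \in \mathrm{gra}(\mathcal{F})$ $(u - v)^\top (x - y) \geq (>)\ 0$ holds, and maximally monotone if there is no monotone operator with a graph that strictly contains $\mathrm{gra}(\mathcal{F})$; $\mathcal{F}$ is $\alpha$-strongly monotone if for all $(x, u), (y, v) \in \mathrm{gra}(\mathcal{F})$ it holds $(x - y)^\top (u - v) \geq \alpha \|x - y\|^2$. $\mathrm{J}_{\mathcal{F}} := (\mathrm{Id} + \mathcal{F})^{-1}$ denotes the resolvent mapping of $\mathcal{F}$. Let $\mathrm{fix}(\mathcal{F}) := \{x \in \mathbb{R}^n \mid x \in \mathcal{F}(x)\}$ and $\mathrm{zer}(\mathcal{F}) := \{x \in \mathbb{R}^n \mid 0 \in \mathcal{F}(x)\}$ denote the set of fixed points and zeros of $\mathcal{F}$, respectively. The operator $\mathcal{A}: \mathbb{R}^n \to \mathbb{R}^n$ is $\alpha$-averaged ($\alpha$-AVG) in $\mathcal{H}_Q$, with $\alpha \in (0, 1)$, if $\|\mathcal{A}(x) - \mathcal{A}(y)\|_Q^2 \leq \|x - y\|_Q^2 - \tfrac{1 - \alpha}{\alpha} \|(\mathrm{Id} - \mathcal{A})(x) - (\mathrm{Id} - \mathcal{A})(y)\|_Q^2$, for all $x, y \in \mathbb{R}^n$; $\mathcal{A}$ is nonexpansive (NE) if $1$-AVG; $\mathcal{A}$ is firmly nonexpansive (FNE) if $\tfrac{1}{2}$-AVG; $\mathcal{A}$ is $\beta$-cocoercive if $\beta \mathcal{A}$ is $\tfrac{1}{2}$-AVG (i.e., FNE). The operator $\mathcal{A}$ belongs to the class $\mathfrak{T}$ in $\mathcal{H}_Q$ if and only if $\mathrm{dom}(\mathcal{A}) = \mathbb{R}^n$ and for all $y \in \mathrm{fix}(\mathcal{A})$ and $x \in \mathbb{R}^n$ it holds $\langle y - \mathcal{A}x \mid x - \mathcal{A}x \rangle_Q \leq 0$. Several types of operators belong to this class, e.g., FNE operators and the resolvent of a maximally monotone operator. We refer to  for more properties of operators of class $\mathfrak{T}$.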
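As an illustration of two of the operators defined above, the following is a minimal numerical sketch, not taken from the paper. It assumes the Euclidean norm, a box as the closed convex set, and the absolute value (ℓ1 norm) as the function whose proximal operator is computed; the function names `proj_box`, `prox_abs` and `is_fne_on_samples` are hypothetical. The last function only samples the firm nonexpansiveness inequality ‖Tx − Ty‖² ≤ ⟨Tx − Ty, x − y⟩ on random pairs, so it is a sanity check and not a proof.

```python
import numpy as np

def proj_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^n (a closed convex set)."""
    return np.clip(x, lo, hi)

def prox_abs(x, gamma=1.0):
    """Proximal operator of gamma * ||.||_1, i.e. soft thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - gamma, 0.0)

def is_fne_on_samples(T, dim, trials=200, seed=0):
    """Sample check of ||Tx - Ty||^2 <= <Tx - Ty, x - y> on random pairs."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        x, y = rng.normal(size=dim), rng.normal(size=dim)
        d = T(x) - T(y)
        if d @ d > d @ (x - y) + 1e-12:
            return False
    return True
```

Both the projection and the proximal operator of a convex function are firmly nonexpansive, so the sampled check is expected to pass for either of them.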
III Mathematical setup and problem formulation
III-A Mathematical formulation
We consider players (or agents) taking part in a game. A constrained network game is defined by three main components: the constraints each player has to satisfy, the cost functions to be minimized and the communication network.
The constraints can be divided in two types: local and coupling. At every time instant , each agent adopts an action (or strategy) belonging to its local feasible set , i.e., the collection of those strategies meeting its local constraints. We assume that this set is convex and closed.
Standing Assumption 1 (Convexity)
For every , the set is non-empty, compact and convex.
The agents are also subject to time-varying affine and separable coupling constraints, that generate an entanglement between the strategy chosen by player and those of the others. For an agent , at time instant , the time-varying set of strategies satisfying the coupling constraints, given the other agents’ strategies , reads as
where and .
In the following, we refer to the collective vector as the strategy profile of the game. All the strategies profiles that satisfy both the local and coupling constraints determine the collective feasible decision set, defined as
where and .
Standing Assumption 2
For all and , the collective feasible decision set satisfies Slater’s condition.
All the players in the network are assumed myopic and rational, and thus each agent aims only at minimizing its local cost function . The myopic nature of the agents is reflected in the argument of the cost function, which depends only on the current strategies of the players (as we will clarify in the following). In this work, we assume that the cost functions have a proximal structure, as defined next.
Standing Assumption 3 (Proximal cost functions)
For all , the function is defined as
where the function is convex and lower semi-continuous.
The cost function is composed of two parts, is the local part and has a double role: describing the local objective of agent , via , and ensuring that the next strategy belongs to , through the indicator function . The quadratic part of works as a regularization term and penalizes the distance of the local strategy from . It is also responsible for the strict-convexity of , even though is only lower semi-continuous, see [22, Th. 27.23].
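To make the proximal structure concrete, here is a minimal sketch under assumptions that are not in the paper: the local objective is taken as the quadratic (q/2)‖y − c‖², the local feasible set is a box [lo, hi]^n, and the second argument of the cost is a given vector `z`. The names `cost` and `best_response` are hypothetical. Because this particular cost is separable across coordinates, its minimizer over the box is obtained by clipping the unconstrained minimizer.

```python
import numpy as np

def cost(y, z, q, c, lo, hi):
    """Local objective + indicator of the box + quadratic regularization."""
    if np.any(y < lo) or np.any(y > hi):
        return np.inf  # indicator function: infeasible strategies cost +inf
    return 0.5 * q * np.sum((y - c) ** 2) + 0.5 * np.sum((y - z) ** 2)

def best_response(z, q, c, lo, hi):
    """Minimizer of cost(., z): clip the unconstrained minimizer to the box."""
    y_free = (q * c + z) / (1.0 + q)  # stationarity of the smooth part
    return np.clip(y_free, lo, hi)
```

The quadratic term keeps the minimizer unique even when the local objective is merely convex, which is the role attributed to it in the text.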
Before providing a formal description of the second argument in the cost function, let us introduce the time-varying communication network adopted by the agents. We assume that, at each time instant , it is described by a strongly connected digraph, defined via the couple . The set represents the nodes of the graph that are the players in the game, i.e., , so this set does not vary over time. The matrix denotes the adjacency matrix of the digraph, at time , where . For every , is the weight that agent assigns to the strategy of agent . If , then agent does not communicate with agent . The set of all the neighbors of agent is defined as . The following assumption formalizes the properties of the adjacency matrix required throughout this work.
Standing Assumption 4 (Row stochasticity and self-loops)
At every time instant , the communication graph is strongly connected. The matrix is row stochastic, i.e., for all , and , for all . Moreover, has strictly-positive diagonal elements, i.e., .
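The conditions of Standing Assumption 4 can be checked numerically for a given adjacency matrix. The following is a minimal sketch; the function name `satisfies_assumption_4` and the tolerance are hypothetical choices. Strong connectivity is tested through the classical fact that a digraph with n nodes is strongly connected if and only if (I + P)^(n−1) is entrywise positive, where P is the zero/one pattern of the adjacency matrix.

```python
import numpy as np

def satisfies_assumption_4(A, tol=1e-9):
    """Check row stochasticity, positive diagonal and strong connectivity."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    nonneg = np.all(A >= 0)
    row_stochastic = np.allclose(A.sum(axis=1), 1.0, atol=tol)
    self_loops = np.all(np.diag(A) > 0)
    # (I + pattern)^(n-1) has a positive (i, j) entry iff j is reachable from i
    pattern = (A > 0).astype(float) + np.eye(n)
    reach = np.linalg.matrix_power(pattern, n - 1)
    strongly_connected = np.all(reach > 0)
    return bool(nonneg and row_stochastic and self_loops and strongly_connected)
```

For a time-varying network, the check would simply be repeated for the matrix adopted at each time instant.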
For each agent , the term in (2) represents an aggregative quantity defined by
and hence it is the average of the neighbors’ strategies, weighted via the adjacency matrix . So, the actual cost function of agent at time is .
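The aggregative quantity described above is a weighted average of strategies, so for all agents at once it amounts to a matrix product. The sketch below assumes the strategies are stored one agent per row in an array `X`; this layout and the function name `aggregates` are illustrative choices, not the paper's notation.

```python
import numpy as np

def aggregates(A, X):
    """Row i of the result is sum_j a_ij * x_j (X holds one strategy per row)."""
    return np.asarray(A, dtype=float) @ np.asarray(X, dtype=float)

A = np.array([[0.5, 0.5], [0.25, 0.75]])
X = np.array([[0.0, 0.0], [4.0, 8.0]])
sigma = aggregates(A, X)

# Equivalent stacked form: (A kron I_n) applied to the collective vector
sigma_stacked = np.kron(A, np.eye(X.shape[1])) @ X.reshape(-1)
```

The stacked form is the one that matches a collective-vector description of the game, while the row-wise form is usually more convenient in code.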
As mentioned before, the agents are considered rational, thus their only objective is to minimize their local cost function, while satisfying the local and coupling constraints. The dynamics describing this behavior are the myopic best response dynamics, defined, for each player , as:
The interaction of the players, using dynamics (3), can be naturally formalized as a noncooperative network game, defined, for all , as
where we omitted the time dependency of and to ease the notation.
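A minimal simulation sketch of best response dynamics of this kind is given below. It is a simplification made for illustration only: strategies are scalar, coupling constraints are ignored, and each agent's best response is supplied as a function of its aggregate (for instance, a projection onto an interval when the local objective is zero). The names `best_response_dynamics`, `adjacency_at` and `prox_list` are hypothetical.

```python
import numpy as np

def best_response_dynamics(x0, adjacency_at, prox_list, steps):
    """Iterate x_i(k+1) = prox_i(sum_j a_ij(k) x_j(k)) for scalar strategies."""
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for k in range(steps):
        sigma = adjacency_at(k) @ x  # aggregates seen by the agents at time k
        x = np.array([prox_list[i](sigma[i]) for i in range(len(x))])
        traj.append(x.copy())
    return np.array(traj)

# Usage: two agents, static averaging network, local sets [0, 1]
A = np.array([[0.5, 0.5], [0.5, 0.5]])
clip01 = lambda s: float(np.clip(s, 0.0, 1.0))
traj = best_response_dynamics([0.0, 1.0], lambda k: A, [clip01, clip01], steps=5)
```

Passing the adjacency matrix as a function of the time index makes it straightforward to plug in a time-varying network.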
III-B Equilibrium concept and convergence
For the game in (4), the concept of equilibrium point is non-trivial. A popular equilibrium notion for constrained games is the so-called generalized network equilibrium (GNWE). Loosely speaking, a strategy profile is a GNWE of the game if no player can change its strategy to another feasible one while decreasing . Notice that, if does not have self-loops, GNWE boils down to generalized Nash equilibrium, see .
This idea of equilibrium cannot be directly applied to (4) and in fact every variation in the communication network generates a different game, with its own set of GNWE. Therefore, the equilibria in which we are interested are those invariant to the changes in the communication; they take the name of persistent GNWE (p–GNWE).
Definition 1 (persistent GNWE)
A collective vector is a persistent GNWE (p–GNWE) for the game (4), if there exists some , such that for all ,
We have defined both the game and the set of equilibria we are interested in. Let us now elaborate on the convergence properties of the game in (4), providing three examples highlighting different aspects of these dynamics. By means of the first two examples, we show, first that the dynamics in (3) can fail to converge to an equilibrium point, even in the case of a static communication network, where the existence of a GNWE is guaranteed by [8, Prop. 4] and then that the existence of p–GNWE is not guaranteed. Finally, the last example shows a case where the game in (4) converges.
Example 1 (non–convergence)
Consider a 2-player constrained game, where, for , and the local feasible decision set is defined as and does not vary over time. The collective feasible decision set is convex and reads as , hence the game is jointly convex. The dynamics of the game are as in (3), and can be rewritten in closed form as the discrete-time linear system:
which is not globally convergent, e.g., consider .
Example 2 (equilibrium existence)
Consider a 2-player game without local or coupling constraints and scalar strategies. The communication network can vary between the two graphs described respectively by the adjacency matrices and . The cost functions of the agents are in the form of (2), where the local part is chosen as , for . For each one of the communication networks, there exists only one equilibrium point of the game, i.e., and , when respectively or is adopted. Therefore the set of p–GNWE of the game is empty, leading the dynamics to oscillate between and .
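The mechanism behind this example can be reproduced numerically with a hypothetical instance; the matrices and cost functions below are assumptions chosen for illustration and are not the ones used in the example. Each agent has the local objective ½(y − c_i)², whose proximal operator is z ↦ (z + c_i)/2, so the dynamics read x(k+1) = ½(A(k) x(k) + c). For a fixed matrix A, the unique equilibrium is (2I − A)⁻¹c, and it changes with A.

```python
import numpy as np

def equilibrium(A, c):
    """Fixed point of x -> 0.5 * (A @ x + c), i.e. x = (2I - A)^{-1} c."""
    return np.linalg.solve(2.0 * np.eye(len(c)) - A, c)

def alternate(A_list, c, x0, steps):
    """Run x(k+1) = 0.5 * (A(k) x(k) + c), cycling through A_list."""
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for k in range(steps):
        x = 0.5 * (A_list[k % len(A_list)] @ x + c)
        traj.append(x.copy())
    return np.array(traj)

A1 = np.array([[0.5, 0.5], [0.5, 0.5]])  # hypothetical network 1
A2 = np.array([[0.9, 0.1], [0.1, 0.9]])  # hypothetical network 2
c = np.array([0.0, 1.0])                 # hypothetical local targets
eq1, eq2 = equilibrium(A1, c), equilibrium(A2, c)
traj = alternate([A1, A2], c, x0=[0.0, 0.0], steps=200)
```

In this instance the two equilibria differ, so no point is an equilibrium for both networks, and the alternating dynamics settle into a cycle between two distinct points instead of converging.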
Example 3 (convergence)
Once again, consider a 2-player game, where for a player the local feasible set is and . The collective feasible decision set is defined as
where . We choose satisfying Standing Assumption 4 and it is doubly stochastic, for every time instant . If the strategy profile belongs to the consensus subspace , both agents achieve the minimum of their cost function, and therefore all those points are equilibria of the unconstrained game. Furthermore, for the set , it always holds that , and hence they are p-GNWE of the game. Assume that at , , then, for all , the dynamics reduce to , therefore the profile strategy will converge to a point in , i.e., to a p–GNWE of the game.
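The convergence to the consensus subspace can be illustrated with a short simulation. This sketch assumes that, once the constraints are inactive, the dynamics reduce to the linear averaging x(k+1) = A(k) x(k), and it draws a random doubly stochastic 2×2 matrix with positive diagonal at every step; both the sampling range and the function name `consensus_run` are hypothetical.

```python
import numpy as np

def consensus_run(x0, steps, seed=0):
    """Iterate x(k+1) = A(k) x(k) with random doubly stochastic 2x2 matrices."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        a = rng.uniform(0.1, 0.4)  # off-diagonal weight, keeps self-loops positive
        A = np.array([[1.0 - a, a], [a, 1.0 - a]])
        x = A @ x
    return x

x_final = consensus_run([1.0, -3.0], steps=100)
```

Double stochasticity preserves the average of the strategies, so in this sketch the limit point on the consensus subspace is the average of the initial strategies.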
III-C Primal–dual characterization
As illustrated in Example 1, the myopic constrained dynamics in (3) can fail to converge, and thus we recast them as pseudo collaborative ones. The idea is that each player will minimize its own cost function, while at the same time coordinate with the others to satisfy the constraints. With this approach, we aim to achieve asymptotic fulfillment of the coupling constraints. As a first step, we dualize the dynamics introducing, for each player , a dual variable . The arising problem is an auxiliary (extended) network game, see [23, Ch. 3]. The collective vector of the dual variables is denoted by . The equilibrium concept is adapted to this modification in the dynamics, so we define the persistent Extended Network Equilibrium (p–ENWE).
Definition 2 (persistent Extended Network Equilibrium)
The pair , is a p–ENWE for the game in (4) if there exists such that, for every ,
In the following, we assume the presence of a central coordinator facilitating the synchronization between agents. This approach, which is widely used in the literature, aligns with the new pseudo-collaborative behavior of the agents. The central coordinator broadcasts an auxiliary variable to each agent , that, in turn, uses this information to compute its local dual variable . Specifically, at every time instant , the agent scales the received variable , by a possibly time-varying factor , attaining in this way its local dual variable, i.e., . The scaling factors describe how the burden of satisfying the constraints is divided between the agents, hence . If , for all , then the effort to satisfy the coupling constraints is evenly split between the agents; this case is considered in several works, e.g., [5, 15, 24]. This class of problems was introduced for the first time in the seminal work by Rosen , where the author formulates the concept of normalized equilibrium. We adapt this idea to the problem at hand, defining the persistent normalized extended network equilibrium (pn–ENWE).
Definition 3 (persistent normalized-ENWE)
The pair , is a pn–ENWE for the game in (4), if there exists , such that for all it satisfies
The following lemma shows that a pn–ENWE is also a p–GNWE, and vice versa.
Lemma 1 (p–GNWE as fixed point)
We omit the proof of the lemma, since it is analogous to that in [8, Lem. 2].
III-D On the existence of persistent equilibria
We devote the remainder of the section to a more in-depth analysis of the existence of a p–GNWE for the game in (4). In general, there is no guarantee that such an equilibrium exists, as shown in Example 2. The literature dealing with similar problems is split on how to handle this issue. Namely, two possible assumptions can be adopted to proceed with the analysis. The first one supposes a priori the existence of at least one p–GNWE in the game. This assumption does not restrict the problem at hand, since the convergence can be established only for the cases in which it is satisfied. However, it can be difficult to check if this assumption holds in practice. This approach is the one chosen in this work and it is usually adopted when the focus is more on theoretical results, see [22, Cor. 5.19], [26, Prop. 3.1], [8, Ass. 3] and [19, Ass. 6].
Standing Assumption 5 (Existence of a pn-ENWE)
The set of pn-ENWE of (4) is non-empty, hence .
On the other hand, the second assumption considers only those games in which the local cost functions share at least one common fixed point. This implies that at least one point in the consensus subspace is an equilibrium invariant to the change of the communication network. If, at the same time, this point is also feasible, then it is a p–GNWE of the game. This assumption is clearly stronger than the previous one. Nevertheless, it is easier to verify in practice, since it only requires the analysis of the cost functions of the agents, as shown in Example 3. Mainly for this reason, it is widespread in the literature, where it is either implicitly verified as in  or explicitly required [20, Ass in Th. 2].
IV Convergence result
Next, we present the main result of this paper, an iterative and decentralized algorithm converging to a pn–ENWE of the game in (4). We call it TV–Prox–GNWE and it is reported in (9a)–(9d), while its complete derivation is described in the Appendix.
In order to provide the bounds for the choices of the parameters in the algorithm, let us redefine the matrix via a diagonal matrix, an upper and a lower triangular matrix, i.e., , where and always have zero diagonal elements. For each time instant , the parameters in TV–Prox–GNWE are set such that the following inequalities hold:
where and , with being the -th element of the left Perron-Frobenius eigenvector of . Also in this case, we omitted the time dependency of the matrices to ease the notation. The bounds in (10c) – (10d) implicitly lead to a condition on the maximum value of the step size , namely .
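The two matrix quantities used in these bounds, the diagonal/triangular splitting and the left Perron-Frobenius eigenvector, can be computed as in the following minimal sketch. It assumes that the matrix being split is the row stochastic adjacency matrix and that the eigenvector is normalized so that its entries sum to one; both are assumptions of this illustration, and the function names are hypothetical.

```python
import numpy as np

def split_diag_triangular(A):
    """Return (D, U, L) with A = D + U + L; U and L have zero diagonals."""
    A = np.asarray(A, dtype=float)
    return np.diag(np.diag(A)), np.triu(A, k=1), np.tril(A, k=-1)

def left_perron_vector(A):
    """Left eigenvector of A for the eigenvalue 1, scaled to sum to one."""
    w, V = np.linalg.eig(np.asarray(A, dtype=float).T)
    idx = np.argmin(np.abs(w - 1.0))  # row stochastic => 1 is an eigenvalue
    q = np.real(V[:, idx])
    return q / q.sum()  # fixes both the scale and the sign

A = np.array([[0.5, 0.5], [0.2, 0.8]])
D, U, L = split_diag_triangular(A)
q = left_perron_vector(A)
```

For a strongly connected graph the eigenvalue 1 is simple and the associated left eigenvector has strictly positive entries, so the normalization by the sum is well defined.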
The TV–Prox–GNWE in (9) is composed of three main steps: a proximal gradient descent, performed by every agent (9a), a dual ascent performed by the central coordinator (9b), and a correction step, in (9c) – (9d), to balance the asymmetry of the weights in the directed network, i.e., .
The main technical result of the paper is the following theorem, where we establish global convergence of the sequence generated by the TV–Prox–GNWE to a p-GNWE of the game in (4).
In this section, we adopt TV–Prox–GNWE to solve a problem of constrained consensus. We consider a game with agents, where the strategy of every agent is , and its local feasible decision set is , with and randomly drawn respectively from and . The local cost function is equal to . The adjacency matrices, describing the communication network at every time instant , are randomly generated and define digraphs of the small-world type, satisfying Standing Assumption 4. The coupling constraints are used to force the strategies towards the consensus subspace and are in the form , for every , where and it is decreasing over time. Notice that in this case the multiplier graph is complete, see . Finally, the parameters of the algorithm are chosen such that they always satisfy (10).
The trajectory of the strategy profile generated by TV–Prox–GNWE converges to the consensus subspace; this is shown in Fig. 1, by means of the Laplacian matrix of the multiplier graph. The initial strategy profile is randomly chosen in . As expected from the result in Theorem 1, the constraints are satisfied asymptotically, see Fig. 2.
VI Conclusion and outlook
In multi-agent network games subject to time-varying coupling constraints and a time-varying communication network, described by strongly connected digraphs, agents can fail to converge when they adopt proximal dynamics. Nevertheless, we developed an iterative equilibrium seeking algorithm (TV–Prox–GNWE) that ensures the global convergence of the agents’ strategies to a normalized equilibrium of the game, when it exists.
One of the most important open questions in this type of problem regards the existence of an equilibrium point. This work can be improved with a new assumption for the equilibrium existence, which is general and easy to check.
-A Algorithm derivation
In this section, we present the complete derivation of the iterative algorithm that we called TV–Prox–GNWE. We divide the derivation into two main steps: the equilibria reformulation and the modified proximal point algorithm.
-A1 Equilibria reformulation
The set of pn–ENWE, defined by the two equalities in (8), can be equivalently rephrased as the set of fixed points of a suitable mapping. First, we introduce the block-diagonal proximal operator
In (8), the first equality is equivalent to
where and . The second equality holds true if and only if .
In order to describe via operators these two relations, we define the static mappings
and the time-varying affine one as
As a result, the dynamics of the game can be written as
We exploit this new compact form to describe the set of pn–ENWE via the fixed points of . In particular, by Definition 3, a pair is a pn–ENWE of the game in (4) if and only if . Furthermore, from Lemma 1, we also know that a pn–ENWE is a p–GNWE of the original game. So, we focus on the design of an algorithm converging to the subset for which we can take advantage of this new formulation.
A useful tool to solve a fixed point seeking problem is to reformulate it as a zero finding problem, as done in the next lemma, see [28, Ch. 26].
Lemma 2 ([28, Prop. 26.1 (iv)])
Let , with . Then,
-A2 Modified proximal point algorithm
We describe in detail the steps to develop the iterative algorithm solving the zero finding problem associated to the operator , and, as a consequence, the original one of finding a pn–ENWE of (4). We adopt a modified version of the proximal point algorithm (PPP) (see [28, Prop. 23.39] for its standard formulation). In particular, the update rule is a preconditioned version of the PPP algorithm proposed in [29, Eq. 4.18]. After defining and , it can be rewritten as
where and is the step–size of the algorithm and . The preconditioning matrix is chosen as
where and . The self-adjoint and skew symmetric components are defined as and . Due to the non symmetric preconditioning the resolvent operator takes the form
The parameters and in the preconditioning have to be chosen such that and . This can be done via the Gerschgorin Circle Theorem for partitioned matrices (more stringent but more involved bounds can be obtained via [30, Th. 2.1]). The resulting bounds are reported in (10).
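The two ingredients discussed above, a Gershgorin-type test on the preconditioning matrix and a preconditioned proximal point iteration, can be sketched on a toy problem. The sketch is a simplification and not the algorithm of the paper: it uses the scalar (not partitioned) Gershgorin circle theorem, a symmetric preconditioner `Phi` instead of the non-symmetric one used here, and a linear monotone operator F(x) = Mx − b chosen only for illustration. All names and numerical values are hypothetical.

```python
import numpy as np

def gershgorin_lower_bound(P):
    """Smallest left end of the Gershgorin discs of P; if positive and P is
    symmetric, then P is positive definite."""
    P = np.asarray(P, dtype=float)
    radii = np.sum(np.abs(P), axis=1) - np.abs(np.diag(P))
    return float(np.min(np.diag(P) - radii))

def preconditioned_ppa(M, b, Phi, x0, steps=200, relax=1.0):
    """Relaxed iteration x+ = x + relax * (J(x) - x), where J is the resolvent
    of Phi^{-1} F with F(x) = M x - b, i.e. J(x) solves (Phi + M) y = Phi x + b."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        y = np.linalg.solve(Phi + M, Phi @ x + b)
        x = x + relax * (y - x)
    return x

M = np.array([[1.0, 2.0], [-2.0, 1.0]])    # monotone: symmetric part is I
x_star = np.array([1.0, -1.0])
b = M @ x_star                              # so that F(x_star) = 0
Phi = np.array([[2.0, 0.5], [0.5, 2.0]])   # symmetric preconditioner
bound = gershgorin_lower_bound(Phi)
x_out = preconditioned_ppa(M, b, Phi, x0=[0.0, 0.0])
```

A positive Gershgorin bound is a sufficient, generally conservative, certificate of positive definiteness, which is why parameter bounds obtained this way are easy to check but not tight.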
Finally, we are ready for the complete derivation of the algorithm by explicitly computing the local update rules of the agents and of the central coordinator. We omit the time dependency in the following formulas.
First, we focus on (15a), so
By solving the first row block of (17d), i.e. , we obtain
Let us define, with a small abuse of notation, the matrix , then we attain
The second row block instead reads as , and leads to
-B Convergence proof of TV–Prox–GNWE
In order to simplify the proofs proposed in the following, let us introduce some useful definitions that will be adopted throughout the whole section. We define the two scalars and , the former is the Lipschitz constant of and the latter is such that , this also implies that . Next, we define the time-varying matrix and the scalars , and . Notice that, without loss of generality, we can always choose the normalized version of the left Perron-Frobenius eigenvector of the matrix , so .
In the following, we also omit the time dependency of the operators when this does not lead to ambiguities. The proofs follow similar steps to the ones in [29, Prop.2.1 and 4.2], where the case of a static communication network is considered.
For all , consider the time-varying operator
then the following hold:
is quasi-nonexpansive in the space ,
if , then .