Dating back to the seminal work by Monderer and Shapley [monderer1996potential], potential games represent a broad class of noncooperative games characterized by the existence of a real-valued function, namely, the potential function, such that any collective strategy profile minimizing this function is a Nash equilibrium of the game. Potential games hence provide a natural means to model many control-theoretic applications [Mar09] such as routing [Orda93], complex social networks [Staudigl:2011] and Cournot competition [ConKluKraw04].
In this paper, we consider ordinal potential games in which part of the decision variables of the agents are constrained to assume integer values. Such MI games have been recently proposed as strategic models for the distributed coordination of autonomous vehicles [fabiani2018mixed, fabiani2019multi], transportation and traffic control [cenedese2021highway], and smart grids [Vujanic:2016wr, cenedese2019charging]. Also in market games [Gabriel:2013us, Sag17] and combinatorial congestion games [Rosenthal:1973wg, KleSch21], MI restrictions are often encountered since, for example, a prescribed good shall be produced in fixed proportions, or packets have to be sent in integral units.
A crucial issue affecting nonconvex games in general is to establish the existence of an equilibrium. Thus, several approaches have been proposed to overcome this intrinsic technical challenge for NEP with MI variables. As an example, [harks2021generalized] constructed a set of convexified instances for the MI game to make connections between (generalized) MI-NE of the original game and its convexification. Within this approach, one can indeed derive existence certificates for specific classes of games via duality theory.
In this paper, we circumvent the existence issue by focusing on a class of ordinal potential games for which existence results can be obtained by assuming that a certain master problem admits an integer-feasible solution. For such a restricted, but practically relevant, MI game-theoretic setting, we present distributed equilibrium seeking algorithms within a proximal-based framework. In particular, we consider two scenarios: i) the underlying game is generalized ordinal and the agents update their control variables iteratively by choosing an exact proximal BR strategy; ii) the game admits an exact potential function but we allow agents to choose an inexact proximal BR for their updates.
Similar to Bregman versions of proximal algorithms [DvuShtStaSurvey21], we formulate the proximal best-response function using a class of norm-like regularizers, known in the MI optimization community as ICRF. Instead of standard quadratic regularizations, we choose ICRF as penalty functions since we believe they could provide us with a means to extend our framework to continuous reformulations of the MI optimization subproblems. We leave this topic for future research. By exploiting the properties of the ICRF, acting as penalty terms in the individual agent’s BR problems, we prove that both proposed algorithms enjoy convergence guarantees to a feasible equilibrium solution of the MI-NEP. Specifically, in the first scenario the computed MI-NE is exact, while in the second one the algorithm returns an approximate MI-NE.
I-A Related literature
Mixed-integer games constitute a rather new class of strategic optimization problems. Consequently, there are not many general-purpose solution techniques available. To the best of our knowledge, this work represents a first attempt at proposing proximal-like distributed algorithms for a MI game setting. The only alternative applicable algorithm for MI Nash games is the Gauss–Southwell iteration designed in [sagratella2017algorithms]. Given the practical relevance of MI games, this is rather surprising, and in stark contrast with the Nash equilibrium seeking problem with continuous action sets, for which a whole arsenal of numerical solution techniques is available [dreves2011solution, mertikopoulos2017convergence], and where proximal BR-based algorithms have been extensively studied in both stochastic and deterministic Nash games [Scu14, Lei:2019vj]. All these schemes, however, leverage the VI reformulation of NEP [facchinei2007generalized], and thus require the strong (or strict) monotonicity of the VI, assumptions that cannot be structurally satisfied in games with MI variables. The algorithms we develop, in particular our adaptive update of the penalty parameter regulating the proximal BR, are strongly inspired by the Gauss–Seidel iteration proposed in [facchinei2011decomposition]. As main contributions, we hence show i) how a proximal-like BR scheme with ICRF can be used to compute Nash equilibria satisfying MI restrictions, and ii) that convergence to an approximate equilibrium holds even under inexact computations of BR strategies.
II Problem formulation and preliminaries
II-A Nash equilibrium problems with mixed-integer variables
Let be the set indexing the agents taking part in the noncooperative game , where each one of them controls both continuous and integer variables , belonging to a compact and nonempty, private action set . Given cost functions with and , each agent aims at solving the MI optimization problem
where . Given the strategies of the other agents, , the MI BR of agent is defined as the following set-valued mapping
Our goal is to design distributed algorithms, i.e., a sequence of steps alternating communication and computation tasks among the agents, able to drive the set of agents towards an MI-NE of the game , according to Definition 1.
(Mixed-integer -Nash equilibrium) Given some , a strategy profile is an -approximate MI-NE (or -MI-NE) of the game if, for all ,
If , then we call an exact MI-NE.
Definition 1 points out that a MI-NE of the game (if it exists) is achieved when all the agents adopt a BR strategy.
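To make the equilibrium notion concrete, the following minimal sketch checks the usual approximate-equilibrium condition (no agent can lower its cost by more than a tolerance through a unilateral deviation) by plain enumeration. The two-agent toy game, its cost function, the coarse grid standing in for the continuous variable, and all names (`is_eps_nash`, `cost`, `grid`) are assumptions made for illustration and are not taken from the paper.

```python
from itertools import product

def is_eps_nash(costs, action_sets, profile, eps=0.0):
    """True if no agent can reduce its cost by more than eps by deviating alone."""
    for i, actions in enumerate(action_sets):
        current = costs[i](profile)
        for a in actions:
            deviation = profile[:i] + (a,) + profile[i + 1:]
            if costs[i](deviation) < current - eps:
                return False
    return True

# Hypothetical two-agent game: each action is (integer part, continuous part),
# with the continuous part restricted to a coarse grid so that we can enumerate.
grid = [0.0, 0.5, 1.0]
actions = [(z, y) for z in (0, 1, 2) for y in grid]
action_sets = [actions, actions]

def cost(i):
    def J(profile):
        zi, yi = profile[i]
        zj, _ = profile[1 - i]
        return (zi - 1) ** 2 + (yi - 0.5) ** 2 + 0.5 * zi * zj
    return J

costs = [cost(0), cost(1)]

# Exact equilibria (eps = 0) found by exhaustive search over all profiles.
equilibria = [p for p in product(*action_sets)
              if is_eps_nash(costs, action_sets, p)]
```

With a positive tolerance the set of admissible profiles grows: for instance, the profile `((0, 0.5), (2, 0.5))` fails the exact test but passes with `eps=1.0`.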
II-B Generalized ordinal and exact potential games
Classical existence theorems for Nash equilibrium require continuity of the agents’ cost functions and compactness as well as convexity of the feasible sets [facchinei2007generalized]. Since the NEP we are facing in (1) is nonconvex, existence of Nash equilibrium is, in principle, not guaranteed. Therefore, we will focus on a broad class of Nash games for which existence of solutions can be guaranteed under certain assumptions, namely the class of exact and generalized ordinal potential games.
(Potential game) A game is called
exact potential game [monderer1996potential] if there exists a continuous function such that, for all ,
for all and , ;
generalized ordinal potential game [facchinei2011decomposition] if there exists a forcing function such that, for all , , and ,
We remark that any exact potential game is an ordinal potential game. By exploiting the tight relation between first-order information of the potential function and the local cost functions of the agents, it is well-known that potential functions can be employed in the construction of a suitable master problem facilitating the computation of equilibria.
[sagratella2017algorithms, Th. 2] Let be an ordinal potential function for the game . Given some , any -approximate solution of the optimization problem
yields an -approximate MI-NE of .
Standing Assumption 1
The master problem (4) admits a solution, i.e., there exists such that for all .
This assumption guarantees that the game admits at least one Nash equilibrium point in the nonconvex domain . Clearly, the solutions to the master problem (4) may not contain all possible Nash equilibria of the game [sagratella2016computing, Ex. 1].
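The role of the master problem can be illustrated on a small instance. The sketch below is only a brute-force illustration under stated assumptions: the two-agent costs, the hand-built potential, and the gridded continuous variable are hypothetical, and exhaustive enumeration stands in for whatever MI solver one would actually use. It minimizes the potential over all joint profiles and then measures the largest cost reduction any agent could obtain by deviating alone.

```python
from itertools import product

Z = (0, 1, 2)            # integer decisions
Y = (0.0, 0.5, 1.0)      # continuous decisions on a coarse grid (assumption)
ACTIONS = [(z, y) for z in Z for y in Y]

def cost(i, profile):
    """Hypothetical cost of agent i in a two-agent game."""
    (zi, yi), (zj, yj) = profile[i], profile[1 - i]
    return (zi - 1) ** 2 + (yi - 0.5) ** 2 + 0.5 * (zi * zj + yi * yj)

def potential(profile):
    """Candidate exact potential: own terms plus the coupling counted once."""
    (z0, y0), (z1, y1) = profile
    own = sum((z - 1) ** 2 + (y - 0.5) ** 2 for z, y in profile)
    return own + 0.5 * (z0 * z1 + y0 * y1)

# "Master problem": minimize the potential over the joint mixed-integer set.
master_solution = min(product(ACTIONS, repeat=2), key=potential)

def best_unilateral_gain(profile):
    """Largest cost reduction achievable by a single deviating agent."""
    gains = []
    for i in range(2):
        for a in ACTIONS:
            deviation = profile[:i] + (a,) + profile[i + 1:]
            gains.append(cost(i, profile) - cost(i, deviation))
    return max(gains)
```

A non-positive value of `best_unilateral_gain(master_solution)` confirms that, on this instance, the potential minimizer is an equilibrium. The converse need not hold in general, as remarked above.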
II-C Integer-compatible regularization functions
Regularization techniques in NEP are based on the proximal BR function, defined as the solution map of a minimization problem in which each agent’s unilateral cost function is augmented by a quadratic penalty term. Motivated by the proximal point interpretation of MI optimization heuristics, we propose a regularization strategy of the individual agents’ cost functions via integer-compatible regularization functions [boland2012new]. Formulating the algorithm in such a general proximal point setting will help in generalizing the current setup when fully continuous reformulations of the MI subproblems are considered. This is left for future investigation.
A continuous function is an integer-compatible regularization function (ICRF) if
for all and ;
For , we have for all ;
There exists a continuous and strictly increasing function and some such that, for all , , where denotes the norm in .
Note that any norm defined in is an ICRF. A constructive way to design ICRF is to consider decomposable penalties of the form , where is a concave, strictly increasing function, e.g.,
for some , [lucidi2010exact, boland2012new]. With these choices, amounts to an ICRF [boland2012new, Prop. 3.2]. In particular, for the special case of binary constraints, a sensible formulation of an ICRF is .
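As a rough illustration of the two families of regularizers mentioned above, the sketch below implements a norm (the l1 norm) and a decomposable penalty built from a concave, strictly increasing scalar function. The specific choice `phi(t) = 1 - exp(-alpha * t)` and the parameter `alpha` are assumptions picked for this example only, and the function names are hypothetical.

```python
import math

def l1_regularizer(x):
    """Norm-type regularizer: the l1 norm of a displacement vector."""
    return sum(abs(v) for v in x)

def concave_penalty(x, alpha=5.0):
    """Decomposable penalty sum_j phi(|x_j|), with the assumed choice
    phi(t) = 1 - exp(-alpha * t), which is concave and strictly increasing
    on t >= 0 and vanishes at t = 0."""
    return sum(1.0 - math.exp(-alpha * abs(v)) for v in x)

# Both vanish at zero displacement and grow with the size of the displacement.
samples = [[0.0, 0.0], [0.25, 0.0], [0.5, 0.0], [1.0, 0.0]]
l1_values = [l1_regularizer(s) for s in samples]
concave_values = [concave_penalty(s) for s in samples]
```

The concave penalty saturates for large displacements while still separating zero from nonzero ones, which is the qualitative behaviour one looks for in a penalty meant to coexist with integrality restrictions.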
III Proximal-like algorithms for MI-NEP
We now propose two MI-NE seeking algorithms in case the agents are able to compute an exact proximal BR at each iteration, or the agents adopt an inexact optimal strategy. In the former case, we rely on the fact that the MI-NEP in (1) is generalized ordinal, whereas in the latter case we require the existence of an exact potential function.
Let denote the ICRF employed by agent and let be a positive regularization parameter. We introduce the proximal augmented local cost function as a regularized version of the local cost in (1), which is given by
In accordance, the proximal BR mapping in (2) turns into
Setting allows us to recover the BR mapping as defined in (2), i.e., . By considering these two new ingredients, in the remainder of this section we design iterative and distributed schemes in which the agents update their own action sequentially, according to . We let denote the iteration counter of the process and the iterate at the beginning of round . For an arbitrary agent , we also define the population state as
This corresponds to the collective vector of strategies at the -th iteration communicated to agent when this agent has to perform an update, i.e., it computes the new strategy as a point in the MI proximal BR mapping, either exactly (Algorithm 1) or approximately (Algorithm 2). Subsequently, the next internal state is updated and passed to the next agent in the sequence. Throughout this process, note that, for all , and .
We stress that the proposed algorithms leverage the adaptive update of the regularization parameter in (7), which produces a monotonically decreasing sequence , i.e., for all [facchinei2011decomposition]. Note that the rate of decrease strongly depends on , which measures the progress the method is making in the agents’ proximal steps at the -th iteration. As will become clear from the convergence analysis, this quantity decreases over time, thus inducing a step towards a MI-NE.
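The sequential scheme described above can be sketched on a toy instance. This is a minimal sketch under explicit assumptions, not the algorithm of the paper: the two-agent game is hypothetical, the continuous variable is gridded so that the proximal BR can be found by enumeration, the l1 distance plays the role of the regularizer, and the adaptive rule (7) is replaced by a plain geometric decay of the regularization parameter (`tau *= beta`).

```python
Z = (0, 1, 2)
Y = (0.0, 0.5, 1.0)      # coarse grid for the continuous part (assumption)
ACTIONS = [(z, y) for z in Z for y in Y]

def cost(i, profile):
    """Hypothetical cost of agent i in a two-agent potential game."""
    (zi, yi), (zj, yj) = profile[i], profile[1 - i]
    return (zi - 1) ** 2 + (yi - 0.5) ** 2 + 0.5 * (zi * zj + yi * yj)

def rho(a, b):
    """l1 distance between two actions, used here as the regularizer."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def proximal_br(i, profile, tau):
    """Exact minimizer of cost + tau * rho over agent i's actions (enumeration)."""
    def augmented(a):
        trial = profile[:i] + (a,) + profile[i + 1:]
        return cost(i, trial) + tau * rho(a, profile[i])
    return min(ACTIONS, key=augmented)

def gauss_seidel(profile, tau=1.0, beta=0.5, max_rounds=50, tol=1e-9):
    """Agents update one after the other; tau shrinks after every round."""
    for _ in range(max_rounds):
        previous = profile
        for i in range(2):
            new_action = proximal_br(i, profile, tau)
            profile = profile[:i] + (new_action,) + profile[i + 1:]
        tau *= beta  # stand-in for the adaptive rule (7), which is not reproduced
        if profile == previous and tau < tol:
            break
    return profile, tau

final_profile, final_tau = gauss_seidel(((2, 1.0), (0, 0.0)))
```

Early rounds, where the regularization weight is still large, keep the agents close to their previous strategies; as the weight shrinks, the iterates settle on a profile from which no agent wants to deviate.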
III-A Exact BR computation in ordinal potential games
The game in (1) is an ordinal potential game with potential function .
Thus, the Gauss–Seidel sequence of iterations in Algorithm 1 has the following convergence property. We stress that, in our framework, an accumulation point for the sequence exists in view of Standing Assumption 1.
We first show that, in case the sequence generated by Algorithm 1 admits a limit point , then the regularization parameter, adaptively updated via (7), satisfies and there exists an infinite index set such that for all . Then, we prove that is actually a MI-NE of the MI-NEP in (1).
By construction of the regularization parameter sequence defined by (7), we have for all . Then, for the sake of contradiction, assume that there exists some such that for all . In view of the updating rule at the -th iteration, for all we have
By exploiting the definition of a generalized ordinal potential game provided in Definition 2, we hence deduce
Therefore, and hence the sequence is monotonically non-increasing. By the continuity of , it follows that the full sequence is convergent to a finite value . Moreover, it follows from (9) that By definition of the forcing function, we also have
From (8), we obtain . Let be such that for all sufficiently large. By definition of an ICRF, we deduce . We recall that the function is monotonically increasing, and therefore Consequently, for all sufficiently large, it holds true that From (7), one sees that , and hence we need for all . However, this implies that at a geometric rate, thus contradicting the original hypothesis that for all .
Now, let be a convergent subsequence with accumulation point . The existence of such a convergent subsequence is guaranteed by the compactness of . By invoking the same arguments as in the first part of the proof, we obtain and for all . Next, we show by contradiction that the accumulation point coincides with an MI-NE of (1) with generalized ordinal potential. To this end, let us suppose that there exists an agent that can further minimize its cost function, i.e., for some , By relying on the update rule in Algorithm 1, we obtain
Now, passing to the limit and using the fact that ,
which is a contradiction, thus concluding the proof.
In contrast to the continuous case where a constant penalty parameter can be used (see, e.g., [facchinei2011decomposition, Th. 4.3]), in a MI setting, a vanishing regularization parameter is required. In addition, note that Algorithm 1 assumes that agents can compute an exact BR of the proximal augmented local cost function in (5). This requires that, at every single iteration, each agent solves a MI nonlinear optimization problem to optimality. Without additional structural assumptions (e.g., individual convexity), this could render the method inefficient in practice. This motivates us to investigate a variant of Algorithm 1 involving inexact computation of a point lying in the perturbed MI BR mapping (6) of each agent.
III-B Inexact BR computation in exact potential games
We now consider the case in which the MI-NEP in (1) admits an exact potential function. In this case, we can prove convergence even if the agents implement only approximate BRs at every iteration, according to the following definition:
(-proximal BR) Given any and tolerance , is an -optimal response to if
For each agent , we hence define as the set of -optimal responses, given some collective vector of strategies . For the theoretical developments of this subsection, we then make the following assumption.
The game is an exact potential game with potential function
Algorithm 2 summarizes the main steps of the resulting distributed, Gauss–Seidel type sequence of iterations. For the considered instance, after choosing an initial strategy and a sequence of penalty parameters , the preliminary step requires defining a certain error tolerance in computing a -optimal response. To this end, we let be a given sequence of positive numbers such that for some . In Algorithm 2 we stick to the sequential update architecture, but instead of requiring that agents pursue an exact BR, we allow the updating agent to choose an inexact BR, , only in case the inexact BR computed at the previous step, i.e., the one obtained by considering , and , does not belong to . The updated strategy is then sent to the agent down the line, and the procedure repeats as long as some stopping criterion is not met.
The proof makes use of arguments similar to those of Theorem 1. As a starting point, by definition of the update , we have
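One reading of the inexact update is sketched below: an agent keeps its previous strategy whenever it is still within the tolerance of the best attainable augmented cost, and otherwise moves to some response that is. The scalar cost, the integer action set, and the helper names (`eps_optimal_set`, `inexact_update`) are hypothetical; returning the last admissible candidate merely mimics a solver that stops early and is not prescribed by the paper.

```python
def eps_optimal_set(augmented_cost, actions, eps):
    """All actions whose augmented cost is within eps of the minimum."""
    best = min(augmented_cost(a) for a in actions)
    return [a for a in actions if augmented_cost(a) <= best + eps]

def inexact_update(previous, augmented_cost, actions, eps):
    """Keep the previous action if it is still eps-optimal; otherwise switch."""
    candidates = eps_optimal_set(augmented_cost, actions, eps)
    if previous in candidates:
        return previous
    # Any candidate is admissible; picking the last one imitates a sloppy solver.
    return candidates[-1]

# Hypothetical augmented cost of one agent over five integer actions.
def f(a):
    return (a - 2.2) ** 2

actions = list(range(5))
loose = eps_optimal_set(f, actions, 1.0)       # actions 2 and 3 qualify
kept = inexact_update(3, f, actions, 1.0)      # 3 is still good enough
moved = inexact_update(0, f, actions, 0.1)     # 0 is not, so the agent moves
```

Because the strategy changes only when the previous one falls outside the tolerance, the iterates stay constant over stretches of iterations and then jump, which is consistent with a stepped convergence profile.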
Since in this case is an exact potential function, we have
where the last inequality uses the non-negativity of the ICRF. By summing from to , we hence obtain
In view of [franci2022convergence, Lemma 3.4], it follows that exists and is finite. Therefore, since is compact, the sequence admits a convergent subsequence with indices contained in some countably infinite set , which has a limit point . Therefore, in view of the continuity of the potential function, we have that and . Thus, for all ,
By definition of the potential function, it follows that
Again, by exploiting the property of the ICRF in Definition 3.i), we obtain for every . It then follows Consequently, we deduce from the first part of the proof of Theorem 1 that also . We now claim that is an -approximate Nash equilibrium and argue by contradiction. Suppose there exists an agent such that, for some , Resorting to the definition of the update mechanism yields
Then, passing to the limit and exploiting the fact that the regularization parameter tends to zero, i.e., , and that , we finally obtain
This represents a contradiction and concludes the proof.
Note that Algorithm 2 also covers the case in which the error sequence
is not forced by the designer before implementing the procedure, but it naturally arises from, e.g., a learning process. For example, consider a setting in which the agents, rather than being mere computational entities, are endowed with (or behave according to) typical parametric/nonparametric learning procedures, such as Gaussian processes or neural networks. As long as, Algorithm 2 returns an -approximate MI-NE. If the approximation error vanishes with the iteration index, instead, then Algorithm 2 produces a sequence of strategy profiles converging to an exact MI-NE. This assumption is not as stringent as it may seem. In the example considered above, i.e., agents endowed with learning procedures, asymptotic consistency bounds on the approximation error can be exploited directly [simonetto2021personalized].
IV Numerical experiments
Table I: Main parameters
|Upper bound (discrete)|
|Upper bound (continuous)|
|Initial goods vector|
|Error tolerance sequence|
We now test our theoretical findings on a numerical instance of a classic Cournot oligopoly model [ConKluKraw04, Gabriel:2013us, sagratella2016computing].
Specifically, we consider a market in which firms produce goods each in order to maximize their profits. Here, the first products, , are indivisible, while the other , , are modeled with continuous variables and hence . Thus, each firm aims at solving the following MI quadratic program
where the main parameters are described in Table I. Also, we define matrices , typically related to the customers’ inverse demand, according to the procedure described in [sagratella2016computing, §4] for all , and then we impose . This degree of symmetry ensures the existence of a potential function for the MI-NEP in (10), see, e.g., [cenedese2019charging], which has the form
with and . In case the -entry of matrix is nonnegative, the -th product of the firm is a substitute for the -th product of firm . On the other hand, if that entry is negative then the -th product of the firm is a complement for the -th product of firm . This framework resembles the 2-groups partitionable class in [sagratella2016computing]. However, the numerical instances considered here, albeit more general, do not include such class of problems. As an ICRF for each agent, we adopted applied componentwise ( denotes, indeed, the -th element of ).
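The effect of the symmetry condition on the coupling matrices can be checked numerically. The sketch below is an illustration under assumptions: the quadratic cost, the data, and the names (`Q`, `c`, `C`, `cost`, `potential`) are hypothetical and do not reproduce problem (10). Each firm has one integer and one continuous good, and the cross matrices satisfy `C[j, i] = transpose(C[i, j])`. The final check compares the change in a firm's cost under a unilateral deviation with the change in the candidate potential.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

random.seed(0)
N, n = 3, 2  # three firms, two goods each (first integer, second continuous)

# Symmetric coupling: C[j, i] is the transpose of C[i, j] (assumed structure).
C = {}
for i in range(N):
    for j in range(i + 1, N):
        C[i, j] = [[random.randint(-2, 2) for _ in range(n)] for _ in range(n)]
        C[j, i] = transpose(C[i, j])
Q = [[[2, 0], [0, 2]] for _ in range(N)]
c = [[random.randint(-3, 3) for _ in range(n)] for _ in range(N)]

def own_term(i, xi):
    return 0.5 * dot(xi, matvec(Q[i], xi)) + dot(c[i], xi)

def cost(i, x):
    coupling = sum(dot(x[i], matvec(C[i, j], x[j])) for j in range(N) if j != i)
    return own_term(i, x[i]) + coupling

def potential(x):
    cross = sum(dot(x[i], matvec(C[i, j], x[j]))
                for i in range(N) for j in range(i + 1, N))
    return sum(own_term(i, x[i]) for i in range(N)) + cross

def exactness_gap(i, x, yi):
    """Difference between cost change and potential change for a deviation yi."""
    y = x[:i] + [yi] + x[i + 1:]
    return (cost(i, x) - cost(i, y)) - (potential(x) - potential(y))

x = [[random.randint(0, 3), random.random()] for _ in range(N)]
gaps = [exactness_gap(i, x, [random.randint(0, 3), random.random()])
        for i in range(N)]
```

All entries of `gaps` are zero up to rounding, since each pairwise coupling term appears once in the potential and once in each of the two firms' costs. Dropping the transpose relation would in general break this identity.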
The numerical results reported in Figures 1–4 are obtained in Matlab by using Gurobi [gurobi] as a solver on a laptop with a Quad-Core Intel Core i5 2.4 GHz CPU and 8 GB RAM. Specifically, we generate random instances of the considered MI-NEP in (10) and test the behavior of Algorithms 1 and 2, where each agent takes, on average, s to compute a BR strategy. Fig. 1 shows the averaged convergence behavior of the sequence of MI strategy profiles generated by Algorithm 1, which converges in less than iterations, while Fig. 2 reports the distance between the optimal value of the master problem in (4) and over the iteration index . In both cases, is obtained as an optimal solution to (4), which amounts to solving an MI quadratic program in view of the structure of . Fig. 3, instead, shows the averaged behavior of the sequence of sub-optimal MI strategy profiles generated by Algorithm 2, which converges to an -approximate MI-NE of the Cournot model in (10). Note that the if-condition in the procedure generates a typical stepped behaviour and the convergence, in general, requires a few more iterations. Finally, Fig. 4 illustrates the averaged evolution of , which corresponds to the approximation error actually made by the agents in computing an inexact BR, as measured by the solver. The latter is upper bounded by the error sequence reported in Table I.
We have presented two proximal-like equilibrium seeking algorithms for NEP with mixed-integer variables admitting either generalized ordinal or exact potential functions. Exploiting the properties of integer-compatible regularization functions used as penalty terms in the agents’ cost functions is key to proving convergence, both when the agents pursue an exact optimal strategy and when they pursue an approximate one.
Future research directions include, but are not limited to, the extension of the proposed algorithms to generalized MI-NEP, as well as developing their stochastic counterparts. There are also interesting computational questions to investigate. From the proximal-point interpretation of the feasibility pump, large penalty parameters enforce integer restrictions. One could attach this approach to the proposed Nash equilibrium seeking algorithms to design a two-layer procedure with large penalties at the beginning of the scheme.