I Introduction
Generalized Nash equilibrium problems (GNEPs) have been widely studied in the literature. A number of results are available concerning algorithms and methodologies to find an equilibrium, see [1, 2, 3] and the references therein. This interest is motivated by potential applications ranging from economics to engineering via operations research [4, 5]. In a GNEP, each agent aims at minimizing its own cost function under some feasibility constraints. The main feature is that both the cost function and the constraints depend on the strategies chosen by the other agents. Due to the presence of shared constraints, the search for a generalized Nash equilibrium (GNE) is a challenging task in general.
In the deterministic case, several algorithms are available for finding a GNE, both distributed and semi-decentralized [1, 6, 7, 8]. Among the available methods for GNEPs, an elegant approach is to recast the problem as a monotone inclusion through the use of interdependent Karush–Kuhn–Tucker (KKT) conditions. The resulting Lagrangian formulation allows one to seek equilibrium points as the solutions of an associated variational inequality (VI), which is usually more tractable and has a wider literature than GNEPs [9, 3] (and references therein). Equilibria obtained in this way are called variational equilibria.
Given a monotone inclusion problem, a powerful procedure is to use an operator splitting scheme to recast the problem as the search for the zeros of a sum of two monotone operators. One of the simplest schemes is the forward–backward (FB) operator splitting [9]. Convergence of such a scheme is guaranteed if the pseudogradient mapping of the game is strongly monotone. Since we study a game-theoretic problem, the algorithm should be distributed, in the sense that each agent should only know its local cost function and its local constraints. Unfortunately, the FB splitting does not lead to a distributed algorithm when applied to a GNEP. In the deterministic case, preconditioning has been proposed in [6, 7] to overcome this problem.
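As a minimal illustration of the forward–backward idea, the sketch below alternates a forward step on a strongly monotone affine map with a backward step, which here reduces to a projection onto a box. The map `F`, the box bounds and the step size `gamma` are hypothetical choices made only for this example; they are not taken from the paper.

```python
def project_box(x, lo, hi):
    """Backward step for a box constraint: componentwise clipping."""
    return [min(max(v, lo), hi) for v in x]


def forward_backward(F, x0, gamma, lo, hi, num_iters=200):
    """Iterate x <- proj_box(x - gamma * F(x))."""
    x = list(x0)
    for _ in range(num_iters):
        g = F(x)  # forward (explicit) step direction
        x = project_box([xi - gamma * gi for xi, gi in zip(x, g)], lo, hi)
    return x


def F(x):
    # Assumed example: F(x) = M x + q with M = [[2, 1], [1, 2]], q = [-3, -3].
    # M is symmetric positive definite, so F is strongly monotone.
    return [2.0 * x[0] + x[1] - 3.0, x[0] + 2.0 * x[1] - 3.0]


# The zero of F is (1, 1), which lies inside the box [0, 5]^2.
x_star = forward_backward(F, [0.0, 0.0], gamma=0.2, lo=0.0, hi=5.0)
```

With this step size the iteration matrix has eigenvalues 0.8 and 0.4, so the iterates contract linearly towards the solution.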
To the best of our knowledge, the preconditioned FB operator splitting has not been exploited for stochastic GNEPs (SGNEPs) but it would be very relevant from an algorithmic perspective.
The literature on stochastic Nash equilibrium problems is less rich than its deterministic counterpart [10, 11, 12]. Nevertheless, several problems of interest cannot be modelled without uncertainty. Among others, consider transportation systems, where one source of uncertainty is the drivers' perception of travel time [13]; electricity markets, where generators produce energy without knowing the actual demand [14]; or, more generally, problems that can be modelled as networked Cournot games with market capacity constraints where the demand is uncertain [15, 16].
Mathematically speaking, an SGNEP is a GNEP where the cost functions of the agents are expected-value functions and the distribution of the random variables is unknown. Existence (and uniqueness) of equilibria is studied in [11], but the study of convergent algorithms is not fully developed yet [12, 10]. One possible reason is the presence of the expected-value cost functions. When the probability distribution of the random variable is known, the expected-value formulation can be solved with a technique for deterministic VIs. However, the pseudogradient is usually not directly accessible, for instance due to the intractability of computing the expected value. For this reason, in many cases, the solution of a stochastic VI (SVI) relies on samples of the random variable. There are, in fact, two main methodologies available: sample average approximation (SAA) and stochastic approximation (SA). The SAA approach replaces the expected value with the average over a large number of samples of the random variable. This approach is practical in Monte Carlo simulations or machine learning, when a large amount of data is available [17, 18]. In the SA approach, the decision maker samples only one realization of the random variable. This approach is less computationally expensive and more appropriate in a decentralized framework but, in general, it requires stronger assumptions on the problem data [19, 20, 21].
In this paper, we formulate a distributed stochastic FB algorithm through preconditioning and prove its convergence. The associated SVI is obtained in the same way as in the deterministic case, i.e., via augmented KKT inclusions. Among the possible approaches for solving an SVI (for instance [19, 18, 20]), we propose a damped forward–backward scheme [22] and prove convergence under proper assumptions, i.e., strong monotonicity of the pseudogradient mapping of the game and measurability of the random variable.
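Notation: $\langle\cdot,\cdot\rangle:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}$ denotes the standard inner product and $\|\cdot\|$ represents the associated Euclidean norm. Given a symmetric positive definite matrix $\Phi$, $\langle x, y\rangle_\Phi = \langle \Phi x, y\rangle$ denotes the $\Phi$-induced inner product. The associated $\Phi$-induced norm, $\|\cdot\|_\Phi$, is defined as $\|x\|_\Phi = \sqrt{\langle \Phi x, x\rangle}$. We indicate that a matrix $A$ is positive definite, i.e., $x^\top A x > 0$ for all $x \neq 0$, with $A \succ 0$. $A \otimes B$ indicates the Kronecker product between matrices $A$ and $B$. $\mathbf{1}_m$ indicates the vector with $m$ entries all equal to $1$. Given $N$ vectors $x_1,\dots,x_N \in \mathbb{R}^n$, $\boldsymbol{x} := \mathrm{col}(x_1,\dots,x_N) = [x_1^\top,\dots,x_N^\top]^\top$. $\mathrm{J}_F = (\mathrm{Id} + F)^{-1}$ is the resolvent of the operator $F$ and $\mathrm{Id}$ indicates the identity operator. For a closed set $C \subseteq \mathbb{R}^n$, the mapping $\mathrm{proj}_C$ denotes the projection onto $C$, i.e., $\mathrm{proj}_C(x) = \arg\min_{y\in C}\|y - x\|$. $\iota_C$ is the indicator function of the set $C$, that is, $\iota_C(x) = 0$ if $x\in C$ and $\iota_C(x)=\infty$ otherwise. The set-valued mapping $\mathrm{N}_C$ denotes the normal cone operator for the set $C$, i.e., $\mathrm{N}_C(x) = \varnothing$ if $x\notin C$, $\{v \mid \sup_{z\in C} v^\top(z-x)\le 0\}$ otherwise. For a function $\psi:\mathbb{R}^n\to\bar{\mathbb{R}}$, $\mathrm{dom}(\psi) := \{x \mid \psi(x) < \infty\}$ indicates the domain and the subdifferential is the set-valued mapping $\partial\psi(x) := \{u \mid \langle y - x, u\rangle \le \psi(y) - \psi(x) \text{ for all } y \in \mathrm{dom}(\psi)\}$.
II Generalized Nash equilibrium problem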
II-A Equilibrium problem setup
We consider a set of noncooperative agents, each of them deciding on its strategy from its local decision set with the aim of minimizing its local cost function. Let be the decisions of all the agents except for and define . We consider that there is some uncertainty in the cost function, expressed through the random variable , where is the associated probability space. Then, for each agent , we define the cost function as
(1) 
for some measurable function . We note that the cost function depends on the local variable , the collective decision of the other agents and the random variable . represents the mathematical expectation with respect to the distribution of the random variable (for brevity, we use instead of , where , and instead of ). We assume that is well defined for all the feasible . Furthermore, we consider a game with affine shared constraints, . Thus we denote the feasible decision set of each agent by the set-valued mapping
(2) 
where and . The matrix defines how agent is involved in the coupling constraints. The collective feasible set can be then written as
(3) 
where and . Note that there is no uncertainty in the constraints.
Next, we postulate standard assumptions for the cost functions and the constraint set.
Standing Assumption 1
For each and the function is convex and continuously differentiable.
Standing Assumption 2
For each the set is nonempty, compact and convex. The set satisfies Slater’s constraint qualification.
Formally, the aim of each agent , given the decision variables of the other agents , is to choose a strategy , that solves its local optimization problem, i.e.,
(4) 
From a game-theoretic perspective, we aim at computing a stochastic generalized Nash equilibrium (SGNE), i.e., a collective strategy such that, for all :
In other words, an SGNE is a set of strategies where no agent can decrease its objective function by unilaterally changing its strategy. To guarantee existence of a stochastic equilibrium, let us introduce further assumptions on the local cost functions .
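The "no profitable unilateral deviation" property can be checked numerically on a toy instance. The sketch below uses a hypothetical two-player Cournot-type game with already-averaged (expected) costs, local sets [0, 5] and the shared constraint x_1 + x_2 <= 2; all numbers and function names are assumptions for illustration only. The grid search is a sanity check, not a proof.

```python
def expected_cost(i, x):
    """Expected cost of agent i in a hypothetical two-player Cournot game."""
    total = x[0] + x[1]
    return 0.5 * x[i] ** 2 - (6.0 - total) * x[i]


def is_gne(x, cap=2.0, x_max=5.0, grid=2001, tol=1e-9):
    """Check on a grid that no agent gains by a feasible unilateral deviation."""
    feasible = all(0.0 <= v <= x_max for v in x) and x[0] + x[1] <= cap + tol
    if not feasible:
        return False
    for i in (0, 1):
        # The shared constraint limits what agent i can play given the other.
        upper = min(x_max, cap - x[1 - i])
        for k in range(grid):
            trial = list(x)
            trial[i] = upper * k / (grid - 1)
            if expected_cost(i, trial) < expected_cost(i, x) - tol:
                return False
    return True


variational = is_gne([1.0, 1.0])       # both agents share the multiplier 2
non_variational = is_gne([1.2, 0.8])   # multipliers 1.6 and 2.4 differ
not_equilibrium = is_gne([0.5, 0.5])   # each agent would like to produce more
```

In this toy game both (1, 1) and (1.2, 0.8) pass the check, which illustrates that generalized equilibria need not be unique; only (1, 1) is supported by a common multiplier for the shared constraint.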
Standing Assumption 3
For each and , the function is convex, Lipschitz continuous, and continuously differentiable. The function is measurable and for each , the Lipschitz constant is integrable in
While, under Standing Assumptions 1, 2 and 3, existence of a GNE of the game is guaranteed by [11, §3.1], uniqueness does not hold in general [11, §3.2].
Among all possible Nash equilibria, we focus on those that correspond to the solution set of an appropriate (stochastic) variational inequality. Let
(5) 
Formally, the stochastic variational inequality problem is the problem of finding such that
(6) 
with as in (5). We note that we can exchange the expected value and the gradient in (5) thanks to Standing Assumption 3 [11, Lem. 3.4]. We also note that any solution of is a generalized Nash equilibrium of the game in (4) while the opposite does not hold in general. In fact, a game may have a Nash equilibrium while the corresponding VI may have no solution [23, Prop. 12.7].
A sufficient condition for the variational problem in (6) to have a solution is that is strongly monotone [3, Th. 2.3.3], [11, Lemma 3.3], as we postulate next.
Standing Assumption 4
is strongly monotone, i.e., for
and Lipschitz continuous, i.e.,
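To make the two constants in Standing Assumption 4 concrete, consider the following worked example. It is a hypothetical symmetric Cournot-type game with $N$ agents and scalar decisions, introduced here only for illustration; the symbols $q$, $d$, $P$, $\mathbb{J}_i$ and $\mathbb{F}$ are assumptions of this example. The pseudogradient is affine with a symmetric matrix $M$, whose smallest and largest eigenvalues give the strong monotonicity and Lipschitz constants, respectively.

```latex
\begin{align*}
\mathbb{J}_i(x_i, \boldsymbol{x}_{-i}) &= \tfrac{q}{2}x_i^2 - \Big(P - d\textstyle\sum_{j=1}^N x_j\Big)x_i, \qquad q, d > 0,\\
\mathbb{F}(\boldsymbol{x}) &= \mathrm{col}\big(\nabla_{x_i}\mathbb{J}_i(x_i,\boldsymbol{x}_{-i})\big)_{i=1}^N
 = M\boldsymbol{x} - P\mathbf{1}_N, \qquad M = (q+d)I_N + d\,\mathbf{1}_N\mathbf{1}_N^\top,\\
\langle \mathbb{F}(\boldsymbol{x}) - \mathbb{F}(\boldsymbol{y}), \boldsymbol{x}-\boldsymbol{y}\rangle
 &= (\boldsymbol{x}-\boldsymbol{y})^\top M(\boldsymbol{x}-\boldsymbol{y}) \ge (q+d)\,\|\boldsymbol{x}-\boldsymbol{y}\|^2,\\
\|\mathbb{F}(\boldsymbol{x}) - \mathbb{F}(\boldsymbol{y})\| &\le \big(q + (N+1)d\big)\,\|\boldsymbol{x}-\boldsymbol{y}\|.
\end{align*}
```

The bounds follow because $M$ has the eigenvalue $q+d$ with multiplicity $N-1$ and the eigenvalue $q+(N+1)d$ along $\mathbf{1}_N$.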
II-B Operator-theoretic characterization
In this subsection, we recast the GNEP into a monotone inclusion, namely, the problem of finding a zero of a set-valued monotone operator.
First, we characterize the SGNE of the game in terms of the Karush–Kuhn–Tucker conditions for the coupled optimization problems in (4). For each agent , let us introduce the Lagrangian function as
where is the dual variable associated with the coupling constraints. We recall that the set of strategies is an SGNE if and only if the following KKT conditions are satisfied [1, Th. 4.6]:
(7) 
Similarly, we can use the KKT conditions to characterize a variational GNEP (vGNEP), studying the Lagrangian function associated to the SVI. Since is a solution of if and only if
the associated KKT optimality conditions are
(8) 
We note that (8) can be written in compact form as
where is a set-valued mapping. It follows that the vGNE correspond to the zeros of the mapping . The next lemma shows the relation between SGNE and variational equilibria.
Lemma 1
Essentially, Lemma 1 says that variational equilibria are those such that the shared constraints have the same dual variable for all the agents.
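The distinction drawn above can be written out in the usual KKT form. The display below is a sketch in assumed notation (local sets $\Omega_i$ with $\boldsymbol{\Omega} = \prod_i \Omega_i$, coupling constraint $A\boldsymbol{x} \le b$ with $A = [A_1, \dots, A_N]$, expected costs $\mathbb{J}_i$ and pseudogradient $\mathbb{F}$); these symbols are chosen for illustration and need not match the paper's own equations (7) and (8).

```latex
\begin{align*}
\text{SGNE (one multiplier per agent):}\quad
& 0 \in \nabla_{x_i}\mathbb{J}_i(x_i^*, \boldsymbol{x}_{-i}^*) + \mathrm{N}_{\Omega_i}(x_i^*) + A_i^\top \lambda_i, \\
& 0 \in -(A\boldsymbol{x}^* - b) + \mathrm{N}_{\mathbb{R}^m_{\ge 0}}(\lambda_i), \qquad i = 1,\dots,N,\\[4pt]
\text{vGNE (one common multiplier):}\quad
& 0 \in \mathbb{F}(\boldsymbol{x}^*) + \mathrm{N}_{\boldsymbol{\Omega}}(\boldsymbol{x}^*) + A^\top \lambda, \\
& 0 \in -(A\boldsymbol{x}^* - b) + \mathrm{N}_{\mathbb{R}^m_{\ge 0}}(\lambda).
\end{align*}
```

The second system is the first one with the additional requirement $\lambda_1 = \dots = \lambda_N = \lambda$, which is the consensus condition on the dual variables.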
III Preconditioned forward–backward generalized Nash equilibrium seeking
In this section, we propose a distributed forward–backward algorithm for finding variational equilibria of the game in (4).
We suppose that each player only knows its local data, i.e., , and . Moreover, each player is able to compute, given , (or an approximation, as exploited later in the section). We assume therefore that each agent has access to the strategies that affect its local gradient (full information setup). This information is collected, for each agent , in the set , that is, the set of agents whose strategy explicitly influences .
Since, by Lemma 1, the configuration of the vGNE requires consensus of the dual variables, we introduce an auxiliary variable for all . The role of is to enforce consensus but it does not affect the properties of the operators and of the algorithm. More details on this variable are given in Section IV-A. The auxiliary variable and a local copy of the dual variable are shared through the dual variables graph, . The set of edges represents the exchange of the private information on the dual variables: if player can receive from player . The set of neighbours of in is given by [7, 10].
Standing Assumption 5
The dual variable graph is undirected and connected.
The weighted adjacency matrix of the dual variables graph is indicated with . Let be the Laplacian matrix associated to the adjacency matrix , where is the diagonal matrix of the degrees and . It follows from Standing Assumption 5 that and the associated Laplacian are both symmetric, i.e., and . Moreover, Standing Assumption 5 is fundamental to guarantee that the coupling constraints are satisfied since agents have to reach consensus of the dual variables.
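The Laplacian construction described above (degree matrix minus weighted adjacency matrix) can be sketched as follows. The cycle graph with unit weights and the helper names are assumptions made for this example. The last lines illustrate why the Laplacian is useful for consensus: it maps any vector with identical entries to zero.

```python
def cycle_adjacency(n):
    """Unit-weight adjacency matrix of an undirected cycle graph on n nodes."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        j = (i + 1) % n
        w[i][j] = w[j][i] = 1.0
    return w


def laplacian(w):
    """Graph Laplacian: diagonal degree matrix minus the adjacency matrix."""
    n = len(w)
    degrees = [sum(row) for row in w]
    return [[(degrees[i] if i == j else 0.0) - w[i][j] for j in range(n)]
            for i in range(n)]


L = laplacian(cycle_adjacency(5))

# Scalar dual variables that already agree: the Laplacian annihilates them.
lam_consensus = [2.5] * 5
disagreement = [sum(L[i][j] * lam_consensus[j] for j in range(5))
                for i in range(5)]
```

For an undirected graph the resulting matrix is symmetric with zero row sums, as used in the text.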
Next we present a distributed forward–backward algorithm with damping for solving the SGNEP in (4) (Algorithm 1). For each agent , the variables , and denote the local variables , and at the iteration time while , and are the step sizes.
Initialization: and
Iteration : Agent
(1): Receives for all for then updates:
(2): Receives for all then updates:
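The structure of such an iteration can be illustrated on a toy problem. The sketch below is not the paper's Algorithm 1 verbatim: it follows the preconditioned forward–backward updates commonly used for this class of problems (projected gradient step on the primal variable, Laplacian update of the auxiliary variable, reflected dual update), followed by a damping step and with a growing SAA batch. The three-agent game, the complete communication graph, the step sizes `alpha`, `nu`, `sigma`, the damping `delta` and the batch rule are all assumptions chosen for illustration.

```python
import math
import random


def damped_pfb(num_iters=600, delta=0.8, seed=0):
    """Damped preconditioned FB iteration with SAA on a toy 3-agent game.

    Hypothetical data: agent i has sample cost
        0.5 * x_i**2 - (10 + xi - sum_j x_j) * x_i,   xi ~ N(0, 1),
    local set [0, 10], and the agents share the constraint sum_j x_j <= 4.
    """
    rng = random.Random(seed)
    n, cap, x_max = 3, 4.0, 10.0
    b_loc = [cap / n] * n                       # local shares of the bound
    w = [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]
    alpha, nu, sigma = 0.1, 0.09, 0.08          # assumed small step sizes
    x, z, lam = [0.0] * n, [0.0] * n, [0.0] * n

    for k in range(num_iters):
        batch = math.ceil((k + 1) ** 1.1)       # growing SAA batch size
        total = sum(x)
        lap_lam = [sum(w[i][j] * (lam[i] - lam[j]) for j in range(n))
                   for i in range(n)]
        x_new, z_new = [], []
        for i in range(n):
            # Sample average of the noise seen by agent i.
            xi_bar = sum(rng.gauss(0.0, 1.0) for _ in range(batch)) / batch
            grad = 2.0 * x[i] + total - 10.0 - xi_bar   # averaged sample gradient
            step = x[i] - alpha * (grad + lam[i])
            x_new.append(min(max(step, 0.0), x_max))    # projection onto [0, 10]
            z_new.append(z[i] + nu * lap_lam[i])
        lam_new = []
        for i in range(n):
            r = [2.0 * z_new[j] - z[j] for j in range(n)]
            lap_z = sum(w[i][j] * (r[i] - r[j]) for j in range(n))
            res = (2.0 * x_new[i] - x[i]) - b_loc[i] - lap_z - lap_lam[i]
            lam_new.append(max(lam[i] + sigma * res, 0.0))  # projection onto >= 0
        # Damping: convex combination of the old iterate and the FB update.
        x = [(1 - delta) * a + delta * c for a, c in zip(x, x_new)]
        z = [(1 - delta) * a + delta * c for a, c in zip(z, z_new)]
        lam = [(1 - delta) * a + delta * c for a, c in zip(lam, lam_new)]
    return x, lam


x_final, lam_final = damped_pfb()
```

For this toy data the variational equilibrium of the expected-value game can be computed by hand: each agent plays 4/3 and the common multiplier is 10/3. The returned iterates should be close to these values, with the local multipliers close to consensus.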
Since the distribution of the random variable is unknown, in the algorithm we have replaced the expected value with a sample average approximation (SAA). We assume to have access to a pool of i.i.d. sample realizations of the random variable collected, for all and for each agent , in the vectors . At each time , we have
(9) 
where is the batch size. As usual in SAA, we assume that the batch size increases over time according to the following lower bound.
Assumption 1
There exist such that, for all
Next, we postulate that the distance between the expected value and its approximation,
(10) 
has a vanishing second moment for all .
Assumption 2
For all and , the stochastic error is such that
Such a bound for the stochastic error can be obtained as a consequence of some milder assumptions that lie outside the scope of this work. We refer to [17, Lem. 4.2], [18, Lem. 3.12] for more details.
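The effect of the batch size on the stochastic error can be seen in a small Monte Carlo experiment. The sketch below assumes scalar i.i.d. standard normal samples, so the exact mean is zero and the SAA error is simply the sample mean; in this setting its second moment equals 1 divided by the batch size. The function name and the numbers of trials are illustrative assumptions.

```python
import random


def saa_error_second_moment(batch_size, trials=2000, seed=0):
    """Estimate E[error^2] for the sample mean of `batch_size` N(0, 1) draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        mean = sum(rng.gauss(0.0, 1.0) for _ in range(batch_size)) / batch_size
        total += mean ** 2   # the true expected value is 0, so error = mean
    return total / trials


small_batch = saa_error_second_moment(4)    # theory: 1/4
large_batch = saa_error_second_moment(64)   # theory: 1/64
```

A batch size that grows with the iteration counter therefore drives the second moment of the error to zero, which is the behaviour that the assumption above postulates.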
Furthermore, we assume that the step sizes are small enough as formalized next.
Standing Assumption 6
Let . Let and define . Let
(11) 
where and are respectively the strong monotonicity and the Lipschitz constants as in Standing Assumption 4. The parameter is positive and . The step sizes , and satisfy, for any agent
(12)  
where indicates the entry of the matrix .
Some insight into these conditions is given in Section IV.
We are now ready to state our convergence result.
Theorem 1
See Section IV.
IV Convergence analysis
IV-A Preconditioned forward–backward operator splitting
We first note that the mapping can be written as the sum of two operators. Specifically, where
(13)  
We note that finding a solution of the variational SGNEP translates into finding .
Let be the Laplacian matrix of and set . To impose consensus on the dual variables, the authors in [7] proposed the Laplacian constraint . Then, to preserve monotonicity, one can augment the two operators and by introducing the auxiliary variable . Define and and similarly let us define of suitable dimensions. Then, we introduce
(14)  
From now on, let us indicate . The operators and in (14) have the following properties.
Lemma 2
The operator is maximally monotone and is cocoercive with as in (11).
It follows from [7, Lem. 5]. Notice that this result explains the choice of the parameter in Standing Assumption 6. The following lemma shows that the points provide a variational equilibrium of the SGNEP.
Lemma 3
It follows from [7, Th. 2].
Unfortunately, the operator is monotone but not cocoercive, due to the skew-symmetric matrix; hence, we cannot directly apply the FB operator splitting [9, §26.5]. To overcome this issue, the authors in [7] introduced a preconditioning matrix . Thanks to , the zeros of the mapping correspond to the fixed points of a specific operator that depends on the operators and , as exploited in [6, 7]. Indeed, it holds that, for any matrix , if and only if,
(15) 
where is the resolvent of and represents the backward step and is the (stochastic) forward step. In the deterministic case, convergence of the FB splitting is guaranteed by [9, Section 26.5]. In the stochastic case, the FB algorithm, as it is in (15), is known to converge for strongly monotone mappings [12]. For this reason, we focus on the following damped FB algorithm
(16) 
that converges under cocoercivity of [22]. We show that this is true in Lemma 4. First, we show that (16) is equivalent to Algorithm 1.
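For orientation, the generic shape of a preconditioned fixed-point characterization and of a damped forward–backward step is sketched below. The notation is assumed for illustration ($\mathcal{A}$ maximally monotone, $\mathcal{B}$ cocoercive, $\Phi \succ 0$ the preconditioner, $\hat{\mathcal{B}}$ a sampled approximation of $\mathcal{B}$, $\delta$ the damping parameter) and is not guaranteed to coincide symbol by symbol with (15) and (16).

```latex
\begin{align*}
\omega^* \in \operatorname{zer}(\mathcal{A}+\mathcal{B})
 &\iff \omega^* = (\mathrm{Id} + \Phi^{-1}\mathcal{A})^{-1}\big(\mathrm{Id} - \Phi^{-1}\mathcal{B}\big)(\omega^*),\\
\tilde\omega^{k} &= (\mathrm{Id} + \Phi^{-1}\mathcal{A})^{-1}\big(\omega^k - \Phi^{-1}\hat{\mathcal{B}}(\omega^k,\xi^k)\big),\\
\omega^{k+1} &= (1-\delta)\,\omega^k + \delta\,\tilde\omega^k, \qquad \delta \in (0,1].
\end{align*}
```

The equivalence in the first line follows by adding $\omega^*$ to both sides of $-\Phi^{-1}\mathcal{B}\omega^* \in \Phi^{-1}\mathcal{A}\omega^*$ and inverting $\mathrm{Id} + \Phi^{-1}\mathcal{A}$; the choice $\delta = 1$ recovers the undamped iteration.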
Note that, if we write the resolvent explicitly, the first step of Equation (16) can be rewritten as
(17) 
The matrix should be symmetric, positive definite and such that is easy to compute [6].
We define and similarly and of suitable dimensions. Let
(18) 
and suppose that the parameters , and satisfy (12) in Standing Assumption 6.
We can obtain conditions (12) by requiring to be diagonally dominant. This, in combination with the fact that it is symmetric, implies that is also positive definite. Then, the operators and satisfy the following properties under the induced norm .
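The implication used here (a symmetric, strictly diagonally dominant matrix with positive diagonal is positive definite) can be checked numerically. The sketch below tests strict diagonal dominance row by row and confirms positive definiteness with a plain Cholesky factorization; the two matrices are hypothetical examples, not the paper's preconditioner.

```python
import math


def is_strictly_diagonally_dominant(m):
    """Each diagonal entry exceeds the sum of absolute off-diagonal entries."""
    n = len(m)
    return all(m[i][i] > sum(abs(m[i][j]) for j in range(n) if j != i)
               for i in range(n))


def cholesky(m):
    """Lower-triangular factor of a symmetric positive definite matrix."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                pivot = m[i][i] - s
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(pivot)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def is_positive_definite(m):
    """A symmetric matrix is positive definite iff Cholesky succeeds."""
    try:
        cholesky(m)
        return True
    except ValueError:
        return False


# Large diagonal (inverse step sizes, say) and small off-diagonal coupling.
dominant = [[10.0, 0.0, -1.0], [0.0, 11.0, 2.0], [-1.0, 2.0, 12.5]]
# Symmetric but not dominant: eigenvalues are 3 and -1.
not_dominant = [[1.0, 2.0], [2.0, 1.0]]
```

Smaller step sizes enlarge the diagonal and therefore make dominance easier to satisfy, which is one way to read upper bounds on the step sizes.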
Lemma 4
Given and in (14), the following statements hold:

is cocoercive;

is maximally monotone.
It follows from [7, Lem. 7].
IV-B Stochastic sample average approximation
Since the expected value can be hard to compute, we need to take an approximation. At this stage, it is not important if we use sample average or stochastic approximation, therefore, in what follows, we replace with
where is an approximation of the expected value mapping in (5) given a vector sample of the random variable . Then, (17) can be rewritten as
(19)  
By expanding (19), we obtain the first steps of Algorithm 1. The damping part is distributed and it does not need preconditioning.
We note that, thanks to the fact that is lower block triangular, the iterations of Algorithm 1 are sequential, that is, they use the latest updates and of the agents' strategies and of the auxiliary variable, respectively.
We are now ready to prove our convergence result.
Proof of Theorem 1. The iterations of Algorithm 1 are obtained by expanding (16), solving for , and and adding the damping iteration. Therefore, Algorithm 1 is the FB iteration with damping as in (16). The convergence of the sequence to a vGNE of the game in (4) then follows by [22, Th. 3.2] and Lemma 3 since is cocoercive by Lemma 2.
IV-C Discussion
The original result in [22] shows convergence for cocoercive and uniformly monotone operators. Moreover, they provide the proof for a generic approximation of the random mapping . We note that fixing the type of approximation (SAA or SA) can be important for weakening the assumptions. Indeed, using the SAA scheme, cocoercivity is enough for convergence without further monotonicity assumptions. On the other hand, the SA approach requires cocoercivity and strict monotonicity.
Unfortunately, the mapping is not strictly monotone (due to the presence of the Laplacian matrix), therefore we use SAA as in (9).
Concerning the stochastic error in (10), the assumption in [22] is similar to the so-called “variance reduction”. Such an assumption is fundamental in the SAA scheme, but it can be avoided in the SA method [19]. Indeed, in the SAA scheme, taking the average over a large number of samples helps control the stochastic error and therefore find a solution [18, 25]. For this reason, in our case the step size can be taken constant. For the SA scheme instead, the error is controlled in combination with the parameters involved, for instance, using a vanishing step size (possibly square summable) [26] or using smoothing techniques (such as a Tikhonov regularization) [19]. In both cases, the damping parameter can be taken constant.
V Case study and numerical simulations
As an example, we borrow an electricity market problem from [12], which can also be cast as a networked Cournot game with market capacity constraints [7, 10].
Consider a set of generators (companies) that operate over a set of locations (markets). The random variable represents the demand uncertainty. Each generator decides the quantity of product to deliver in the markets it is connected with. Each company has a local cost function related to the production of electricity. The cost function is deterministic as we suppose that the generators are able to compute the cost for production without uncertainty. On the other hand, each market has a bounded capacity so that the collective constraints are given by where and specifies which market company participates in. Each location has a price, collected in . The uncertainty variable appears in this functional since this is the part related to the demand. is supposed to be a linear function. The cost function of each agent is then given by
Clearly, if is strongly convex with Lipschitz continuous gradient and the prices are linear, the pseudogradient of is strongly monotone and Standing Assumption 4 is satisfied.
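A cost of this type and its partial gradient can be sketched as follows. The concrete form is an assumption made for illustration: quadratic production cost, a price in each market equal to a random intercept `xi[k]` minus `slope[k]` times the total supply, and a 0/1 participation `mask` standing in for the matrices that link companies to markets. None of the names or numbers are taken from the paper.

```python
def market_supply(x, mask):
    """Total quantity delivered to each market by the participating companies."""
    n, m = len(x), len(x[0])
    return [sum(mask[j][k] * x[j][k] for j in range(n)) for k in range(m)]


def sample_cost(i, x, xi, quad, lin, slope, mask):
    """Production cost minus revenue of company i for one demand sample xi."""
    supply = market_supply(x, mask)
    production = sum(0.5 * quad[i] * v * v + lin[i] * v for v in x[i])
    revenue = sum((xi[k] - slope[k] * supply[k]) * mask[i][k] * x[i][k]
                  for k in range(len(xi)))
    return production - revenue


def sample_gradient(i, x, xi, quad, lin, slope, mask):
    """Gradient of sample_cost with respect to company i's own decision."""
    supply = market_supply(x, mask)
    return [quad[i] * x[i][k] + lin[i]
            - mask[i][k] * (xi[k] - slope[k] * supply[k])
            + slope[k] * mask[i][k] * x[i][k]
            for k in range(len(xi))]


# Hypothetical data: 2 companies, 2 markets; company 1 only sells in market 0.
mask = [[1, 1], [1, 0]]
x = [[1.0, 2.0], [0.5, 0.0]]
xi = [10.0, 8.0]
quad, lin, slope = [1.0, 2.0], [0.5, 0.3], [0.4, 0.6]
grad = [sample_gradient(i, x, xi, quad, lin, slope, mask) for i in range(2)]
```

Stacking these partial gradients over the companies and averaging over samples gives an approximation of the pseudogradient; because the price is linear and the production cost is strongly convex, the stacked map is affine in the decisions.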
V-A Numerical example
As a numerical setting, we consider a set of 20 companies and 7 markets, similarly to [7, 10]. Each company has a local constraint and each component of is randomly drawn from . In terms of the electricity market, this can be seen as the capacity limit of generator . Each market has a maximal capacity randomly drawn from . The local cost function of the generator is , where indicates the component of . is randomly drawn from , and each component of is randomly drawn from . Notice that is strongly convex with Lipschitz continuous gradient.
The price is taken as a linear function where each component of is randomly drawn from . The uncertainty appears in the quantities that concern the total supply for each market. The entries of are taken with a normal distribution with mean and finite variance. As in [7], we suppose that the dual variables graph is a cycle graph with the addition of the edges and . Recall that the cost function of agent is influenced by the variables of the companies selling in the same market. This information can be retrieved from the graph in Fig. 3.
The step sizes are the same for all agents : , , . The initial point is randomly chosen within its local feasible set, and the initial dual variables and are set to zero. The damping parameter is taken to be to compare the results.
The plots in Fig. 1 and 2 show, respectively, the following performance indices: and . The first one indicates the convergence to a solution , while the second shows convergence of the dual variables to consensus. As one can see from the plots, the higher the averaging parameter, the faster the convergence. The solution oscillates because of the approximation, but the distance from the solution is decreasing. Moreover, it is interesting that for the algorithm is indeed an FB algorithm without damping and it converges with constant step size.
VI Conclusion
The preconditioned forward–backward operator splitting is applicable to stochastic generalized Nash equilibrium problems to design distributed equilibrium seeking algorithms. Since the expected value is hard to compute in general, the sample average approximation can be used to ensure convergence almost surely.
Our simulations show that the damping step may be unnecessary for convergence. We will investigate this case as future research.
References
 [1] F. Facchinei and C. Kanzow, “Generalized Nash equilibrium problems,” Annals of Operations Research, vol. 175, no. 1, pp. 177–211, 2010.
 [2] J. B. Rosen, “Existence and uniqueness of equilibrium points for concave n-person games,” 1964.
 [3] F. Facchinei and J.-S. Pang, Finite-dimensional variational inequalities and complementarity problems. Springer Science & Business Media, 2007.
 [4] L. Pavel, “An extension of duality to a game-theoretic framework,” Automatica, vol. 43, no. 2, pp. 226–237, 2007.
 [5] A. A. Kulkarni and U. V. Shanbhag, “On the variational equilibrium as a refinement of the generalized Nash equilibrium,” Automatica, vol. 48, no. 1, pp. 45–55, 2012.
 [6] G. Belgioioso and S. Grammatico, “Projected-gradient algorithms for generalized equilibrium seeking in aggregative games are preconditioned forward-backward methods,” in 2018 European Control Conference (ECC). IEEE, 2018, pp. 2188–2193.
 [7] P. Yi and L. Pavel, “An operator splitting approach for distributed generalized Nash equilibria computation,” Automatica, vol. 102, pp. 111–121, 2019.
 [8] G. Belgioioso and S. Grammatico, “Semi-decentralized Nash equilibrium seeking in aggregative games with separable coupling constraints and non-differentiable cost functions,” IEEE Control Systems Letters, vol. 1, no. 2, pp. 400–405, 2017.
 [9] H. H. Bauschke, P. L. Combettes et al., Convex analysis and monotone operator theory in Hilbert spaces. Springer, 2011, vol. 408.
 [10] C.-K. Yu, M. Van Der Schaar, and A. H. Sayed, “Distributed learning for stochastic generalized Nash equilibrium problems,” IEEE Transactions on Signal Processing, vol. 65, no. 15, pp. 3893–3908, 2017.
 [11] U. Ravat and U. V. Shanbhag, “On the characterization of solution sets of smooth and nonsmooth convex stochastic Nash games,” SIAM Journal on Optimization, vol. 21, no. 3, pp. 1168–1199, 2011.
 [12] H. Xu and D. Zhang, “Stochastic Nash equilibrium problems: sample average approximation and applications,” Computational Optimization and Applications, vol. 55, no. 3, pp. 597–645, 2013.
 [13] D. Watling, “User equilibrium traffic network assignment with stochastic travel times and late arrival penalty,” European Journal of Operational Research, vol. 175, no. 3, pp. 1539–1556, 2006.
 [14] R. Henrion and W. Römisch, “On M-stationary points for a stochastic equilibrium problem under equilibrium constraints in electricity spot market modeling,” Applications of Mathematics, vol. 52, no. 6, pp. 473–494, 2007.
 [15] V. DeMiguel and H. Xu, “A stochastic multiple-leader Stackelberg model: analysis, computation, and application,” Operations Research, vol. 57, no. 5, pp. 1220–1235, 2009.
 [16] I. Abada, S. Gabriel, V. Briat, and O. Massol, “A generalized Nash–Cournot model for the northwestern European natural gas markets with a fuel substitution demand function: The GaMMES model,” Networks and Spatial Economics, vol. 13, no. 1, pp. 1–42, 2013.
 [17] R. I. Bot, P. Mertikopoulos, M. Staudigl, and P. T. Vuong, “Forward-backward-forward methods with variance reduction for stochastic variational inequalities,” arXiv preprint arXiv:1902.03355, 2019.
 [18] A. Iusem, A. Jofré, R. I. Oliveira, and P. Thompson, “Extragradient method with variance reduction for stochastic variational inequalities,” SIAM Journal on Optimization, vol. 27, no. 2, pp. 686–724, 2017.
 [19] J. Koshal, A. Nedic, and U. V. Shanbhag, “Regularized iterative stochastic approximation methods for stochastic variational inequality problems,” IEEE Transactions on Automatic Control, vol. 58, no. 3, pp. 594–609, 2013.
 [20] F. Yousefian, A. Nedić, and U. V. Shanbhag, “On smoothing, regularization, and averaging in stochastic approximation methods for stochastic variational inequality problems,” Mathematical Programming, vol. 165, no. 1, pp. 391–431, 2017.
 [21] ——, “Optimal robust smoothing extragradient algorithms for stochastic variational inequality problems,” in 53rd IEEE Conference on Decision and Control. IEEE, 2014, pp. 5831–5836.
 [22] L. Rosasco, S. Villa, and B. C. Vũ, “Stochastic forward–backward splitting for monotone inclusions,” Journal of Optimization Theory and Applications, vol. 169, no. 2, pp. 388–406, 2016.
 [23] D. P. Palomar and Y. C. Eldar, Convex optimization in signal processing and communications. Cambridge University Press, 2010.
 [24] F. Facchinei, A. Fischer, and V. Piccialli, “On generalized Nash games and variational inequalities,” Operations Research Letters, vol. 35, no. 2, pp. 159–164, 2007.
 [25] A. N. Iusem, A. Jofré, R. I. Oliveira, and P. Thompson, “Variance-based extragradient methods with line search for stochastic variational inequalities,” SIAM Journal on Optimization, vol. 29, no. 1, pp. 175–206, 2019.
 [26] A. Kannan and U. V. Shanbhag, “The pseudomonotone stochastic variational inequality problem: Analytical statements and stochastic extragradient schemes,” in 2014 American Control Conference. IEEE, 2014, pp. 2930–2935.