Probabilistic sensitivity of Nash equilibria in multi-agent games: a wait-and-judge approach

03/25/2019 ∙ by Filiberto Fele, et al. ∙ University of Oxford

Motivated by electric vehicle charging control problems, we consider multi-agent noncooperative games where, following a data-driven paradigm, unmodelled externalities acting on the players' objective functions are represented by means of scenarios. Building upon recent developments in scenario-based optimization, we accompany the Nash equilibria of the uncertain game with an a posteriori probabilistic robustness certificate, based on the evaluation of the computed solution; this provides confidence on the probability that the computed solution remains unaffected when a new uncertainty realisation is encountered. The latter constitutes, to the best of our knowledge, the first application of the so-called scenario approach to multi-agent Nash equilibrium problems. The efficacy of our approach is demonstrated in simulation for the charging coordination of an electric vehicle fleet.


I Introduction

The role of game theory in the management and control of cyberphysical systems [1] is now well established, supported by a significant research effort over the last decades [2]. The significance of the concept of Nash equilibrium (NE) in this context stems from its ability to predict the resolution of conflicts between selfish, rational agents [3]. NE problem formulations have become popular candidates for distributed and decentralized control architectures, as they naturally lend themselves to price-based implementations, easily fitting the operational requirements of liberalized resource markets [4]. Applications in smart grids, transportation, and IT are envisioned among the possible beneficiaries [5, 6, 7, 8, 9]. In connection with public infrastructures, a large number of studies followed the seminal work of Wardrop [10], focusing on the relation between the achieved equilibria and the social welfare; among recent results, we mention [11, 12, 7, 13, 14].

According to the definition of NE [15], the aforementioned studies address deterministic settings where the system conditions (e.g., prices in an incentive scheme) are assumed to be known in advance. Such an assumption is not reflected in realistic scenarios, where the relevant information may be subject to significant uncertainties: this concern becomes paramount in the framework of cyberphysical systems, where the heterogeneity and complexity of the interactions between the constituting parts heavily hinder the task of predicting the system behaviour [16, 17, 18]. Indeed, ever since the assumption of complete knowledge of the game was questioned in [19], uncertainty has been widely addressed in noncooperative games by adopting stochastic or robust (worst-case) approaches. In the first case, both chance-constrained (risk-averse) [20, 21] and expected-payoff criteria [22, 23, 24, 25, 26] have been considered. By construction, these studies hinge on given hypotheses on the underlying probability distribution of the uncertainty. In the second case, results build upon robust control theory [3, 27], and as such they hold for given characterizations of the uncertainty set. We refer the reader to [28, 16] for an extended literature review.

We depart from previous robust game-theoretic literature by characterizing uncertainty on the objective function through a data-driven methodology. In particular, our results are founded on recent developments in scenario optimization [29]. In this framework, uncertainty is characterized by a finite set of scenarios [30] obtainable, e.g., from historical data [31]. An essential feature of the proposed approach is that it allows probabilistic performance certificates to be attached to the solution, without requiring any knowledge of the uncertainty beyond the available set of scenarios used in the solution computation [32]. Notably, the results presented in [33] constitute a breakthrough in this sense: under mild convexity assumptions, they provide tight a priori bounds on the probability of constraint violation of a scenario program solution. In the present context, however, these results can be quite conservative, as they depend on the dimension of the decision space: in the considered setting, a considerable overhead is due to the size of the scenario set. Moreover, in multi-agent games, this translates into a number of decision variables that increases with the number of players. Nonetheless, the theory presented in [29] allows us to circumvent this problem by means of an a posteriori analysis of the solution: in this case, the only determining factor in the robustness performance is the cardinality of the subset of samples constraining the solution.

In this paper, we contribute to the state of the art of NE solution algorithms in the following respects:

• We address the robustness of strategic equilibria in multi-agent settings, characterized as NEs of noncooperative minmax games. We propose a partially-cooperative approach to achieve robustness to uncertainty globally affecting the players’ objective functions. We note that although this approach shares theoretical grounds with [34], we depart from it by admitting the presence of multiple maximisers. Besides expanding the scope of the proposed game model, this is pivotal to the application of the adopted scenario-based framework.

• We propose a decentralized solution approach: by relying on the action of a coordinator, no communication between the agents is required, thus relaxing the communication requirements of related results (e.g., [25]). To circumvent the nondifferentiability of the game, which hinders the application of conventional decentralised techniques, we follow the methodology in [35] and resort to an augmented game. Major advantages of this approach are that the solution computation enjoys the same convergence properties as state-of-the-art decentralized algorithms for monotone games [8], without imposing strong monotonicity assumptions that may not be satisfied for games of practical interest, and that the uncertain component of the objective function can fall within the broader class of weakly convex functions [35]. We note that, as an alternative, the minmax problem could be cast into the class of generalized Nash games via an epigraph reformulation; however, decentralized solution algorithms for this class of games generally impose restrictive assumptions, e.g., affine coupling constraints [9, 7, 11], which do not necessarily fit the format of the resulting epigraphic constraint.

• We adopt a data-driven paradigm and represent the unmodelled uncertainty by means of scenarios [30]. We then accompany the solution with prescribed robustness levels, while dropping the standard requirements on the knowledge of the uncertainty [29]. To the best of our knowledge, this work takes the first step towards the formalization of scenario-based robustness in noncooperative games. In doing so, we overcome significant limitations of stochastic approaches, namely the computational cost of Monte Carlo simulations and the absence of guarantees for the solution.

The rest of the paper is structured as follows. In Section II we introduce a motivating example, and then provide a more general formulation of the addressed noncooperative minmax problem. We present the main result of the paper in Section III. Section IV introduces necessary ingredients for the proof, described in Section V along with a methodology for the enumeration of the uncertainty samples supporting the NE. Finally, numerical results are illustrated on an electric-vehicle charging coordination problem.

II Problem statement

II-A Motivating example: coordinated EV charging

Let the set designate the finite population of EV agents. The demand profile of the EVs (henceforth the strategy) of agent must fulfil individual constraints described by the set . We denote by , with , the collection of strategies relative to all the agents, where .

Each agent determines its strategy as a response to a pricing signal received from a coordinator, and synthesized as a function of the global strategy ; such a signal can represent, for instance, the variable unit cost of a consumed resource. The possible influence of uncertainty (e.g., externalities acting on the energy spot market) on the price is modelled by the parameter . In particular, we assume that a nominal and an uncertain component can be distinguished: thus, given and , the utility of agent is

where and respectively represent the nominal and the uncertain component of the price.
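For illustration only, since the original symbols are not reproduced above, the following sketch shows the kind of utility meant here; all symbols are placeholders: p(·) is a nominal unit price depending on the aggregate demand σ(x), and d(δ) is the uncertain price component.

```latex
% Illustrative sketch only; all symbols are placeholders, not the paper's notation.
J_i(x_i, x_{-i}, \delta) \;=\; \bigl(\, p(\sigma(x)) + d(\delta) \,\bigr)^{\!\top} x_i ,
\qquad
\sigma(x) \;:=\; \textstyle\sum_{j \in \mathcal{N}} x_j .
```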

II-B Noncooperative minmax game

To cope with the uncertain component of the price, we consider the minimisation of its average impact on the value of , for all . In other words, each agent computes

(1)

where is defined as

(2)

We can write (2) in a more general form as

(3)

where expresses local objectives as a function of local and (possibly) global strategies whereas, for some common objective , evaluates the worst-case realization over the uncertainty set . As a consequence, the solution of (1) entails a certain level of cooperation between the agents, which can be facilitated in settings where their interests are (at least to some degree) aligned (e.g., electric vehicles participating in the same aggregation plan, or belonging to a centrally managed fleet).
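Read with placeholder symbols (f_i for the local objective, φ for the common uncertain objective, Δ for the uncertainty set), (3) amounts to a local term plus a worst-case common term:

```latex
% Plausible reading of (3); f_i, \varphi and \Delta are placeholder symbols.
J_i(x_i, x_{-i}) \;=\; f_i(x_i, x_{-i}) \;+\; \max_{\delta \in \Delta}\, \varphi(x, \delta) .
```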

The setting described above is modelled by a noncooperative minmax game, defined by the tuple , where is the set of players, , are respectively the strategy set and the cost function for each player , and is the uncertainty set.

We consider the following blanket assumptions:

Assumption 1

For every fixed and , the function is convex and continuously differentiable for all . Furthermore, the local constraint set is nonempty, compact and convex for all , and is bounded.

Assumption 2

Let be an open convex set such that . The function is twice differentiable on and, for each , is twice differentiable on .

We note that for to be convex it is sufficient that is weakly convex on and is weakly convex on , with constant and respectively, such that ; for a definition of weak convexity see (17)–(18).
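For reference, the standard notion of weak convexity presumably invoked by (17)–(18) is the following (placeholder notation): a differentiable function g is ρ-weakly convex on a convex set X if

```latex
g(y) \;\ge\; g(x) + \nabla g(x)^{\top}(y - x) \;-\; \tfrac{\rho}{2}\,\|y - x\|^{2}
\qquad \forall\, x, y \in X ,
```

equivalently, g(x) + (ρ/2)‖x‖² is convex. The condition quoted above then asks that the curvature deficit of the weakly convex terms be compensated so that the overall objective remains convex.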

III Robust noncooperative game

The solution of the robust game requires each agent to solve a minmax optimisation problem. Even in cases where this is numerically tractable (for example when the uncertainty of the value of is well-behaved), the attainment of a robust solution still requires the full characterisation of the set .

III-A Scenario-based approach

Here we suppose instead that, however large, the available set of data (e.g., historical data) only allows a partial characterisation of the uncertainty. Following a data-based approach, it is still possible to consider a “sampled” version of the problem, using a finite set of uncertainty realizations . This results in the finite minmax game , where

(4)

and approximates in (3). We consider the following solution concept for :

Definition 3 (Nash equilibrium)

Let denote the set of Nash equilibria of , defined as

(5)
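In standard form (again with placeholder symbols), (5) is the usual fixed-point condition of best responses for the sampled game:

```latex
% Placeholder notation: X_i are the local strategy sets, J_i^K the sampled costs in (4).
\mathcal{X}^{\star}_{K} \;=\; \Bigl\{\, x^{\star} \in \mathcal{X} \;:\;
  J_i^{K}(x_i^{\star}, x_{-i}^{\star}) \,\le\, J_i^{K}(x_i, x_{-i}^{\star})
  \quad \forall\, x_i \in \mathcal{X}_i,\ \forall\, i \in \mathcal{N} \,\Bigr\} .
```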

We point out that is a random variable, subject to the particular extraction of elements from . As a consequence, the composition of the set is subject to the given multiextraction . For notational simplicity, we will not make this dependence explicit in the rest of the document.

III-B A posteriori robustness certification

A question that naturally arises is how robust a solution is against unknown scenarios—i.e., scenarios not included in the available dataset. In the remainder of this section we show that a formal answer to this question can be provided through a sensitivity analysis of the solution over the set of samples used for its derivation. Most importantly, this estimate can be performed without requiring any further knowledge of the uncertainty except for the aforementioned samples. The main result—relying on the developments in scenario-based optimization recently presented in [29]—is a robustness certificate that quantifies the probability for a NE of the approximated game to remain an equilibrium when a new scenario of uncertainty is realized. The statement of this result relies on the definition of a σ-algebra on the uncertainty set , with assigned probability . This confers the necessary measurability properties on the probability space, enabling the use of fundamental results from statistical learning theory; see, e.g., [32] and references therein.

In order to proceed we provide some basic definitions. Let be a single-valued mapping from the set of -multisamples to the set of equilibria of ; for notational convenience, the dimension of the domain of is to be considered implicitly determined by the size of the multisample.

Definition 4 (Support sample [33])

Fix any i.i.d. multisample , and let be a solution of the finite minmax game . Let be the solution obtained by discarding the sample . We call the latter a support sample if .

The next definition is adapted from [29]:

Definition 5 (Compression set)

Fix any i.i.d. multisample , and let be a solution of the finite minmax game . Consider any subset , and let . We call a compression set if .

The notion of compression set has appeared in the literature under different names; its properties are studied in full detail in [29], where it is designated as support subsample. Here we adopt the term compression set as in [37, 36] to avoid confusion with Definition 4.

Let be the collection of all compression sets associated with the -multisample . We refer to the size (where returns the cardinality of its argument) of some compression set as its compression cardinality. Note that —hence in general also —is itself a random variable as it depends on the multisample .

Given , let designate the NE set of the game , defined over the scenarios . Then, for all , let

(6)

Finally, following [29], let be a function satisfying

(7)

for any fixed . We make the following assumption:

Assumption 6

Fix any -multisample , and let . It holds that

(8)

We can now state our main result:

Theorem 7

Fix and let be defined as in (7). Under Assumptions 1–2 the following hold:

  1. There exists a single-valued decentralized mapping from the set of feasible strategies to the set of equilibria of ;

  2. Consider also Assumption 6, and let , with being independent random samples from . Then

    (9)

    where is the cardinality of any given compression set of .

See Section V-B.

Theorem 7 (point (i)) shows that a single-valued mapping indeed exists, and provides a decentralized way to construct it without imposing (standard) strong monotonicity requirements on the game. Theorem 7 (point (ii)) shows that any solution returned by this mapping can be endowed with probabilistic guarantees on its robustness against uncertainty. The level of guarantee is determined in a wait-and-judge fashion [38], as it depends on the observed compression cardinality , which in turn depends on the samples .
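To make the wait-and-judge bound concrete, the sketch below computes a violation level ε(k) for each possible compression cardinality k, using the common choice that splits the confidence budget β evenly across cardinalities; this is one admissible choice satisfying a condition of the form (7) in [29], and the function names are illustrative rather than taken from the paper.

```python
from math import exp, lgamma, log

def log_binom(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def epsilon(k: int, K: int, beta: float) -> float:
    """Violation level for observed compression cardinality k out of K samples.

    Uses eps(k) = 1 - (beta / (K * C(K, k)))**(1 / (K - k)), a standard choice
    satisfying sum_{k=0}^{K-1} C(K, k) * (1 - eps(k))**(K - k) = beta.
    """
    if k >= K:
        return 1.0
    log_term = (log(beta) - log(K) - log_binom(K, k)) / (K - k)
    return 1.0 - exp(log_term)

# Example: K = 500 samples, confidence 1 - beta with beta = 1e-6,
# and an observed compression cardinality of 10.
print(epsilon(10, 500, 1e-6))
```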

III-C Discussion

A fundamental interpretation of Theorem 7 is that it quantifies, with a given confidence level, the probability that the NE , computed on the randomly extracted samples , remains a solution of the game when a new sample is produced. We wish to emphasize that the nature of Assumption 6 is generally nonrestrictive. For instance, it is satisfied if is a continuous distribution and is a finite-order polynomial, so that the set of maximisers of lies in a lower-dimensional manifold of the space where the uncertainty set resides. The example of Section VI, where the price function depends affinely on , satisfies this assumption. A significant result still holds should Assumption 6 be relaxed. To this end, let . From arguments discussed in detail in Section V-B, it follows that

(10)

As a consequence of (10), the probability that the (predicted) individual utility achieved by a given NE of is not attained under a new uncertainty realization is, with confidence at least , lower than .

A tighter bound could be obtained by means of the results of [38]; however, this would require the imposition of a non-degeneracy assumption on the problem, which cannot generally be verified even in convex settings. The notion of degeneracy (linked to the redundancy of support samples in convex settings) implies that solving a problem using only the support samples does not lead to the same solution that would have been obtained had all the samples been employed—i.e., support samples as identified by Definition 4 form a strict subset of any compression set in . (For a deeper discussion on degeneracy and minimal support in scenario-based contexts, we refer the reader to [39, 38].) Moreover, here we only assume that, for any , is weakly convex, thus imposing a non-degeneracy assumption would be quite restrictive. So as not to restrict the class of problems considered, we limit attention to confidence levels in the form of (7).

Note that, in practice, is bounded by by Definition 5. An a posteriori estimate of the compression cardinality can be obtained through different methodologies, whose design may be tuned to the specific case. It follows from (9) that the closer the estimate is to the minimal cardinality of the compression sets in , the stronger the probabilistic guarantees on the robustness performance of the solution. For completeness, Algorithm 1 reproduces from [29, §II] a greedy procedure of general application, which is a straightforward implementation of Definition 4.

1:, ;
2:for all  do
3:     ;
4:     if  then
5:          ;
6:     end if
7:end for
Algorithm 1 A posteriori computation of a compression set

Algorithm 1 allows us to estimate (an upper bound on) the minimal compression cardinality. In the presence of degenerate samples, the algorithm can be run repeatedly to achieve an irreducible compression set. However, Algorithm 1 presents two major drawbacks: the computational cost is generally high, as the value of —i.e., a solution of the game —is required at least times; in practice, limited numerical accuracy makes the evaluation of the condition at Step 3 a hard task. Nevertheless, we show in Section V-A that an efficient approach—based on the direct observation of the solution—is possible in the considered setting.
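For concreteness, a minimal sketch of the greedy procedure just described follows; all identifiers are hypothetical, and `solve` stands for any single-valued solution map, such as the decentralized scheme of Section IV.

```python
import numpy as np

def greedy_compression_set(samples, solve, tol=1e-6):
    """A posteriori estimate of a compression set, in the spirit of Algorithm 1.

    `samples` is the list of K uncertainty realizations and `solve` maps a list
    of samples to the (single-valued) equilibrium computed on them. A sample is
    kept if discarding it changes the solution, i.e., if it is of support
    according to Definition 4.
    """
    x_star = solve(samples)
    support = []
    for k in range(len(samples)):
        x_minus_k = solve(samples[:k] + samples[k + 1:])
        # Limited numerical accuracy makes this equality check delicate in practice.
        if not np.allclose(x_minus_k, x_star, atol=tol):
            support.append(k)
    return support
```

As noted above, this requires one equilibrium computation per sample plus one on the full multisample; in the presence of degeneracy, repeated runs on the retained samples can shrink the estimate further.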

IV Decentralized NE computation

In this section we study in more detail the existence of a single-valued mapping , necessary for the proof of Theorem 7. In particular, we consider the case in which the image of corresponds to the limit of a decentralized solution algorithm for the game . To address this, we characterise the NEs of as solutions of variational inequalities (VIs). Established results in this framework provide sufficient conditions for the existence of equilibria, and set the foundations for the design of decentralized solution procedures [8].

IV-A VI analysis

At the core of the use of the VI framework for modelling the solutions of noncooperative games is the correspondence between the so-called VI problem, which takes the form (for a given domain )

(11a)
(11b)

and the first-order optimality conditions corresponding to a NE (see Definition 3) [40, §1.4.2]. This model naturally hinges on the differentiability of the problem at hand; however, we can observe that, due to the max operator, the players' objective functions defining are in general nondifferentiable.
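In its standard form (placeholder symbols), the VI problem referred to in (11) reads: find x* in the feasible set X such that

```latex
F(x^{\star})^{\top}\,(x - x^{\star}) \;\ge\; 0 \qquad \forall\, x \in \mathcal{X} ,
```

where F collects the partial gradients of the players' objectives; under convexity and differentiability, a NE solves this VI and vice versa [40, §1.4.2].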

With this in mind, let us define the augmented game between players. In this game, each agent , given and , will compute

(12)

where follows from the equivalence

(13)

with being the simplex in  [41, Lemma 6.2.1]. The additional player (the coordinator), given , will act instead as a maximizing player for the uncertain component of , ,

(14)
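The equivalence underlying (13) is the standard rewriting of a finite max as an optimization over the probability simplex (placeholder notation; cf. [41, Lemma 6.2.1]):

```latex
\max_{k \in \{1,\dots,K\}} \varphi\bigl(x, \delta^{(k)}\bigr)
  \;=\; \max_{\lambda \in \Lambda_K} \; \sum_{k=1}^{K} \lambda_k\, \varphi\bigl(x, \delta^{(k)}\bigr),
\qquad
\Lambda_K \;:=\; \Bigl\{\, \lambda \in \mathbb{R}^{K}_{\ge 0} : \textstyle\sum_{k=1}^{K} \lambda_k = 1 \,\Bigr\} ,
```

so the coordinator's variable can be read as a weighting of the scenarios, which is what renders the augmented game differentiable.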

The following relation holds between and :

Lemma 8 (Thm. 1 [35])

Let the pair be a NE of the game . Then is a NE of .

Note that is differentiable under Assumption 1. Therefore this fundamental result (holding for any randomly extracted set of uncertainty realizations ) enables the characterization of the NEs of by means of VIs.

We start by defining the mapping as the pseudo-gradient [40, Sec. 1.4.1]

(15)

By letting and we see that (11b) represents the concatenation of the first-order optimality conditions for the individual problems described by (12) and (14). In the following, we refer to the problem described by (11) and (15) as VI.
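A plausible form of the pseudo-gradient in (15), under the placeholder notation used in the sketches above (the exact symbols are the authors' own), is the stacking of the agents' partial gradients together with the coordinator's:

```latex
% x collects the agents' strategies, \lambda the coordinator's simplex variable.
F(x, \lambda) \;=\;
\begin{bmatrix}
  \nabla_{x_1} \tilde J_1(x_1, x_{-1}, \lambda) \\
  \vdots \\
  \nabla_{x_N} \tilde J_N(x_N, x_{-N}, \lambda) \\
  -\,\nabla_{\lambda} \tilde J_{N+1}(x, \lambda)
\end{bmatrix},
```

with the sign flip on the last block accounting for the fact that player N+1 maximizes.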

It turns out that under Assumption 1 any NE of can be expressed as the solution of the VI [40, Prop. 1.4.2]. A link with the equilibria of is formalised next:

Proposition 9

Let Assumption 1 hold, and be a solution of the VI. Then

  1. is a NE of ;

  2. The set is nonempty.

(i): By [40, Prop. 1.4.2], is a solution of if and only if it solves the VI; then the statement follows readily from Lemma 8. (Compactness of in Assumption 1 is only needed for (ii); closedness is sufficient for (i).) (ii): Given Assumption 1 and the compactness of , the VI has at least one solution [40, Cor. 2.2.5]. Nonemptiness of then follows from the previous point.

IV-B Monotonicity of

The development of algorithms for the solution of VI problems relies upon the monotonicity of the mapping in (11), which plays a role analogous to convexity in optimization [8].

Definition 10 (Monotonicity)

A mapping $F : \mathcal{X} \to \mathbb{R}^{n}$, with $\mathcal{X}$ closed and convex, is

  • monotone on $\mathcal{X}$ if $(F(x) - F(y))^{\top}(x - y) \ge 0$, and

  • strongly monotone on $\mathcal{X}$ if there exists $c > 0$ such that $(F(x) - F(y))^{\top}(x - y) \ge c\,\|x - y\|^{2}$,

for all $x, y \in \mathcal{X}$.

The following result is instrumental in our discussion:

Lemma 11

Let Assumptions 1 and 2 hold. Then

  1. in (15) is monotone on ;

  2. The game may admit multiple NEs.

(i): First, note that by Assumption 2 is continuously differentiable on its domain. Let and respectively denote the first and the last rows of , i.e., , and . By definition of Jacobian we have

(16)

where and (similar definitions apply to the remaining terms). Assumption 1 implies the existence of and such that, for all ,

(17)
(18)

with for the convexity assumption to hold. Summing the above inequalities yields

(19)

which corresponds to and in turn, from (16), implies for all . The statement then follows directly from [40, Prop. 2.3.2].
(ii): By [8, Thm. 41], the monotonicity of implies that the VI may admit multiple solutions: this, together with [40, Prop. 1.4.2]—stating the correspondence between the solutions of the VI and the NEs of —concludes the proof.

IV-C Decentralized algorithm for monotone VI and equilibrium selection

There are two main challenges: firstly, due to the possible presence of multiple equilibria, standard decentralized algorithms for VIs are not guaranteed to converge on monotone problems; a stronger condition, namely strong monotonicity, is required on the VI mapping . Secondly, once the previous point is addressed, a tie-break rule needs to be put in place to select a unique solution in the presence of multiple NEs and fulfil the single-valued character of as required by Theorem 7.

A solution to the first issue comes from [42, 8]: these results show that proximal algorithms can be employed to retrieve a solution of a monotone VI by solving a particular sequence of strongly monotone problems, derived by regularizing the original problem. We wish to emphasize that, although a single solution is obtained through these techniques, a deterministic behaviour of the associated iterative algorithm cannot be ensured (a deeper discussion on this point is outside the scope of this paper). In other words, the aforementioned algorithms still correspond to multi-valued mappings. Interestingly, the work of [8] provides a workaround for this, hence allowing the second issue to be addressed as well. In particular, [8, Algorithm 4] allows one to solve, as a specific case,

(20a)
(20b)

where is monotone. By (20) the minimum-norm equilibrium of the game can be specified as the limit point of the algorithm, thus recovering a suitable formulation for the single-valued mapping .
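Read with placeholder symbols, (20) is the selection of the minimum-norm point among the solutions of the monotone VI:

```latex
\min_{x} \;\; \|x\|^{2}
\qquad \text{s.t.} \quad x \in \mathrm{SOL}(\mathcal{X}, F) ,
```

where SOL(X, F) denotes the solution set of the VI in (11); selecting one such point is what makes the overall mapping single-valued.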

Therefore, we proceed by considering the regularized game where and are the designated step size and centre of regularization, respectively. Then, given the tuple , each player solves the following problem

(21)

while the coordinator (player ), given , solves

(22)

with . Note that Assumption 1 still holds for (21)–(22). By taking the pseudo-gradient of the above as in (15), we have from [40, Prop. 1.4.2] that is a NE of if and only if it satisfies the VI

(23)
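Before moving on, a plausible rendering of the regularized problems (21)–(22), under the same placeholder notation (τ > 0 is the regularization weight and (x̄, λ̄) the centre of regularization):

```latex
\min_{x_i \in \mathcal{X}_i} \;\; \tilde J_i(x_i, x_{-i}, \lambda) \;+\; \tfrac{\tau}{2}\,\|x_i - \bar x_i\|^{2} ,
\qquad
\max_{\lambda \in \Lambda_K} \;\; \sum_{k=1}^{K} \lambda_k\, \varphi\bigl(x, \delta^{(k)}\bigr) \;-\; \tfrac{\tau}{2}\,\|\lambda - \bar\lambda\|^{2} .
```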

The next lemma is key to the use of decentralized algorithms for strongly monotone VIs in our case.

Lemma 12

Let Assumptions 1 and 2 hold; let be defined as in (15), and given. Then, for any and , the regularized game defined by (21)–(22) has a unique NE.

Let . We note that since, for all ,

(24)

where the last inequality follows from Lemma 11. By definition of convexity this implies for all , corresponding to the definition of strongly monotone mapping. The statement is then implied by [8, Thm. 41].

Now let denote the solution of the VI. Building on Lemma 12, the idea is to achieve a NE of by updating the centre of regularization of on the basis of an iterative method in the form , until convergence to the fixed point corresponding to the NE of satisfying (20). Algorithm 2 allows us to establish such a connection, as formalised in Lemma 13.

1:, , ,
2:
3:repeat
4:     
5:     repeat
6:          for  do
7:                                    
8:          end for
9:                                   
10:          
11:     until 
12:     
13:     
14:until 
Algorithm 2 Proximal decomposition algorithm
Lemma 13 (Thm. 21 [8])

Consider the minmax game defined by (12)–(14) and the regularized augmented game defined by (21)–(22), and let Assumptions 1 and 2 hold. Let be any sequence satisfying for all , , and . Let be such that with steps 3–11 of Algorithm 2 constitute a block contraction [43], and let denote the sequence generated by the Algorithm. For any , there exists such that is bounded, and solution of (20) such that for . Moreover, .

By [8, Thm. 21], Algorithm 2 asymptotically converges to a solution of (20). By Proposition 9, (20b) is equivalent to the game , whose solution set is nonempty and—by Lemma 8—contained in . The main implication of Lemma 13 is to provide a practical implementation of the single-valued mapping , thus formally proving point (i) in Theorem 7.
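The double-loop structure of Algorithm 2 can be sketched as follows; the interface (`best_response`, `coord_response`, the Jacobi-style inner sweep and the stopping tolerances) is hypothetical and chosen only to expose the regularize, solve, and recentre logic, not to reproduce the paper's exact update rules.

```python
import numpy as np

def proximal_decomposition(best_response, coord_response, x0, lam0,
                           tol_inner=1e-6, tol_outer=1e-6, max_iter=10_000):
    """Sketch of a proximal decomposition scheme in the spirit of Algorithm 2.

    best_response[i](x, lam, x_bar_i) returns agent i's solution of the
    tau-regularized problem (21); coord_response(x, lam_bar) returns the
    coordinator's solution of (22). Returns an approximate fixed point,
    i.e., a NE of the sampled game selected as in (20).
    """
    x_bar, lam_bar = np.array(x0, dtype=float), np.array(lam0, dtype=float)
    for _ in range(max_iter):                       # outer loop: recentre the regularization
        x, lam = x_bar.copy(), lam_bar.copy()
        for _ in range(max_iter):                   # inner loop: solve the regularized game
            x_new = np.array([best_response[i](x, lam, x_bar[i])
                              for i in range(len(best_response))])
            lam_new = coord_response(x_new, lam_bar)
            done = max(np.linalg.norm(x_new - x), np.linalg.norm(lam_new - lam)) < tol_inner
            x, lam = x_new, lam_new
            if done:
                break
        if max(np.linalg.norm(x - x_bar), np.linalg.norm(lam - lam_bar)) < tol_outer:
            return x, lam                           # fixed point of the centre update
        x_bar, lam_bar = x, lam                     # Picard-type update of the centre
    return x_bar, lam_bar
```

In the paper, the step size and the averaged update of the centre follow the conditions stated in Lemma 13; the sketch above uses a plain recentring for brevity.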

V A posteriori analysis

V-A Computation of the compression cardinality

We note that (20) is convex under Assumption 1 [8, §IV-C]; hence, by Definition 4, the set of support samples is necessarily a subset of the active constraints. From this observation we derive the following result, providing an efficient means of estimating the compression cardinality for the considered noncooperative minmax game:

Proposition 14

Let Assumptions 1–2 hold, and consider the function defined in (12). For any solution of the augmented game , there exists a compression set of cardinality , where .

For any fixed , let us consider the following epigraphic reformulation of

(25a)
(25b)
(25c)

for all , and let be a given equilibrium. Then will express the maximum value—common to all agents—of , achieved for some . We can thus designate the active uncertainty samples by the set ; this gives a practical bound for the compression cardinality as . By direct computation of the KKT optimality conditions [41, §6.2.1] corresponding to (25) and to (14), respectively, it can be verified that the decision variable introduced in the augmented game of (12)–(14) is a shadow price for the constraint (25c). By the complementary slackness condition we then have

(26)

hence . We distinguish two cases: in the first case, (25c) is non-degenerate, then and the claim holds; we show next that the claim also holds when (25c) is degenerate. In this case, we have by (26) that the corresponding shadow price for the inactive samples is 0; for all active samples , on the other hand, . Now, we observe that any active sample belongs either to the set of support samples (henceforth denoted as ), or to the set , for at least a , i.e., it is part of some compression set (see Definition 5) but it is not of support as per Definition 4. We recall that samples belonging to both categories are active by construction, and that for all .

Now, pick any , and suppose by contradiction that, for any such that , it holds . By (12), for all agents this implies that the best response—as computed via —will only depend on the samples in , since those in vanish from as their corresponding . This is tantamount to saying that the samples in form a compression set, i.e., , as the solution computed using the entire -multisample does not change when using only those in . However, this contradicts the degeneracy hypothesis, thus proving that for all there exists such that and , i.e., . To conclude, note that the same arguments above (and uniqueness of the solution of ) imply .

As a result, provides an exact enumeration of the support constraints in the non-degenerate case (hence of a minimal compression set); in the degenerate case, identifies some compression set of the -multisample used for the derivation of the robust NE . We wish to emphasize the a posteriori nature of this result, as both and depend on the particular extraction . It is also worth remarking that we do not provide guarantees here that an irreducible compression set is achieved by the methodology associated with Proposition 14; this can, however, be obtained by Algorithm 1, as specified in Section III-C (see also [29]). The important implication of Proposition 14 is that such an estimate is readily available with the solution of computed by means of Algorithm 2.
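In practice, then, the estimate of Proposition 14 can be read off directly from the solution returned by Algorithm 2; a minimal sketch follows, with a hypothetical interface: `phi_values[k]` is the common uncertain objective evaluated at the equilibrium under sample k, and `lam[k]` the coordinator's weight (shadow price) on that sample.

```python
import numpy as np

def compression_cardinality_estimate(phi_values, lam, tol=1e-8):
    """Bound on the compression cardinality, in the spirit of Proposition 14.

    A sample is active if it attains the common worst-case value; by
    complementary slackness, a strictly positive coordinator weight implies
    the sample is active. The number of active samples bounds the
    compression cardinality from above.
    """
    phi_values = np.asarray(phi_values, dtype=float)
    lam = np.asarray(lam, dtype=float)
    active = np.flatnonzero(phi_values >= phi_values.max() - tol)
    positive = np.flatnonzero(lam > tol)
    # Sanity check suggested by complementary slackness: positive weights are active.
    assert set(positive) <= set(active)
    return len(active)
```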

V-B Proof of Theorem 7

(i): A proof for the existence of a single-valued mapping is provided by the construction of Section IV-C. In this regard, we emphasize that Assumptions 1–2 are specifically tailored to our interest in a multi-agent setting: they may be relaxed in case corresponds to a centralized algorithm.
(ii): Fix any . Let , and recall that due to point (i) this is a single-valued mapping from to the decision space. Consider also the (unique) pair , where is as in (25). For any , let . It follows readily that

(27)

By the last statement and point (i), [29, Assum. 1] is fulfilled. Therefore, by [29, Thm. 1] we have that

(28)

where is the cardinality of some compression set in ; recall that depends on . Note that (28) corresponds to (10).

We now proceed to demonstrate the claim in (9). Recall that, by (15), (20) and Proposition 9, we can obtain as solution of the following optimization program (note the slight abuse of notation as by we denote both the optimizer and the corresponding decision variables).

(29a)