Potential Games Design Using Local Information

07/16/2018 · by Changxi Li, et al. · Harbin Institute of Technology; NetEase, Inc.

Consider a multiplayer game, and assume that a system level objective function, which the system wants to optimize, is given. This paper aims to accomplish this goal via potential game theory when players can obtain only part of the other players' information. The technique is to design a set of local information based utility functions which guarantee that the designed game is potential, with the system level objective function as its potential function. First, the existence of local information based utility functions can be verified by checking whether the corresponding linear equations have a solution. An algorithm is then proposed to calculate the local information based utility functions when the utility design equations have solutions. Finally, the consensus problem of multi-agent systems is considered to demonstrate the effectiveness of the proposed design procedure.


I Introduction

Game-theoretical control has drawn considerable attention in recent years due to its widespread applications. Some representative works include: (i) consensus/synchronization of multi-agent systems [1]; (ii) distributed optimization [2, 3]; (iii) control in wireless networks; (iv) optimization in energy [4] and transportation networks [5, 6], just to name a few.

Game-theoretical control uses game theory to solve control problems in interacting settings, such as multi-agent systems [2, 3]. Addressing such issues via game theory requires two steps. The first step is to view each agent as an intelligent, rational decision-maker in a game by defining a set of available actions and a utility function for every player. The second step is to specify a learning rule for the designed game so that the agents can reach a desirable situation, e.g., a Nash equilibrium. There are therefore two basic tasks in game-theoretic control: utility design and learning rule design [7]. Utility functions describe the components' incentives, and learning rules specify how each player processes its available information to formulate a decision.

Compared with traditional methods, the advantage of game-theoretical control is that it provides a modularized design architecture, i.e. we can design utility functions and learning rules separately [8]. The separation is described as an hourglass architecture in [7], which is shown in Fig. 1.

Fig. 1: Hourglass Architecture of Game-Theoretical Control

When designing games, one idea is to ensure that the designed game falls into some special class of games, such as potential games [9]. One advantage of designing the game as a potential game is that a variety of learning rules lead to a Nash equilibrium, e.g., myopic best response, log-linear learning, and fictitious play [10]-[12]. Several papers have been devoted to potential game based design in distributed control [13]-[15]. Other game design methods include wonderful life utility design [16], Shapley value utility design [17], congestion game based design [18], etc. However, most of the above works provide no systematic method for designing local information based utility functions. Here local information means that players can obtain only part of the other players' information when they play the designed game, as in networked games.

This paper focuses on providing a systematic method for designing finite potential games using local information. To the best of our knowledge, the most relevant works are [2] and [19], but our work differs from both. [2] provided a systematic methodology for designing potential games with continuous action sets, where the utility functions are local information based, and showed that for any given system level objective function there exists at least one way to design such utility functions. However, when we turn to games with finite action sets, existence is not guaranteed. As for [19], it presented a necessary and sufficient condition for the existence of local information based utility functions, but no systematic design method was provided. Furthermore, the learning rule used in [19] is better reply: with better reply, a local information based potential game converges to a Nash equilibrium, but that equilibrium may not be a maximum point of the system objective function.

The contributions of this paper are threefold: (i) A necessary and sufficient condition for the existence of local information based utility functions is obtained, which can be verified by checking whether the corresponding linear equations have solutions. (ii) A method for designing finite potential game using local information is presented when the linear equations have a solution. (iii) An example on consensus problem is provided to demonstrate the effectiveness of the design procedure.

The rest of this paper is organized as follows: Section II provides some preliminaries, including the semi-tensor product (STP) of matrices, game theory, and the problem description. Section III considers the design of potential games using local information. Section IV considers the application of the design method to the consensus problem. A brief conclusion is given in Section V.

Notations: $\mathbb{R}^n$ denotes the $n$-dimensional Euclidean space. $\mathcal{M}_{m\times n}$ is the set of $m\times n$ real matrices. $I_n$ is the $n$-dimensional identity matrix. $0_{m\times n}$ is the $m\times n$ zero matrix. $\delta_n^i$ denotes the $i$-th column of the identity matrix $I_n$, and $\Delta_n := \{\delta_n^i \mid i=1,\dots,n\}$. $\mathrm{Col}(M)$ ($\mathrm{Row}(M)$) is the set of columns (rows) of $M$. The transpose of a matrix $M$ is denoted by $M^T$. $\mathrm{Span}\{\cdot\}$ denotes the subspace spanned by the given vectors.

II Preliminaries

II-A Semi-tensor Product of Matrices

The basic tool used in this paper is the STP of matrices. We give a brief survey of it here; please refer to [20] for more details.

Definition II.1

[20] Suppose $A\in\mathcal{M}_{m\times n}$, $B\in\mathcal{M}_{p\times q}$, and let $t$ be the least common multiple of $n$ and $p$. The STP of $A$ and $B$ is defined by
$$A \ltimes B = \left(A \otimes I_{t/n}\right)\left(B \otimes I_{t/p}\right),$$
where $\otimes$ is the Kronecker product.
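As a quick illustration, the defining formula can be sketched in a few lines of Python. This is a minimal numerical sketch for experimentation, not code from the paper:

```python
import numpy as np
from math import lcm  # available since Python 3.9

def stp(A, B):
    """Semi-tensor product of matrices:
    A |x| B = (A kron I_{t/n}) (B kron I_{t/p}), with t = lcm(n, p)."""
    A, B = np.atleast_2d(A), np.atleast_2d(B)
    n, p = A.shape[1], B.shape[0]   # columns of A, rows of B
    t = lcm(n, p)
    return np.kron(A, np.eye(t // n)) @ np.kron(B, np.eye(t // p))

# When n == p the STP reduces to the ordinary matrix product:
A = np.array([[1., 2.], [3., 4.]])
B = np.array([[0., 1.], [1., 0.]])
assert np.allclose(stp(A, B), A @ B)

# For column vectors (e.g. vector-form strategies) it is the Kronecker product:
x, y = np.array([[1.], [0.]]), np.array([[0.], [1.]])
assert np.allclose(stp(x, y), np.kron(x, y))
```

The second property is why, in the examples below, joint strategy profiles in vector form can be assembled with `np.kron`.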

Assume $1 \le i \le k$. By identifying $i \sim \delta_k^i$, we call $\delta_k^i$ the vector form of the integer $i$. A function $f:\prod_{i=1}^{n}\Delta_{k_i}\to\mathbb{R}$ is called a mix-valued pseudo-logical function.

Definition II.2

[20] Let $f:\prod_{i=1}^{n}\Delta_{k_i}\to\mathbb{R}$ be a mix-valued pseudo-logical function. Then there exists a unique row vector $V_f\in\mathbb{R}^{1\times k}$, where $k=\prod_{i=1}^{n}k_i$, such that
$$f(x_1,\dots,x_n)=V_f\ltimes_{i=1}^{n}x_i.$$
$V_f$ is called the structure vector of $f$.

II-B Potential Game

A finite non-cooperative game is a triple $G=(N,S,U)$, where $N=\{1,\dots,n\}$ is the set of players, $S_i$ is the set of strategies of player $i$ for every $i\in N$ with $S=\prod_{i=1}^{n}S_i$, and $u_i:S\to\mathbb{R}$ is the utility function of player $i$, with $s=(s_1,\dots,s_n)\in S$ being a strategy profile of the game. Let $S_{-i}$ denote the set of partial strategy profiles of all players other than player $i$. Denote by $\mathcal{G}_{[n;k_1,\dots,k_n]}$ the set of finite games with $|N|=n$ and $|S_i|=k_i$, $i=1,\dots,n$.

Using the vector expression of strategies, the utility function can be expressed as
$$u_i(x_1,\dots,x_n)=V_i\ltimes_{j=1}^{n}x_j,$$
where $x_j\in\Delta_{k_j}$, and $V_i\in\mathbb{R}^{1\times k}$ (with $k=\prod_{j=1}^{n}k_j$) is called the structure vector of $u_i$, $i=1,\dots,n$.
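Numerically, a structure vector is simply the player's payoff table flattened in the lexicographic order of strategy profiles. A small sketch (the 2×2 payoffs below are made up for illustration):

```python
import numpy as np

def delta(k, i):
    """Vector form of the integer i: the i-th column of I_k (1-indexed)."""
    e = np.zeros((k, 1))
    e[i - 1, 0] = 1.0
    return e

# Hypothetical 2x2 payoff table of one player: payoff[a1-1, a2-1]
payoff = np.array([[3., 0.],
                   [5., 1.]])

# Structure vector: payoffs listed in lexicographic profile order
# (1,1), (1,2), (2,1), (2,2).
V = payoff.reshape(1, -1)

# For unit vectors the STP of strategies is their Kronecker product:
x = np.kron(delta(2, 2), delta(2, 1))   # profile (a1, a2) = (2, 1)
assert float(V @ x) == 5.0              # equals payoff[1, 0]
```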

The concept of potential game was first proposed by Rosenthal [21]; its definition is as follows:

Definition II.3

A finite game $G$ is a potential game if there exists a function $P:S\to\mathbb{R}$ such that for every player $i\in N$, every $s_{-i}\in S_{-i}$, and every $x,y\in S_i$,
$$u_i(x,s_{-i})-u_i(y,s_{-i})=P(x,s_{-i})-P(y,s_{-i}),$$
where $P$ is called the potential function of $G$.

The following lemma is obvious from Definition II.3.

Lemma II.4

[22] A finite game $G$ is potential if and only if there exist functions $d_i$, $i=1,\dots,n$, such that for every $i\in N$

(1) $u_i(s)=P(s)+d_i(s_{-i})$,

where $P$ is the potential function, and $d_i(s_{-i})$ does not depend on $s_i$.
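For small finite games, Definition II.3 suggests a direct brute-force test: check that every unilateral deviation changes the deviator's utility exactly as it changes the candidate potential. A sketch (the example games are textbook examples, not from the paper):

```python
import itertools
import numpy as np

def is_potential(utils, P, sizes):
    """Brute-force test of Definition II.3: every unilateral deviation must
    change the deviator's utility exactly as it changes P."""
    for i, k in enumerate(sizes):
        for s in itertools.product(*(range(m) for m in sizes)):
            for x in range(k):
                t = list(s); t[i] = x; t = tuple(t)
                if not np.isclose(utils[i][t] - utils[i][s], P[t] - P[s]):
                    return False
    return True

# Identical-interest 2x2 game: both utilities equal P, hence P is a potential.
P = np.array([[4., 0.], [0., 2.]])
assert is_potential([P, P], P, (2, 2))

# Matching pennies: u1 is NOT a valid potential function for it.
u1 = np.array([[1., -1.], [-1., 1.]])
assert not is_potential([u1, -u1], u1, (2, 2))
```

The check costs on the order of $n \cdot |S| \cdot \max_i k_i$ comparisons, so it is only practical for very small games; the rank conditions of Section III scale better.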

II-C Problem Setup

Consider a multi-player game played on a network, which is called a networked game (NG). In such a game every player can only obtain its neighbours' information. The set of neighbours of player $i$ is denoted by $N_i$, which is defined as follows:

Suppose a system level objective function is given. The system wants to optimize this objective, while every player can only obtain its neighbours' information. The optimization problem can be described as

where

To solve this optimization problem, the idea is to design a potential game in which the utility function of every player depends only on its neighbours' information, and the potential function of the designed game is the objective function itself. Then, using a proper learning algorithm such as logit learning [10], the players' behavior converges to a strategy profile that maximizes the objective function.

III Potential Game Design Using Local Information

III-A Utility Design Using Local Information

We consider the design of potential games using local information in this subsection. Before designing the game, an operator called the drawing matrix, associated with a group of players in the game, is needed. Set

where

The drawing matrix is used to “draw” the strategies of the players in the group from the joint strategy [19]:

where is the strategy of player . In particular, , where is the set of players except player .
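Under the convention that the joint strategy vector is the Kronecker (semi-tensor) product of the individual vector-form strategies, one matrix with the stated "drawing" property places an identity block at each drawn player and an all-ones row (which sums a player out) elsewhere. This construction is our illustration and is not necessarily the exact matrix used in [19]:

```python
import numpy as np
from functools import reduce

def drawing_matrix(J, sizes):
    """Matrix D with D (x_1 kron ... kron x_n) = kron of x_j for j in J:
    identity block for drawn players, all-ones row otherwise."""
    blocks = [np.eye(k) if i in J else np.ones((1, k))
              for i, k in enumerate(sizes)]
    return reduce(np.kron, blocks)

def delta(k, i):
    """The i-th column of I_k (1-indexed), i.e., a vector-form strategy."""
    e = np.zeros((k, 1)); e[i - 1, 0] = 1.0
    return e

# Three players with 2 strategies each; joint profile x = x1 kron x2 kron x3.
x1, x2, x3 = delta(2, 1), delta(2, 2), delta(2, 1)
x = np.kron(np.kron(x1, x2), x3)

D = drawing_matrix({0, 2}, (2, 2, 2))        # draw players 1 and 3 (0-indexed)
assert np.allclose(D @ x, np.kron(x1, x3))   # only their strategies remain
```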

Theorem III.1

Consider a utility-adjustable networked game with objective function

Then local information based utility functions can be designed if and only if all of the following equations have a solution

(2)

where , , , , , , and , .

Moreover, if a solution exists, the local information based utility function of player is

(3)

Proof: If the utility functions of the networked game are local information based, then we have

Rewriting (1) in vector form and substituting (3) into (1) yields

where and are the structure vectors of and , respectively.

Since are arbitrary, we have

(4)

which is equivalent to (2).

If the solution exists, then we have

which verifies equation (3).

Equations (2) are called utility design equations, and are called utility design matrices.

III-B Solving Utility Design Equations

In this subsection we first explore some properties of the utility design equations. Then we design an algorithm to compute solutions of the utility design equations when solutions exist.

Set

where

Denote by Let

Theorem III.2

Let be the utility design matrix of player in (2). Then

(5)

In other words, every column of is a solution of And

where , and

Proof: We omit the detailed proof due to space limitations.

Using Theorem III.1 and Theorem III.2, we have the following results:

Corollary III.3

Consider a utility-adjustable networked game with objective function

Then the following four statements are equivalent.

(i)

The local information based utility functions can be designed.

(ii)

The following equations have a solution

(iii)

Suppose is defined in (2). Then,

(6)
(iv)

Proof: (i) ⇔ (ii): Obvious by Theorem III.1.

(ii) ⇔ (iii): From linear algebra we know that condition (6) is necessary and sufficient for equation (2) to have solutions.

(iii) ⇒ (iv): From Theorem III.2 one sees that

Combining this with equation (6), we have the following result

which is equivalent to statement (iv). Conversely, it is obvious that (iv) implies (iii).
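Statement (iii) is the classical Rouché–Capelli rank test, which is straightforward to apply numerically once the utility design matrices are assembled. A generic sketch (the small system below is made up for illustration; in practice the matrix and right-hand side come from the utility design equations (2)):

```python
import numpy as np

def has_solution(A, b, tol=1e-10):
    """Ax = b is solvable iff rank(A) == rank([A | b]) (Rouche-Capelli)."""
    Ab = np.hstack([A, b.reshape(-1, 1)])
    return np.linalg.matrix_rank(A, tol=tol) == np.linalg.matrix_rank(Ab, tol=tol)

A = np.array([[1., 2.],
              [2., 4.]])                        # rank-1 matrix
assert has_solution(A, np.array([1., 2.]))      # b lies in the column space
assert not has_solution(A, np.array([1., 0.]))  # b does not

# When solvable, lstsq returns the minimum-norm solution.
x, *_ = np.linalg.lstsq(A, np.array([1., 2.]), rcond=None)
assert np.allclose(A @ x, np.array([1., 2.]))
```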

In the following we design an algorithm to calculate the local information based utility function for each player when equation (2) has solutions.

Algorithm III.4
  • Construct and for each

  • Determine as

    (7)
  • Define as the sub-vector consisting of the first elements of . Using (3), the general form of is calculated as follows

    where is an arbitrary vector in .

Using Algorithm III.4, the local information based utility function of player is

Remark III.5
  1. The computational complexity of Algorithm III.4 mainly depends on the calculation of (7), where the dimension of is . This shows that the smaller the number of neighbours is, the lower the computational complexity is. Further investigation into reducing the computational complexity of Algorithm III.4 is needed.

  2. The method proposed in this paper can also be applied to design state-based potential game [23] with local information based utilities.

III-C Selecting a Proper Learning Rule

After designing the utility functions, another issue to be considered is selecting a proper learning algorithm for the potential game. The learning algorithm should ensure that players' behavior converges to a strategy profile that maximizes the objective function using local information. Logit learning, shown below, satisfies these demands:

  • At each period , player is chosen with probability and allowed to update its strategy;

  • At time , the updating player selects a strategy according to the following probability

    where is the exploration parameter.

  • All other players repeat their previous actions.
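The updating player's choice probabilities form a softmax over its utilities. A sketch of this step, where the payoff values are hypothetical and `tau` denotes the exploration parameter:

```python
import numpy as np

def logit_choice_probs(utilities, tau):
    """Logit choice rule: P(strategy k) proportional to exp(u_k / tau)."""
    z = np.exp((utilities - np.max(utilities)) / tau)  # shift for stability
    return z / z.sum()

u = np.array([1.0, 3.0, 2.0])   # hypothetical utilities of the updating player
p = logit_choice_probs(u, tau=0.5)
assert np.isclose(p.sum(), 1.0)
assert p.argmax() == 1          # the best reply is chosen most often

# As tau -> 0 the rule approaches a pure best response:
assert logit_choice_probs(u, tau=0.01)[1] > 0.99
```

Subtracting the maximum before exponentiating avoids overflow for small `tau` without changing the resulting probabilities.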

Theorem III.6

[10] Consider a repeated potential game with potential function , where all players use logit learning. The stationary distribution is

(8)

where

is the set of probability distributions over

Moreover, the set of stochastically stable states is equal to the set of maximizers of , where a state is called stochastically stable if .
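In [10] the stationary distribution of logit learning in a potential game takes a Gibbs form, with the probability of a profile proportional to the exponential of its potential divided by the exploration parameter. Assuming that form for (8), a numerical sketch (the potential values are made up) shows the mass concentrating on the maximizers as the exploration parameter shrinks:

```python
import numpy as np

def gibbs_distribution(P_values, tau):
    """Gibbs measure: pi(s) proportional to exp(P(s)/tau) over all profiles."""
    w = np.exp((P_values - np.max(P_values)) / tau)
    return w / w.sum()

# Hypothetical potential values over four strategy profiles; the last two
# are the maximizers.
P_values = np.array([0.0, 1.0, 3.0, 3.0])
pi = gibbs_distribution(P_values, tau=0.05)
assert np.isclose(pi.sum(), 1.0)
assert pi[2] + pi[3] > 0.999     # mass concentrates on the maximizers
assert np.isclose(pi[2], pi[3])  # equal maximizers receive equal mass
```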

Remark III.7

For a local information based potential game we have . Hence, using the above logit learning rule, the local information based potential game maximizes its potential function with arbitrarily high probability when is sufficiently large. For a general potential game, if only local information may be used, this result is not assured. Here a general potential game means a potential game whose utility functions are not local information based.

IV Application: Consensus Problem

In this section we deal with the consensus problem using the above results.

Consider a multi-agent system with a system level objective function. The goal of the multi-agent system is to maximize the objective function. Due to mobility limitations, each agent can only select strategies from a restricted strategy set; for example, a robot in a 2-D environment can only move to a position within a radius of its current location. Each agent can communicate with its neighbours, which means that at each period the agent can only observe its neighbours' strategies.

A natural question is: if players can only select strategies from restricted strategy sets, can the local information based potential game still maximize the objective function under logit learning when is sufficiently large? Unfortunately, the answer is no. However, binary restrictive logit learning, introduced in [1], can accomplish the aim. Denote by the set of strategies available to player at time . Binary restrictive logit learning proceeds as follows:

  • At each period , player is chosen with probability and allowed to update its strategy;

  • At time , the updating player selects a trial strategy from according to the following probability

    where .

  • After selecting a trial strategy , player chooses its strategy as follows:

    (9)

    where is exploration parameter, and

  • All other players repeat their previous actions.

If the restricted strategy sets satisfy the following conditions

  1. Reversibility:

  2. Feasibility: For any two strategies there exists a series of strategies such that

Then, using the above binary restrictive logit learning, the potential game with restricted strategy sets will maximize its potential function with arbitrarily high probability when is sufficiently large [1].
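The acceptance step (9) can be sketched as a two-alternative logit choice between the trial and the current strategy, following the binary log-linear rule of [1]. Since the exact form of (9) is not reproduced above, treat this as an assumption-labeled sketch:

```python
import numpy as np

def binary_logit_accept(u_trial, u_current, tau):
    """Probability of adopting the trial strategy in a two-alternative logit
    choice: exp(u_trial/tau) / (exp(u_trial/tau) + exp(u_current/tau))."""
    m = max(u_trial, u_current)                 # shift for numerical stability
    a = np.exp((u_trial - m) / tau)
    b = np.exp((u_current - m) / tau)
    return a / (a + b)

assert np.isclose(binary_logit_accept(1.0, 1.0, 0.5), 0.5)  # indifferent
assert binary_logit_accept(2.0, 1.0, 0.1) > 0.99            # better trial
assert binary_logit_accept(1.0, 2.0, 0.1) < 0.01            # worse trial
```

Because the comparison involves only the updating player's own utility at two profiles, the rule respects both the restricted strategy sets and the locality of information.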

Example IV.1

Consider a multi-agent system with three agents. The system level goal is for all agents to gather at a given point in a 2-D environment, such as a room. Each agent has a strategy set . Due to mobility limitations, every agent can only move to a position within a radius of its current location, and there is an obstacle, as shown in Fig. 2. The communication graph is time-invariant and is shown by the red lines: agent can only communicate with agent , agent can communicate with agents and , and agent can only communicate with agent .

Fig. 2: Initial Configuration and Network Graph of Example IV.1

a). Vector expression of strategies

Identify

The system level objective function can be described as

(10)

where . Using the vector expression of strategies, the structure vector of can be calculated as

b). Local information based utility design

Set

where , and

We can verify that the following equation has solutions

According to Theorem III.1, local information based utility functions are designable. The designed local information based utility functions have the following form:

where , and

Using (3), the general form of are calculated as follows

where and are arbitrary vectors.

c). Simulation results

The initial configuration of all agents is shown in Fig. 2. Let , and . Using binary restrictive logit learning with parameter , we obtain the following simulation results. As , the local information based potential game converges to an equilibrium point that maximizes the potential function. All agents then agree to stay at the destination forever, as shown in Figs. 3-5. Here

Fig. 3: Evolution of Agent 1’s Position of Example IV.1
Fig. 4: Evolution of Agent 2’s Position of Example IV.1
Fig. 5: Evolution of Agent 3’s Position of Example IV.1

V Conclusion

This paper investigates the design of potential games using local information. We first present a necessary and sufficient condition for the existence of local information based utility functions, which can be verified by checking whether a series of linear equations have a solution. When the linear equations are solvable, local information based utility functions can be constructed from their solutions. Finally, a consensus problem is used to demonstrate the effectiveness of the proposed design approach.

Open and interesting questions for further investigation include the following: if the utility functions cannot be designed to form a potential game using local information, can we design a near-potential game [24]? To maximize the system level objective function, how “near” must the designed game be for a given learning rule?

References

  • [1] J. R. Marden, G. Arslan, and J. S. Shamma, “Cooperative control and potential games,” IEEE Trans. on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 6, pp. 1393-1407, 2009.
  • [2] N. Li and J. R. Marden, “Designing games for distributed optimization,” IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 2, pp. 230-242, 2013.
  • [3] B. Yang and M. Johansson, “Distributed optimization and games: A tutorial overview,” Networked Control Systems, vol. 406, pp. 109-148, 2010.
  • [4] W. Saad, Z. Han, H. Poor, and T. Basar, “Game-theoretic methods for the smart grid: an overview of microgrid systems, demand-side management, and smart grid communications,” IEEE Signal Process. Mag., vol. 29, pp. 86-105, 2012.
  • [5] P. N. Brown and J. R. Marden, “Studies on robust social influence mechanisms: Incentives for efficient network routing in uncertain settings,” IEEE Control Systems, vol. 37, no. 1, pp. 98-115, 2017.
  • [6] X. Wang, N. Xiao, T. Wongpiromsarn, L. Xie, E. Frazzoli, and D. Rus, “Distributed consensus in noncooperative congestion games: an application to road pricing,” in Proc. 10th IEEE Int. Conf. Contr. Aut., Hangzhou, China, pp. 1668-1673, 2013.
  • [7] R. Gopalakrishnan, J. R. Marden, and A. Wierman, “An architectural view of game theoretic control,” ACM SIGMETRICS Performance Evaluation Review, vol. 38, no. 3, pp. 31-36, 2011.
  • [8] J. R. Marden and J. S. Shamma, “Game theory and distributed control,” Handbook of Game Theory with Economic Applications, vol. 4, pp. 861-899, 2015.
  • [9] D. Monderer and L.S. Shapley, “Potential Games,” Games and Economic Behavior, vol. 14, pp. 124-143, 1996.
  • [10] C. Alós-Ferrer and N. Netzer, “The logit-response dynamics,” Games and Economic Behavior, vol. 68, no. 2, pp. 413-427, 2010.
  • [11] H. P. Young, Strategic Learning and Its Limits. Oxford, U.K.: Oxford Univ. Press, 2004.
  • [12] J. S. Shamma and G. Arslan, “Dynamic fictitious play, dynamic gradient play, distributed convergence to Nash equilibria,” IEEE Trans. Autom. Control, vol. 50, no. 3, pp. 312-327, 2005.
  • [13] E. Campos-Nanez, A. Garcia, and C. Li, “A game-theoretic approach to efficient power management in sensor networks,” Operations Research, vol. 56, no. 3, pp. 552-561, 2008.
  • [14] T. Heikkinen, “A potential game approach to distributed power control and scheduling,” Computer Networks, vol. 50, pp. 2295-2311, 2006.
  • [15] J. R. Marden and A. Wierman, “Overcoming limitations of game-theoretic distributed control,” in Proc. 48th IEEE Conf. Decision Control, pp. 6466-6471, 2009.
  • [16] J. R. Marden and M. Effros, “The price of selfishness in network coding,” IEEE Trans. Inf. Theory, vol. 58, no. 4, pp. 2349-2361, 2012.
  • [17] E. Anshelevich, A. Dasgupta, J.Kleinberg, E. Tardos, T. Wexler, and T. Roughgarden, “The price of stability for network design with fair cost allocation,” SIAM J. Comput., vol. 38, no. 4, pp. 1602-1623, 2008.
  • [18] Y. Hao, S. Pan, Y. Qiao, and D. Cheng, “Cooperative control via congestion game approach,” arXiv preprint, arXiv: 1712.02504, 2017.
  • [19] T. Liu, J. Wang, and D. Cheng, “Game theoretic control of multi-agent systems,” arXiv preprint, arXiv: 1608.00192, 2016.
  • [20] D. Cheng, H. Qi, and Y. Zhao, An Introduction to Semi-tensor Product of Matrices and Its Applications, World Scientific, Singapore, 2012.
  • [21] R. W. Rosenthal, “A class of games possessing pure-strategy Nash equilibria,” Int. J. Game Theory, vol. 2, pp. 65-67, 1973.
  • [22] D. Cheng, “On finite potential games,” Automatica, vol. 50, no. 7, pp. 1793-1801, 2014.
  • [23] J. R. Marden, “State based potential games,” Automatica, vol. 48, no. 12, pp. 3075-3088, 2012.
  • [24] O. Candogan, A. Ozdaglar, and P. A. Parrilo, “Dynamics in near potential games,” Games and Economic Behavior, vol. 82, pp. 66-90, 2013.