Egoistic Incentives Based on Zero-Determinant Alliances for Large-Scale Systems

01/08/2020 ∙ by Shengling Wang, et al. ∙ Beihang University ∙ Beijing Normal University ∙ Indiana University ∙ George Washington University

Social dilemmas exist in various fields and give rise to the so-called free-riding problem, leading to collective fiascos. The difficulty of tracking individual behaviors makes egoistic incentives in large-scale systems a challenging task. Moreover, the state-of-the-art mechanisms are either individual-based or state-dependent, resulting in low efficiency in large-scale networks. In this paper, we propose an egoistic incentive mechanism from a connected (network) perspective rather than an isolated (individual) perspective by taking advantage of the social nature of people. We make use of a zero-determinant (ZD) strategy for rewarding cooperation and sanctioning defection. After proving that cooperation is the dominant strategy for ZD players, we optimize their deployment to facilitate cooperation over the whole system. To further speed up cooperation, we derive a ZD alliance strategy for sequential multiple-player repeated games to empower ZD players with higher controllable leverage, which enriches the theory of ZD strategies and broadens their application domain. Our approach is stateless and stable, which contributes to its scalability. Extensive simulations based on real-world trace data as well as synthetic data demonstrate the effectiveness of our proposed egoistic incentive approach under different networking scenarios.




1 Introduction

Many human decisions occur in situations where the results of one’s own decisions are interdependent with those of others. Such interdependent situations tend to breed social dilemmas, which have two traits: 1) each individual who makes a socially defective choice gains a higher profit no matter what other individuals do; 2) compared with the situation where everyone cooperates, all individuals get lower returns if everyone chooses to defect [6]. Therefore, a social dilemma is essentially an inherent conflict between defection and cooperation in a system where the former is the dominant strategy for each individual but the latter can maximize the overall social welfare.

Social dilemmas exist in various scenarios such as data delivery in mobile opportunistic networks [17], file sharing in peer-to-peer networks [8], and autonomous vehicle programming [2]. A typical issue resulting from a social dilemma is free-riding, an individually rational but socially defective choice that can lead to a collective fiasco. Moreover, such a tragedy of the commons is aggravated as the system expands, because the behavior of an individual becomes more difficult to track and its influence on others diminishes [7]. Hence, there is a pressing need to induce cooperation, especially in a large-scale system.

Cooperation in a social dilemma is often explained in terms of egoistic incentives [5]. The state-of-the-art egoistic incentive mechanisms realize their aims by transforming a social dilemma game into one not involving a dilemma, and they can be categorized into two types: reputation-based [10, 18, 8, 3, 13] and pricing-based [12, 4, 9, 19]. Reputation-based schemes employ reputation or quasi-reputation to evaluate a node’s contribution to others and react to its service requests accordingly, while pricing-based mechanisms treat service provision as a transaction, taking advantage of monetary incentives to stimulate cooperation.

Existing incentive mechanisms have two common traits: they are all individual-based and involve state (reputation or transaction status) maintenance and management. The first trait comes from the underlying assumption that successfully inducing each individual to cooperate inevitably leads to cooperation across the whole system. Such a case-by-case approach becomes inefficient as the system grows. The second trait stems from the state-dependent nature of the state-of-the-art methods. Unfortunately, the cost of maintaining and managing states soars in a large-scale system.

In this paper, we realize large-scale egoistic incentives from the perspective of the network rather than an individual. Our idea stems from the following observation: each person in this world is more or less socially connected to others, forming a so-called social community. Within such an environment, cooperation and competition among all members jointly determine the utility of an individual. Since each individual is utility-driven, cooperation inducement requires us to analyze the game among all players in the social community, which involves the structure of their social ties. Hence, it is reasonable to consider incentives from the connected (network) perspective rather than the isolated (individual) perspective to stimulate cooperation. Moreover, directly incentivizing the whole system has the potential to bring high efficiency to a large-scale system.

Our egoistic incentive approach has two desired properties: statelessness and stability. These merits are obtained by taking advantage of the zero-determinant (ZD) strategy [15] whose adopter (the ZD player) can control its opponent’s payoff in a unilateral way. Thus, through rewarding cooperation and negatively sanctioning defection, a ZD player can stimulate cooperation of its co-players. After proving cooperation is the dominant strategy of the ZD players, we optimize their deployment, facilitating cooperation over the whole system. Our method is stateless because it does not need to manage or maintain any state; it is stable since the deployment of the ZD players only depends on the social ties among the players, which are usually steady. The statelessness and stability contribute to the sound scalability of our egoistic incentive.

Another contribution of this paper is the derivation of a ZD alliance strategy for sequential multiple-player repeated games, where a ZD alliance refers to a group of players taking the same ZD strategy. Such a derivation brings two benefits. First, the alliance strategy widens the range of a co-player’s utility that ZD players can set, implying that ZD players can achieve higher controllable leverage through allying. The increased controlling power of a ZD alliance breeds an environment where cooperation thrives. Second, our derivation enriches the theoretical system of ZD strategies, broadening their application domain.

We conduct extensive simulations based on real-world trace data as well as synthetic data, which represent different types of typical network topologies including star, ring, tree, and mesh structures. These simulation results demonstrate the effectiveness of our egoistic incentive approach.

The rest of the paper is organized as follows. The most related work is investigated in Section 2. Section 3 presents our problem formulation. The ZD alliance strategy for a sequential multiple-player repeated game is deduced in Section 4, and our incentive algorithm is proposed in Section 5. We report our simulation results on the proposed algorithm in Section 6, and Section 7 concludes the paper.

2 Related Work

Over the last two decades, research on cooperation induction has made considerable progress, which can be categorized into two types: reputation-based and pricing-based.

Reputation-based methods employ the concept of reputation or quasi-reputation to evaluate the behaviors of a node based on its contributions to others, which is also a criterion for reacting to the service requests of this node. For example, in [10], relay nodes get good reputation values for their cooperation, which builds other nodes’ confidence in them and thus helps get their bundles forwarded. The in-network realization of incentives was proposed in [18] to attach an explicit ranking to a node in light of its transit behaviors and translate the rank into message priority. Hu et al. [8] proposed a budget-based self-optimized incentive search protocol for unstructured P2P file sharing systems, motivating selfish nodes to earn more credits by providing services to others. As a game-theoretical incentive, Multicent [3] assigns credits for packet forwarding/storage in proportion to the priorities specified in the routing strategy. The concept of virtual credit was adopted in [13] to encourage selfish nodes to cooperate in data forwarding.

Pricing-based methods take advantage of monetary incentives to stimulate cooperation. For instance, Ning et al. [12] introduced the concept of virtual checks to pay for the cooperation of selfish nodes in ad dissemination in autonomous mobile social networks. Chen et al. [4] proposed an auction-based incentive mechanism for paid content offloading considering the dual identity of service providers. Koutsopoulos [9] cast participatory sensing in the context of optimal reverse auction design, offering reasonable payments for data contributors. Yang et al. [19] also proposed an auction-based mechanism to incentivize participants of mobile phone sensing while allowing users to have more control over the payment they can receive.

Both reputation-based and pricing-based egoistic incentives aim to shape the behaviors of individuals and involve maintaining and managing the reputation or transaction states, leading to low efficiency in a large-scale system. In this paper, we take a dramatically different approach from the perspective of a network rather than an individual to achieve good scalability by employing a ZD strategy. ZD strategies [15] were first proposed by Press and Dyson in 2012, providing a revolutionary understanding of simultaneous-move repeated games. ZD strategies can enforce a fixed linear relationship between the expected payoffs of two players, indicating that a ZD player can control its opponent’s payoff in a unilateral way. Since ZD strategies were proposed, several studies have been carried out to enrich their theoretical hierarchy. For example, the application of ZD strategies was extended from a two-player simultaneous-move game to a multi-player one in [14]. The concept of a ZD alliance was first proposed in [7], whose strategy was explored for simultaneous-move multi-player games. In this paper, we employ a different theoretical approach to deduce the strategy of a ZD alliance for sequential multi-player games to serve our problem scenario.
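To make the unilateral control property concrete, the following minimal sketch applies the original two-player, simultaneous-move construction of Press and Dyson [15], not the sequential alliance strategy derived in this paper. It assumes the conventional iterated prisoner's dilemma payoffs (3, 0, 5, 1); the function names, the coefficient beta, and the opponent strategies are hypothetical and chosen only for illustration.

```python
# Conventional prisoner's dilemma payoffs (assumed values, not taken from the paper).
R, S, T, P = 3.0, 0.0, 5.0, 1.0

# States are ordered (CC, CD, DC, DD) with the ZD player's move listed first.
PAYOFF_ZD = [R, S, T, P]    # payoff to the ZD player in each state
PAYOFF_OPP = [R, T, S, P]   # payoff to the opponent in each state


def equalizer_strategy(target, beta=-0.1):
    """Cooperation probabilities that pin the opponent's expected payoff to `target`.

    Sets p_tilde = beta * PAYOFF_OPP + gamma with gamma = -beta * target,
    where p_tilde = (p1 - 1, p2 - 1, p3, p4).
    """
    gamma = -beta * target
    p_tilde = [beta * u + gamma for u in PAYOFF_OPP]
    probs = [p_tilde[0] + 1.0, p_tilde[1] + 1.0, p_tilde[2], p_tilde[3]]
    if not all(0.0 <= p <= 1.0 for p in probs):
        raise ValueError("target/beta give probabilities outside [0, 1]")
    return probs


def stationary_payoffs(p_zd, q_opp, rounds=5000):
    """Long-run payoffs (ZD player, opponent) via power iteration on the 4-state chain.

    p_zd[s] and q_opp[s] are the two players' cooperation probabilities after
    state s, both indexed in the (CC, CD, DC, DD) order defined above.
    """
    dist = [0.25] * 4
    for _ in range(rounds):
        nxt = [0.0] * 4
        for s, mass in enumerate(dist):
            p, q = p_zd[s], q_opp[s]
            nxt[0] += mass * p * q
            nxt[1] += mass * p * (1 - q)
            nxt[2] += mass * (1 - p) * q
            nxt[3] += mass * (1 - p) * (1 - q)
        dist = nxt
    zd = sum(d * u for d, u in zip(dist, PAYOFF_ZD))
    opp = sum(d * u for d, u in zip(dist, PAYOFF_OPP))
    return zd, opp


zd_probs = equalizer_strategy(target=2.0)
for opponent in ([0.9, 0.5, 0.3, 0.1], [0.2, 0.8, 0.6, 0.4]):
    print(stationary_payoffs(zd_probs, opponent))
```

Whatever memory-one strategy the opponent uses in this sketch, its long-run payoff stays at the target value, while the ZD player's own payoff still varies with the opponent's behavior.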

3 Problem Formulation

To analyze the behaviors of the players from a network perspective, we employ an undirected graph to describe a large-scale system, where the nodes represent the players in the system and the edges denote the social ties of the players. Each player has two choices in the game, namely cooperation and defection. For example, in mobile opportunistic networks, cooperation means a player’s willingness to transmit data for others and defection implies rejecting relay services. Even though cooperation and defection have different meanings in different scenarios, they share common traits: the former is driven by social interests while the latter is driven by those of an individual. In practice, an application may involve multiple rounds, and each player may choose cooperation or defection within a round; moreover, some players may move first while the others take actions after observing the first movers. As a result, we focus on a sequential repeated game with multiple players in this paper.

In any game, a rational player aims to maximize its utility, which depends on not only the action of itself but also those of others. In this paper, we adopt a method similar to that in [7, 14] to calculate the utility of a player when the number of cooperators among its neighbors is :


where means player selects cooperation while implies defection; is the number of neighbors of player , and obviously ; and is the profit proportional to . We use in sequel for simplicity.

Theorem 3.1

A social dilemma happens when .


According to [1], a social dilemma occurs when the utility of each player possesses the following properties: 1) a player, regardless of its decision, can get a higher utility when more co-players select cooperation; 2) the utility of a defector is higher than that of a cooperator; and 3) mutual cooperation outperforms mutual defection. As a result, the following inequalities, which are built on the above three properties, should be satisfied:

According to (1), the first two inequalities obviously hold without any constraint while for the last one, it is sufficient that . Hence, the theorem is proved.
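The three properties from [1] can be checked numerically for any candidate utility function. The sketch below is only an illustration: the payoff `public_goods_utility` is a textbook public goods game with an assumed multiplication factor `r` and a unit contribution, used as a stand-in rather than as the utility defined in (1), and all names are hypothetical.

```python
def public_goods_utility(cooperates, k, n, r):
    """Stand-in payoff (an assumption, not Eq. (1)): a standard public goods game.

    k is the number of cooperating co-players, n the group size, and r the
    multiplication factor. A defector keeps its unit endowment.
    """
    contributors = k + (1 if cooperates else 0)
    share = r * contributors / n
    return share if cooperates else share + 1.0


def is_social_dilemma(utility, n):
    """Check the three properties for a utility(cooperates, k) over n players."""
    others = n - 1
    # 1) More cooperating co-players raise the utility, whatever the own action.
    more_cooperators_help = all(
        utility(action, k + 1) > utility(action, k)
        for action in (True, False)
        for k in range(others)
    )
    # 2) A defector earns more than a cooperator facing the same co-players.
    defection_pays = all(
        utility(False, k) > utility(True, k) for k in range(others + 1)
    )
    # 3) Mutual cooperation outperforms mutual defection.
    mutual_cooperation_wins = utility(True, others) > utility(False, 0)
    return more_cooperators_help and defection_pays and mutual_cooperation_wins


n = 5
for r in (0.5, 3.0, 6.0):
    dilemma = is_social_dilemma(lambda c, k: public_goods_utility(c, k, n, r), n)
    print(r, dilemma)
```

For this stand-in payoff, the check succeeds only when the multiplication factor lies strictly between 1 and the group size, which is the usual dilemma region of a public goods game.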

According to Theorem 3.1, a defector can obtain a higher utility than a cooperator. However, once all players defect to act as free riders, each of them gets a utility of only 1 according to (1), which is lower than the utility of when all players cooperate because , leading to the tragedy of the commons. To deter free riders, a natural idea is to make players recognize that cooperation outperforms defection, aligning self-interest with the common interest. This requires an egoistic incentive with an explicit side payment in the form of rewarding cooperation while punishing defection. However, centralized methods are inefficient for egoistic incentives, especially in a large-scale system, because they lack flexibility and robustness and suffer from a single point of failure. Hence, in this paper, we resort to a distributed one.

To realize a distributed egoistic incentive, we take advantage of ZD strategies [15]. A ZD player can set its opponent’s payoff irrespective of what action the opponent takes. Such a capability is very valuable for shaping the behavior of free riders. Hence, in this paper, we address the social dilemma problem through the deployment of ZD players forming an alliance in the network. We call players not adopting the ZD strategy regular players. Then, as shown in Fig. 1, in a network with heterogeneous players, the utility of any regular player may consist of two components: one obtained from the game played with other regular players and the other from the game played with the ZD alliance. The former game is called a regular game while the latter is called a ZD game. Particularly,


In (2), is the utility obtained from the regular game when player acts playing with cooperators among the regular neighbors, which can be calculated by (1); and is the utility set by the ZD players when player acts and there are ZD players as its neighbors.

Fig. 1: Two types of games involving any regular player.

Let the probability of a player choosing cooperation or defection be its strategy. Being rational, a player chooses a strategy according to the utility brought by cooperation or defection, which is also affected by the co-players’ moves. Hence, we use the following equation, similar to the one in [5], to determine the strategy of player at each round:


In (3), is the probability that player cooperates. Obviously, the probability that player defects is . Here, depends on the difference between and , the utilities of player adopting and , respectively, when the number of regular cooperators involved in the game is and there are ZD players in the social community. (Footnote 1: means player adopts cooperation when the number of other players adopting is , and hence the total number of cooperators is in this case.) Note that both the number of regular cooperators and that of ZD players can be obtained from the historical information of games. As a result, our optimization problem becomes the following: how should ZD players be deployed to maximize the overall cooperation probability in the whole system, where each regular player’s cooperation probability is calculated by (3)?
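As an illustration of a strategy that depends only on the utility difference between cooperation and defection, the sketch below uses a logistic (Fermi-type) response. This functional form and the `noise` parameter are assumptions made for the example; they are a common choice in evolutionary game models and are not claimed to be the exact form of (3).

```python
import math


def cooperation_probability(u_cooperate, u_defect, noise=1.0):
    """Assumed logistic response to the utility difference (a stand-in for Eq. (3)).

    Returns a value in [0, 1] that grows with u_cooperate - u_defect.
    `noise` (> 0) controls how sharply the player reacts to the difference.
    """
    if noise <= 0:
        raise ValueError("noise must be positive")
    x = (u_cooperate - u_defect) / noise
    # Numerically stable evaluation of 1 / (1 + exp(-x)).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


# The defection probability is the complement of the cooperation probability.
p_c = cooperation_probability(u_cooperate=2.0, u_defect=1.0)
p_d = 1.0 - p_c
print(p_c, p_d)
```

Under this assumed rule, equal utilities give a cooperation probability of one half, and a larger reward for cooperation pushes the probability toward one.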

4 ZD Alliance Strategy in Sequential Multiple-Player Repeated Games

To solve our optimization problem, we need to determine in (2), the utility set by ZD players when a regular player acts . Note that for any player , there might exist more than one ZD player in its neighborhood. In this case, all these ZD players must take the same action for the same move made by player because, as fair-minded regulators in our approach, they should abide by consistent rules. The ZD players taking the same action form a ZD alliance. In this section, we analyze how a ZD alliance sets the utility of a regular player to determine . In particular, when , a ZD alliance strategy reduces to a classical ZD strategy. Note that the original and variant ZD strategies [15, 14, 7] are only applicable to simultaneous-move repeated games with two or more players, which are not suitable for our scenario. Hence, we first extend the application domain of ZD strategies to a sequential multiple-player repeated game.

In a sequential multiple-player repeated game, players are classified into two types: the first-move players and the second-move ones. We call them leaders and followers, respectively, in this paper. No matter which type a player belongs to, its utility is calculated according to (1), which is affected by the co-players’ moves. Because a leader has to move first, its strategy is made based on all players’ actions in the previous round. Let be the strategy of leader . We have


where each element can be represented in the form of , the probability of leader choosing given that it played previously and the numbers of cooperators among other leaders and followers were respectively and in the previous round; and and are respectively the numbers of leaders and all players in this game.

Since a follower can observe the behavior of the leaders in the current round, its strategy depends on the leaders’ moves. Let be the strategy of follower , which can be represented as


where each element can be written in the form of , the probability of follower choosing when there are cooperators among the leaders.

With the definitions of and , and , we can construct the following Markov matrix

where each element denotes the one-step transition probability from state to . Here, a state denotes the moves made by all players in this round. The state transition probability is essentially a joint one that can be obtained by

in which and respectively denote the probabilities of actions made by leader and follower in state and they can be calculated as




In (6), also denotes the probability of leader cooperating when it played and the numbers of cooperators among other leaders and followers were respectively and in the last round (state ); and is the action of leader made in state . In (7), is also the cooperation probability of follower when there are cooperators among the leaders in state and is its action in state .
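The construction of the joint transition probability can be sketched as follows. Each leader's cooperation probability is looked up from its own previous action and the numbers of cooperating other leaders and followers in the previous round, each follower's from the number of cooperating leaders in the current round, and the entries are the products of these per-player probabilities. The dictionary layout, the function name, and the numeric strategies are hypothetical choices made for the example.

```python
from itertools import product

C, D = 1, 0  # cooperation and defection encoded as 1 and 0


def transition_matrix(leader_strategies, follower_strategies):
    """Build the one-step transition matrix of a sequential repeated game.

    leader_strategies[i][(own_prev, other_leaders_coop, followers_coop)] is the
    probability that leader i cooperates given the previous round.
    follower_strategies[j][coop_leaders_now] is the probability that follower j
    cooperates given the leaders' moves in the current round.
    """
    n_leaders = len(leader_strategies)
    n_players = n_leaders + len(follower_strategies)
    states = list(product((C, D), repeat=n_players))  # leaders first, then followers
    matrix = []
    for prev in states:
        prev_leaders, prev_followers = prev[:n_leaders], prev[n_leaders:]
        row = []
        for nxt in states:
            nxt_leaders, nxt_followers = nxt[:n_leaders], nxt[n_leaders:]
            prob = 1.0
            for i, action in enumerate(nxt_leaders):
                key = (prev_leaders[i],
                       sum(prev_leaders) - prev_leaders[i],
                       sum(prev_followers))
                p_c = leader_strategies[i][key]
                prob *= p_c if action == C else 1.0 - p_c
            for j, action in enumerate(nxt_followers):
                q_c = follower_strategies[j][sum(nxt_leaders)]
                prob *= q_c if action == C else 1.0 - q_c
            row.append(prob)
        matrix.append(row)
    return states, matrix


# Hypothetical strategies for two leaders and one follower (illustrative values only).
leader = {(a, l, f): 0.2 + 0.2 * a + 0.1 * l + 0.1 * f
          for a in (C, D) for l in range(2) for f in range(2)}
follower = {k: 0.3 + 0.2 * k for k in range(3)}
states, M = transition_matrix([leader, leader], [follower])
print(len(states), [round(sum(row), 12) for row in M])
```

With two leaders and one follower, each leader's strategy has eight entries and the follower's has three, and every row of the resulting 8 by 8 matrix sums to one.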

Define a matrix , where is the identity matrix. Let be the stable vector of the transition matrix ; then we have ; hence . Denote by , where is the column vector. In light of Cramer’s rule, , where is the adjugate matrix of . Combining this equation with , we know that is proportional to each row of [7]. Therefore, the dot product of the stable vector and any vector is
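The determinant argument can be written out in the standard form used by Press and Dyson [15]. The symbols below are assumed here for illustration: M is the transition matrix over s states, v its stable vector, M' = M - I, m'_j the j-th column of M', and f an arbitrary vector. The last step additionally assumes that the eigenvalue 1 of M is simple, so that M' has rank s - 1.

```latex
\begin{align}
\mathbf{v}^{\top} M = \mathbf{v}^{\top}
  &\;\Longrightarrow\; \mathbf{v}^{\top} M' = \mathbf{0}^{\top},
  \qquad M' = M - I, \\
\operatorname{Adj}(M')\, M' = \det(M')\, I = 0
  &\;\Longrightarrow\; \text{each row of } \operatorname{Adj}(M')
  \text{ is proportional to } \mathbf{v}^{\top}, \\
\mathbf{v} \cdot \mathbf{f}
  &\;\propto\; \det\big[\, \mathbf{m}'_{1}, \dots, \mathbf{m}'_{s-1}, \mathbf{f} \,\big].
\end{align}
```

The last line follows from a Laplace expansion along the replaced column: the cofactors of that column form a row of the adjugate matrix, and they do not depend on the column's own entries, so substituting f for it yields the dot product of f with a multiple of v.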

Now we apply an elementary column transformation to : find a column corresponding to a state where only one leader cooperates (i.e., the element of that column is in the form of , , ) and a set of columns (denoted as ) corresponding to the states where player and at least one co-player cooperates (i.e., each element of those columns is represented as , , , , , , ); then add all columns in to the first column we found to form a new column whose element is if a diagonal element of is added to this entry and otherwise is , where . Obviously, .

After the elementary column transformation, the matrix has a new form, i.e.,

where is the new column formed according to the above method. More specifically,


It is worth noting that the new column can be located at any place of the original indicating a state where only one leader cooperates. Here, we assume leader is such a player, and the corresponding column is located at the position. Notably, the column in (8) is only related to the strategy of leader , denoted as

If we let , where is a coefficient, the column is proportional to the last one, resulting in


To set a suitable , we divide the players in the game into two groups: alliance members () and non-alliance members (), with the former being a subset of the leaders taking the same strategy as leader and the latter including the rest of the leaders and all followers. Let vectors and be respectively the average payoffs of the alliance members and the non-alliance members under all possible outcomes, where each element () is the average payoff of a member from the set () when the alliance adopts and there are cooperators in total. Denote by the number of alliance members. According to (1), we have


In (10), all alliance members have the same utility because their behaviors are the same. This is different from the non-alliance members whose utilities depend on not only their actions but also the number of cooperators and defectors from both sets of and .

Based on the above analysis, one can set

where and are coefficients, and is a vector with each element being 1. Let . According to (9), we have


In (11), and are respectively the expected utilities of alliance members and non-alliance members. Recall that is the stable vector of the transition matrix; hence , and . Obviously, (11) shows that when leader sets its strategy to , it can enforce a linear relationship between and , i.e.,


which means that leader acts as a ZD player in this situation. Since leader is a member of the alliance and all alliance members should take the same action, this alliance takes a ZD strategy and is termed a ZD alliance. We call the non-ZD alliance members outsiders. Then, when , the ZD alliance can set the expected utility of the outsiders with .

Theorem 4.1

When , the expected utility of the outsiders satisfies .


According to (12), if , , which is proportional to and . Because the ZD alliance moves first, we start by analyzing the range of from . Given that there are cooperators, according to the nature of social dilemma, we have


In other words, the expected utility of the ZD alliance is higher when it defects rather than cooperates. When the ZD alliance defects, the average utility of the outsiders is ; otherwise, it is . Combining the range of given in (13), we can deduce

where . When and , we can obtain the expected utility range of the outsiders.

According to Theorem 4.1, when , which implies that there is only one ZD alliance member, the controllable range of shrinks to . This demonstrates that a ZD alliance can obtain higher regulation leverage compared to a single ZD player.

Fig. 2: An example sequential three-player repeated game.

We use an example shown in Fig. 2 to explain key terms and definitions involved in this section. Assume that there are three players in the game, among which players 1 and 2 are the leaders while player 3 is the follower. The states of the game include all possible outcomes, namely

Accordingly, the utility of any player in each outcome can be calculated according to (1).

The strategy of each leader and the follower can be represented in light of (4) and (5). For instance, given a previous outcome , the conditional probability under which player 1 adopts in the current round is because it acted and there was no other cooperative leader but one cooperative follower in the previous round; similarly, player 2 adopts with the conditional probability of in the current round since it acted and both the remaining leader and the follower cooperated in the previous round; player 3 is a follower, whose strategy space’s cardinality reduces to , which is much lower than that of any leader (i.e., ). The conditional probability of player 3 adopting is , depending on the number () of cooperators among the leaders in the current round. As a result, the probability from state to any other state can be derived. For example, the transition probability to state is . All state transition probabilities contribute to the transition matrix . Let , which can be denoted as

Next, we perform an elementary column transformation on matrix , identifying a column corresponding to a state where only one leader cooperates (e.g., player 1) and adding to it all the other columns whose states indicate that player 1 cooperates and at least one of players 2 and 3 also selects . The determinant after the elementary column transformation becomes

where the fourth column, denoted by the red color in Fig. 2, is formed according to the above method. We can find that is only dependent on the strategy of player 1. If we apply an elementary column transformation corresponding to player 2, we can obtain the sixth column (denoted by the blue color in Fig. 2) that only relies on player 2.

Let player 1 be the ZD player and ally with player 2. Both can set their strategies as , where and . According to Theorem 4.1, the ZD alliance including players 1 and 2 can set the expected utility of player 3 ranging from to .

Now, it is the time to answer the question proposed at the beginning of this section. Taking advantage of Theorem 4.1, we can set


which means a ZD alliance would reward the cooperation of the outsiders with the highest expected utility while punishing their defection with the lowest utility. This extreme reward-punishment incentive mechanism can help to facilitate cooperation as much as possible.

5 ZD Alliance for Egoistic Incentive

Based on our analysis in Section 3, one can see that a rational player would take an action according to the difference between the utilities of adopting and , which are related to the co-players’ moves among its social community. Due to the heterogeneous nature of the members in the social community, the utility of player may come from the regular game and the ZD game, with the former being calculated by (1) while the latter solved by (14). Note that Theorem 4.1 in Section 4 can serve for a more general case, where the number of outsiders is arbitrary. However, in our consideration, for any player , the corresponding ZD game only involves the ZD alliance and the player itself as shown in Fig. 1, implying that the outsider is just player and hence we have .

From the perspective of a single player, one can realize egoistic incentives by rewarding more expected utility when the player chooses and punishing it if it defects. However, to achieve this goal from the network perspective, we need to optimize the deployment of ZD players so that the overall cooperation probability in the whole system can be maximized.

Before solving the above problem, we need to answer one question. According to (12), , implying that a ZD alliance has the ability to carry out an extortionate strategy by enforcing a ratio between its expected utility and that of the outsider. Moreover, the smaller the , the more expected utility can be transferred to the ZD alliance. In this case, as the dominant player, can the ZD alliance push the outsider to cooperate while it defects itself to obtain more expected utility? We use the following theorem to answer this question:

Theorem 5.1

The dominant strategy of the ZD alliance is cooperation irrespective of the strategy adopted by the outsider.


In the ZD game, when player takes , the utilities of the ZD alliance when it cooperates and defects are respectively and according to (10). To prove cooperation is the dominant strategy of the ZD alliance, we should prove , which is equivalent to proving due to in our scenario.

In light of Theorem 4.1, the condition under which a ZD alliance can control others is , which can be simplified as due to . Because , we have . Hence, the theorem holds.

When cooperation is the dominant strategy of the ZD alliance, maximizing the overall cooperation probability of the whole system is equivalent to maximizing those of all regular players. Hence, our optimization problem can be written as:


where denotes that player adopts the ZD strategy and means that the player is a regular one; and and are respectively the number of ZD players we deploy and the number of total players in the system.

To find an optimal solution for (15), we resort to genetic algorithms (GA) because GA is efficient in dealing with problems with nonlinear and multi-constraint properties. As GA has been well-studied, we refer the interested readers to [11] and omit the algorithm description here.
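For readers who want a concrete picture, a toy genetic search over ZD deployments might look like the sketch below. It is not the algorithm of [11]: the operators (truncation selection, union-based crossover, single-node mutation), the parameter values, and the function names are assumptions, and `coop_prob` is a caller-supplied, hypothetical function mapping the number of ZD neighbors of a regular player to its cooperation probability.

```python
import random


def total_cooperation(graph, zd_nodes, coop_prob):
    """Sum of cooperation probabilities over all regular (non-ZD) players."""
    zd = set(zd_nodes)
    return sum(
        coop_prob(sum(1 for v in graph[u] if v in zd))
        for u in graph if u not in zd
    )


def deploy_zd_players(graph, m, coop_prob, pop_size=30, generations=100, seed=0):
    """Search for m nodes to host ZD players with a simple genetic algorithm."""
    rng = random.Random(seed)
    nodes = sorted(graph)

    def fitness(individual):
        return total_cooperation(graph, individual, coop_prob)

    population = [frozenset(rng.sample(nodes, m)) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        elite = ranked[: pop_size // 2]          # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Crossover: draw m nodes from the union of both parents.
            child = set(rng.sample(sorted(a | b), m))
            if rng.random() < 0.2:
                # Mutation: replace one selected node with an unselected one.
                child.remove(rng.choice(sorted(child)))
                child.add(rng.choice([n for n in nodes if n not in child]))
            children.append(frozenset(child))
        population = elite + children
    return max(population, key=fitness)


# Hypothetical star network with node 0 at the center.
star_graph = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0}}
best = deploy_zd_players(star_graph, m=1, coop_prob=lambda z: z / (z + 1))
print(sorted(best), total_cooperation(star_graph, best, lambda z: z / (z + 1)))
```

Representing a deployment as a fixed-size set of nodes keeps the constraint on the number of ZD players satisfied by construction, so no penalty term is needed in the fitness.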

The basic idea of our egoistic incentive mechanism is to take advantage of the ZD alliance to shape the behaviors of the regular players. According to (3), the cooperation probability of a regular player depends on the number of ZD players () and that of cooperators () among the regular neighbors in its social community. However, due to the nature of social dilemma [1], we can obtain , where is the total number of regular players in the social community. This relation reflects that a player can definitely obtain a higher payoff if it defects rather than cooperates, no matter how many cooperators there are (i.e., ) in its neighborhood. This makes the strategy of a regular player rely only on , the number of ZD players around . In turn, is determined by the solution of the optimization problem (15), which is closely related to the topology of network . Because is constructed based on the nodes’ social ties, which are relatively steady, is steady. Hence, our method is stable and does not need to maintain or manage any state, thus achieving the property of statelessness introduced in Section 1.

6 Performance Simulation

In this section, we evaluate the performance of our proposed mechanism using real and synthetic data.

The real data we adopt is the iMote data [16], including traces of Bluetooth sightings by groups of users carrying small devices (iMotes) for a number of days in different offices, conferences, and cities. We use all iMote nodes (54 in total) and their links to perform our simulation. Synthetic data is also employed since we want to test the performance of our approach under different network topologies. The types of networks in our simulation are star, ring, tree, and mesh, each of which includes 80 nodes. More specifically, all nodes in the star network are individually connected to a central node; in the ring network, each node connects to exactly two other nodes, forming a closed loop; each node has two neighbors except the leaf nodes in the tree network, making its shape a binary tree; and in the mesh network, each node has at least two randomly selected neighbors. All simulations have been repeated 30 times to obtain the average values with statistical confidence.
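The four synthetic topologies can be generated along the following lines. This is a sketch under stated assumptions: the tree is built as a binary tree in which every internal node has up to two children, which is one reading of the description above; the mesh gives every node at least `min_neighbors` random neighbors, with that parameter and the seed being hypothetical; and all function names are illustrative.

```python
import random


def _link(graph, u, v):
    """Add an undirected edge between u and v."""
    graph[u].add(v)
    graph[v].add(u)


def star(n):
    """Node 0 is the center and every other node connects only to it."""
    graph = {i: set() for i in range(n)}
    for i in range(1, n):
        _link(graph, 0, i)
    return graph


def ring(n):
    """Each node connects to its two neighbors on a closed loop (n >= 3)."""
    graph = {i: set() for i in range(n)}
    for i in range(n):
        _link(graph, i, (i + 1) % n)
    return graph


def binary_tree(n):
    """Node i (i >= 1) connects to its parent (i - 1) // 2."""
    graph = {i: set() for i in range(n)}
    for i in range(1, n):
        _link(graph, i, (i - 1) // 2)
    return graph


def mesh(n, min_neighbors=2, seed=0):
    """Each node picks min_neighbors distinct random neighbors."""
    rng = random.Random(seed)
    graph = {i: set() for i in range(n)}
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for j in rng.sample(others, min_neighbors):
            _link(graph, i, j)
    return graph


def average_degree(graph):
    return sum(len(neighbors) for neighbors in graph.values()) / len(graph)


for name, graph in (("star", star(80)), ("ring", ring(80)),
                    ("tree", binary_tree(80)), ("mesh", mesh(80))):
    print(name, average_degree(graph))
```

Raising `min_neighbors` makes the mesh denser, which is the knob to turn when a higher average degree is wanted.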

Fig. 3 demonstrates how the average cooperation probability of our mechanism changes with the number of ZD players () deployed in different networks. In this study, we set , where is the total number of players in a game. Note that each regular player may be involved in two kinds of games, the regular game and the ZD game, as shown in Fig. 1. Hence, varies in different games. According to Fig. 3, one can see that the mesh network has the best performance, realizing fast cooperation over the whole system using fewer ZD players. Although the performance of the real network is slightly lower than that of the mesh network, both have similar trends. This is because, compared to the other networks, the nodes in these two networks have higher degrees; thus fewer ZD players are needed to control more regular players.

Fig. 3 also shows that in the star network, once a ZD player is deployed, the average cooperation probability rockets to and stays at this level even as the number of ZD players increases. The reason is that each node except the central one has only one neighbor in the star network, so a single ZD player is needed in the central position to control all the other nodes, while other ZD players completely lose their control power due to the lack of connections to other regular players. However, because only one ZD player functions, the controllable range of the outsider’s expected utility shrinks to , making the cooperation probability . This is different from the situations in the mesh network and the real network, where ZD players can achieve higher control leverage by forming alliances and finally obtain the desired cooperation probabilities.

Finally, one can see from Fig. 3 that the growth rates of cooperation probabilities in the tree and the ring networks are the slowest. This is because the average degrees in the tree and ring networks are respectively 1.95 and 2, which are about and of those in the mesh and the real networks. Although the tree and ring networks have almost the same average degree as the star network, the ZD player located at the central place in the star network has the largest degree value, empowering its control.

Fig. 3: Average cooperation probabilities in different networks when .

Fig. 4 illustrates the average cooperation probabilities of our method under different numbers of ZD players () deployed in different networks when , which implies a nonlinear relationship between and . The results are similar to those shown in Fig. 3. Note that we have run extensive simulations under other forms of and obtained results with very similar trends; hence the corresponding results are omitted in this paper to avoid redundancy.

Fig. 4: Average cooperation probability in different networks when .

Figs. 5, 6, 7, 8, and 9 further demonstrate the performance of the proposed egoistic incentive mechanism for the mesh, real, star, tree, and ring networks, respectively, when , where is a coefficient. The subfigures (a) indicate the ratio of cooperators varying with and the number of ZD players in different networks while all subfigures (b) show the optimal deployments of ZD players in the corresponding networks when and . The red lines in the subfigures (b) identify the social ties (edges) of the ZD players.

Fig. 5(a) clearly shows the emergence of cooperation in the mesh network. That is, a large number of cooperators arise once more than 3 ZD players are deployed. When the number of ZD players exceeds 5, almost all regular players choose to cooperate. Fig. 5(b) indicates the positions of 10 ZD players. The statistics reveal that these ZD players are popular nodes: their average degree is 51.7, compared to 39, the average degree of all nodes in the mesh network.

(a) Cooperator ratio
(b) ZD deployment
Fig. 5: Performance in the mesh network.

Fig. 6(a) demonstrates that the real network performs similarly to the mesh network. However, further analysis of Fig. 6(b) indicates that the ZD players are not hot spots, since their average degree is 16.8, which is lower than that of the whole network, i.e., 21.8. Instead, the ZD strategy is adopted by nodes with high betweenness, whose average value is 67.8888, almost twice the average betweenness of the whole system. Here, the betweenness of a node is defined as the total number of times the node acts as a relay along the shortest path between any two other nodes. A possible reason for this phenomenon is that the real network contains sparse subnetworks that cannot be controlled by the popular nodes but can be regulated by nodes with high betweenness, because betweenness measures a node’s contribution to network connectivity and thus quantifies the extent to which the node can influence others.
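The betweenness measure defined above can be computed with Brandes' algorithm on an unweighted, undirected graph. The sketch below is illustrative only: the function name and the adjacency-map input format are assumptions, and it follows the standard convention of splitting credit fractionally when a pair of nodes has several shortest paths, a case the text does not specify.

```python
from collections import deque


def betweenness(adj):
    """Unnormalised betweenness for an undirected, unweighted graph.

    adj maps each node to an iterable of its neighbours.
    """
    score = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search from source, counting shortest paths.
        order = []
        preds = {v: [] for v in adj}
        paths = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Back-propagate dependencies from the farthest nodes inwards.
        dep = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                dep[v] += paths[v] / paths[w] * (1.0 + dep[w])
            if w != source:
                score[w] += dep[w]
    # Each unordered pair was visited from both endpoints, so halve the totals.
    return {v: s / 2.0 for v, s in score.items()}


# Small usage example: the middle node of a three-node path relays one pair.
path_graph = {0: [1], 1: [0, 2], 2: [1]}
path_scores = betweenness(path_graph)
```

Ranking nodes by this score, rather than by degree, is one way to look for candidates that bridge sparse subnetworks.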

(a) Cooperator ratio
(b) ZD deployment
Fig. 6: Performance in the real network.

Fig. 7(a) indicates that cooperation still emerges in the star network after one ZD player is deployed. As expected, this ZD player is located at the central position indicated in Fig. 7(b), which has the highest popularity as well as connectivity. As analyzed earlier, even if we put more ZD players in other positions, they would not function since they have no regular players as neighbors; their regulation power stays low because no ZD alliance can be formed in this case. Hence, we label only the position of the single ZD player in Fig. 7(b).

(a) Cooperator ratio
(b) ZD deployment
Fig. 7: Performance in the star network.

According to Fig. 8(a), no emergence of cooperation occurs in the tree network. Moreover, the ratio of cooperators rises steadily and relatively slowly due to the low average degree of the tree network. Fig. 8(b) shows that all ZD players are deployed at nodes of degree 3, higher than the degrees of the root node and the leaf nodes, which are 2 and 1, respectively. Higher popularity gives ZD players more regular players as neighbors, increasing their regulation range.

(a) Cooperator ratio
(b) ZD deployment
Fig. 8: Performance in the tree network.

Fig. 9(a) demonstrates that the performance of our mechanism in the ring network is similar to that in the tree network: no emergence of cooperation and a low growth rate in the number of cooperators. However, comparing Fig. 8(a) with Fig. 9(a) reveals a difference. Specifically, when fewer than 30 ZD players are deployed, the tree network outperforms the ring network; when more than 30 are deployed, the ring network wins. The underlying reasons can be found in Fig. 8(b) and Fig. 9(b). The first phenomenon arises because each ZD player in the tree network has a degree of 3, which is higher than 2, the degree of a ZD player in the ring network. The second arises because ZD alliances form more easily in the ring network than in the tree network, and as the number of ZD players increases, the power of ZD alliances stimulates more cooperation. For example, as indicated by Fig. 8(b) and Fig. 9(b), when 10 ZD players are deployed to shape the behaviors of others, no ZD alliance exists in the tree network while several ZD alliances are formed in the ring network: node 35 can ally not only with node 33 to regulate node 34, but also with node 37 to control node 36.
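The ring example above can be checked with a short script. It is a sketch under stated assumptions: following the node 33/34/35 example, an alliance is taken here to mean two or more ZD players sharing a common regular neighbor, ring nodes are assumed to be numbered consecutively around the loop, and `zd_alliances` is a hypothetical helper name rather than anything defined in the paper.

```python
def zd_alliances(adj, zd_players):
    """Map each regular node to its ZD neighbours when it has two or more.

    adj maps a node to the set of its neighbours; zd_players is any
    iterable of node labels that play the ZD strategy.
    """
    zd = set(zd_players)
    alliances = {}
    for node, neighbours in adj.items():
        if node in zd:
            continue  # only regular players are regulated
        allies = sorted(n for n in neighbours if n in zd)
        if len(allies) >= 2:
            alliances[node] = allies
    return alliances


# 80-node ring with consecutive numbering (assumed layout).
ring_adj = {i: {(i - 1) % 80, (i + 1) % 80} for i in range(80)}

# ZD players at the positions named in the text.
ring_result = zd_alliances(ring_adj, [33, 35, 37])
# ring_result == {34: [33, 35], 36: [35, 37]}
```

In a ring, placing ZD players at every other position makes each regular node between them jointly regulated, which matches the explanation of why alliances form easily there.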

(a) Cooperator ratio
(b) ZD deployment
Fig. 9: Performance in the ring network.

7 Conclusion

This paper proposes a mechanism that realizes large-scale egoistic incentives by optimally deploying ZD players to reward cooperation and punish defection. We further derive a ZD alliance strategy for sequential multiple-player repeated games to speed up cooperation. Our approach is stateless and stable, making it scalable and suitable for large-scale systems. The simulation results demonstrate that the mesh and real networks achieve the best performance, with the emergence of cooperation occurring when only a few ZD players are deployed. Although this phenomenon also exists in the star network, its topology lets only one ZD player function, pinning the cooperation probability at . The tree and ring networks perform the worst due to their low average degree. However, compared to the tree network, ZD alliances form more easily in the ring network, making it perform better as the number of ZD players increases.


The authors gratefully acknowledge the support of the National Natural Science Foundation of China under grants 61772080 and 61472044, and the National Science Foundation of the US under grants CNS-1704397 and IIS-1741279.


  • [1] B. Kerr, P. Godfrey-Smith, and M. W. Feldman (2004) What is altruism? Trends in Ecology & Evolution 19 (3), pp. 135–140.
  • [2] J. Bonnefon, A. Shariff, and I. Rahwan (2016) The social dilemma of autonomous vehicles. Science 352 (6293), pp. 1573–1576.
  • [3] K. Chen, H. Shen, and L. Yan (2015) Multicent: a multifunctional incentive scheme adaptive to diverse performance objectives for DTN routing. IEEE Transactions on Parallel and Distributed Systems 26 (6), pp. 1643–1653.
  • [4] X. Chen, S. Wang, M. Liu, and C. Huang (2014) Partner-recruitment: incentive mechanism for content offloading. In Proceedings of the 2014 IEEE International Conference on Communications (ICC), Sydney, Australia, pp. 2538–2543.
  • [5] R. M. Dawes, A. J. Van De Kragt, and J. M. Orbell (1988) Not me or thee but we: the importance of group identity in eliciting cooperation in dilemma situations: experimental manipulations. Acta Psychologica 68 (1), pp. 83–97.
  • [6] R. M. Dawes (1980) Social dilemmas. International Journal of Psychology 35 (2), pp. 111–116.
  • [7] C. Hilbe, B. Wu, A. Traulsen, and M. A. Nowak (2014) Cooperation and control in multiplayer social dilemmas. Proceedings of the National Academy of Sciences 111 (46), pp. 16425–16430.
  • [8] Y. Hu, M. Feng, L. N. Bhuyan, and V. Kalogeraki (2009) Budget-based self-optimized incentive search in unstructured P2P networks. In Proceedings of the 2009 IEEE International Conference on Computer Communications (INFOCOM), Rio de Janeiro, Brazil, pp. 352–360.
  • [9] I. Koutsopoulos (2013) Optimal incentive-driven design of participatory sensing systems. In Proceedings of the 2013 IEEE International Conference on Computer Communications (INFOCOM), Turin, Italy, pp. 1402–1410.
  • [10] R. Lu, X. Lin, H. Zhu, X. Shen, and B. Preiss (2010) Pi: a practical incentive protocol for delay tolerant networks. IEEE Transactions on Wireless Communications 9 (4), pp. 1483–1493.
  • [11] F. Mahmud, S. T. Zuhori, and T. C. Problem (2012) Genetic algorithm. LAP LAMBERT Academic Publishing.
  • [12] T. Ning, Z. Yang, H. Wu, and Z. Han (2013) Self-interest-driven incentives for ad dissemination in autonomous mobile social networks. In Proceedings of the 2013 IEEE International Conference on Computer Communications (INFOCOM), Turin, Italy, pp. 2310–2318.
  • [13] Z. Ning, L. Liu, F. Xia, B. Jedari, I. Lee, and W. Zhang (2017) CAIS: a copy adjustable incentive scheme in community-based socially aware networking. IEEE Transactions on Vehicular Technology 66 (4), pp. 3406–3419.
  • [14] L. Pan, D. Hao, Z. Rong, and T. Zhou (2015) Zero-determinant strategies in iterated public goods game. Scientific Reports 5, 13096. doi: 10.1038/srep13096.
  • [15] W. H. Press and F. J. Dyson (2012) Iterated prisoner’s dilemma contains strategies that dominate any evolutionary opponent. Proceedings of the National Academy of Sciences 109 (26), pp. 10409–10413.
  • [16] J. Scott, R. Gass, J. Crowcroft, P. Hui, C. Diot, and A. Chaintreau (2009) CRAWDAD dataset cambridge/haggle (v. 2009-05-29).
  • [17] W. Sun and S. Wang (2017) General analysis of incentive mechanisms for peer-to-peer transmissions: a quantum game perspective. In Proceedings of the 2017 IEEE International Conference on Distributed Computing Systems (ICDCS), Atlanta, USA, pp. 517–526.
  • [18] M. Y. S. Uddin, B. Godfrey, and T. Abdelzaher (2010) RELICS: in-network realization of incentives to combat selfishness in DTNs. In Proceedings of the 2010 IEEE International Conference on Network Protocols (ICNP), Kyoto, Japan, pp. 203–212.
  • [19] D. Yang, G. Xue, X. Fang, and J. Tang (2012) Crowdsourcing to smartphones: incentive mechanism design for mobile phone sensing. In Proceedings of the 2012 ACM International Conference on Mobile Computing and Networking (MobiCom), Istanbul, Turkey, pp. 173–184.