Coalitional Games with Stochastic Characteristic Functions and Private Types

10/25/2019 ∙ by Dengji Zhao, et al.

The research on coalitional games has focused on how to share the reward within a coalition such that players are incentivised to collaborate. It assumes that the (deterministic or stochastic) characteristic function is known in advance. This paper studies a new setting (a task allocation problem) where the characteristic function is not known in advance and is controlled by private information held by the players. Hence, the challenge here is twofold: (i) incentivize players to reveal their private information truthfully, and (ii) incentivize them to collaborate. We show that existing reward distribution mechanisms and auctions cannot meet this challenge. Hence, we propose the first mechanism for the problem, built from the perspectives of both mechanism design and coalitional games.



Introduction

Cooperative games have been used to model competition between groups/coalitions of players [Davis and Maschler1965]. A cooperative game is defined by specifying a value for each coalition, and it studies which coalitions will form and how the payoff is shared to enforce collaboration. The value of a coalition represents the reward its members can achieve together, e.g., by finishing a task or building a social good. This value (the characteristic function) is often predefined and public, and it can be either deterministic or stochastic. However, in real-world applications, what a coalition can achieve may depend on the players' capabilities, which are not necessarily known to everyone in advance.

In this paper, we study a cooperative game where the characteristic function is controlled by private information owned by the players in a coalition. Specifically, we study a task allocation problem where a group of players collaborate to accomplish a sequence of tasks in order, with a deadline to finish all the tasks. Because of the deadline, we need to find the best set of players to do the tasks. Each player's capability is modelled by how much time she needs to finish a task. However, the finishing time is not fixed: it is a random variable following some distribution, which is known only to the player. Because of this uncertainty, the objective is to maximize the probability of meeting the deadline [Frank1969]. In order to find the best set of players to meet the objective, we first need to know their private distributions (which then define the probability of meeting the deadline for each coalition). That is, the characteristic function is defined by the private distributions of the players in a coalition.

We model the problem as a coalitional game, but the value of each coalition (the characteristic function) is controlled by the players' private information, a setting which has not been investigated in the literature. On one hand, we want each player to report her true private information so that the best allocation decision can be made; on the other hand, we want the reward to be fairly distributed among the players. Cooperative games are often used to handle reward/cost distribution to enforce collaboration, while mechanism design is good at eliciting private information in a competitive environment [Nisan et al.2007]. We will show that our challenge cannot be solved by using techniques from one side alone.

Thus, our goal is to design new reward sharing mechanisms such that players are incentivized to report their private distributions truthfully and the reward is distributed fairly among all players. To that end, we propose a novel mechanism that solves both challenges using techniques from both cooperative game theory and mechanism design. Our solution is based on a modified Shapley value that distributes the reward according to the players' capabilities, which incentivizes all players to reveal their true capabilities. The cost of achieving this is that the total reward distributed is more than what the coalition can generate. For comparison, we also study a solution from a non-cooperative perspective using the Vickrey-Clarke-Groves (VCG) mechanism [Vickrey1961, Clarke1971, Groves1973]. VCG handles private information revelation very well, but its reward distribution is not as fair as the Shapley value's.

Related Work

As closely related work, coalitional games with stochastic characteristic functions have been a rich line of research initiated by Charnes and Granot [Charnes and Granot1976, Charnes and Granot1977]. They assumed that the value of each coalition is a random variable following some known/public distribution, so the challenge is how to promise a payoff to the grand coalition such that the coalition is still enforceable and the mechanism does not overpay if the random outcome is poor. Their setting does not formally model where the randomness comes from. In our model, each player has a random task completion time, so the total completion time of a coalition is a random variable controlled by the players, and we focus on how to incentivize the players to reveal their private random variables. Bachrach et al. [Bachrach et al.2012] studied the network reliability problem, where an edge connecting two points in a network has a probability of failing, but the probability is public; this is essentially a special case of [Charnes and Granot1976, Charnes and Granot1977]. Their focus is on computing the Shapley value efficiently.

There are also many other studies on coalitional games with incomplete information from different perspectives. Chalkiadakis and Boutilier [Chalkiadakis and Boutilier2004, Chalkiadakis and Boutilier2012, Chalkiadakis et al.2007] took a learning/bargaining approach to learn the types of the other players in a repeated game, while we directly ask players to report their types. Li and Conitzer [Li and Conitzer2015] studied the least core in a setting where the characteristic function has some well-defined noise (a random effect) that is not controlled by any player. Ieong and Shoham [Ieong and Shoham2008] considered a game where each player is uncertain about which game they are playing, modelled as a Bayesian game with an information partition. On top of the stochastic characteristic function modelled by [Charnes and Granot1976], Suijs et al. [Suijs et al.1999a, Suijs et al.1999b] extended the setting to the case where each coalition has some actions to choose from and each action leads to a different random payoff. Myerson [Myerson2007] further considered the case where each player has multiple types with some distribution and each player also knows the type distributions of the others; the paper then investigated mechanisms with side payments to incentivize players to reveal their true types. However, once the players' types are given, the payoff they can generate is not stochastic. In contrast, our model requires each player to report her private type (which is a distribution) and, given their types, the value they can generate together is still stochastic.

Our solution also applies verification of the players' task execution times. Nisan and Ronen [Nisan and Ronen2001] studied a task scheduling problem where each agent first declares the time she needs to finish a task, and later, when the task is allocated and executed, the mechanism can verify how much time the agent actually took. By doing so, the mechanism can check what the agent reported and pay the agent according to the execution, which is a direct verification of the agent's report. In our setting, we cannot directly verify a player's time distribution, but we can still observe how much time the execution takes, which is a partial verification of the distribution. This kind of partial verification was first applied by Porter et al. [Porter et al.2008] in a single-task allocation where each worker has a (private) probability of failing the task. Inspired by their solution, different extensions have been investigated [Ramchurn et al.2009, Stein et al.2011, Conitzer and Vidali2014, Zhao, Ramchurn, and Jennings2016]. All these settings are studied from a non-cooperative perspective and mostly for a single task, while our setting has multiple interdependent tasks and takes a cooperative perspective.

The Model

We investigate a coalitional game where the characteristic function is not given in advance. It is defined by the capabilities of the players, and each player's capability is private information of that player. This setting appears in many real-world task allocation problems.

We consider a task allocation problem in which a project consists of a sequence of m different tasks t1, …, tm to be finished in order, i.e., a task cannot be started until all tasks before it have been finished. There is a deadline d to finish the entire project. Finishing all tasks before the deadline generates a value, and finishing the tasks after the deadline does not generate any value.

There are n agents (players), denoted by N = {1, …, n}, who can perform the tasks with different capabilities. Without loss of generality, we assume that each player is capable of doing only one of the tasks. Let Nj be the set of players who can handle task tj. The task groups are pairwise disjoint and non-empty, and together they cover all players, i.e., ∪j Nj = N.

For each player i, her capability to handle task tj is measured by the execution time she needs to accomplish tj. We also consider that there is some uncertainty in the execution. Therefore, player i does not know the exact time she will need to finish tj, but she does know its distribution.

Studies on various task allocation problems assume private information of the players, regarding either the probability of success [Porter et al.2008, Zhao, Ramchurn, and Jennings2016] or the task duration [Stein et al.2011, Conitzer and Vidali2014]. Similarly, in our coalitional game the execution time distribution is player i's private information, and it determines the characteristic function. Let the discrete random variable Ti denote the execution time of player i on her task. We use ti to denote a realization of Ti. Let fi be the probability mass function of Ti, i.e., fi(ti) = Pr(Ti = ti). There might be a cost for i to execute her task. We assume the cost is public, and it can be ignored in the current analysis.

The goal of the above coalitional game is to find the optimal task allocation (each task can be allocated to at most one player) such that the tasks can be finished before the deadline d with the highest probability. This generates the highest expected value/reward for the players. That is, the characteristic function v can be defined by: v(S) equals the highest probability that the group of players S can finish all the tasks before d.
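As a concrete sketch (our own illustration, with hypothetical duration distributions), this characteristic function can be computed by brute force: enumerate every way of staffing the tasks from the coalition, convolve the chosen duration pmfs to get the total completion time, and keep the assignment with the highest probability of meeting the deadline.

```python
from itertools import product

def coalition_value(task_groups, pmfs, deadline, coalition):
    """v(S): highest probability that coalition S finishes all tasks,
    run in sequence, before the deadline (one player per task)."""
    # restrict each task group to the members of the coalition
    groups = [[p for p in g if p in coalition] for g in task_groups]
    best = 0.0
    for assignment in product(*groups):  # one candidate player per task
        # convolve the pmfs to get the distribution of the total time
        total = {0: 1.0}
        for player in assignment:
            nxt = {}
            for t, pr in total.items():
                for s, q in pmfs[player].items():
                    nxt[t + s] = nxt.get(t + s, 0.0) + pr * q
            total = nxt
        best = max(best, sum(pr for t, pr in total.items() if t <= deadline))
    return best

# Hypothetical instance: two tasks, two candidate players per task.
groups = [["a", "b"], ["c", "d"]]
pmfs = {"a": {1: 0.75, 2: 0.25}, "b": {1: 0.25, 2: 0.75},
        "c": {1: 0.75, 2: 0.25}, "d": {1: 0.25, 2: 0.75}}
print(coalition_value(groups, pmfs, deadline=2, coalition={"a", "b", "c", "d"}))
# 0.5625: assign the two fast players, and both must take one time unit
```

If the coalition cannot staff some task, the product over an empty group is empty and the function returns 0, matching the convention that such coalitions have value zero.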

It is evident that this definition satisfies v(∅) = 0 and the monotonicity property, i.e., v(S) ≤ v(S′) for all S ⊆ S′. Normalizing the project value to 1, v(S) is the expected value the coalition S can achieve by cooperating.

Since the duration distributions are private, v is not publicly known, so we cannot achieve the goal and share the reward among the players with the standard techniques for coalitional games. The challenge here is that players can manipulate the game by misreporting their time distributions, which is not possible in classical coalitional games.

The goal of this paper is to design new reward sharing mechanisms for the above game such that players are incentivized to report their time distributions truthfully.

The reward sharing mechanism requires each player to report her execution time distribution, but a player may not necessarily report her true distribution. For each player i, let fi be the density function of her true distribution, and fi′ be her report. Let f = (f1, …, fn) be the true density function profile of all players and f′ = (f1′, …, fn′) be their report profile. We also denote f by (fi, f−i) and f′ by (fi′, f−i′). Let Fi be the density function space of player i and F be the space of density function profiles.

Definition 1.

A reward sharing mechanism is defined by r = (r1, …, rn), where ri : F → ℝ defines the reward player i receives given all players' report profile.

To incentivize players to report their private information truthfully, a concept called incentive compatibility has been defined in the mechanism design literature. We apply the same concept here to define the incentive to report time distributions truthfully: a reward sharing mechanism is incentive compatible if, for each player, reporting her time distribution truthfully is a dominant strategy.

Definition 2.

A reward sharing mechanism r is incentive compatible if for all players i, for all fi, fi′ ∈ Fi, and for all f−i′, we have ri((fi, f−i′)) ≥ ri((fi′, f−i′)).

In the rest of the paper, we study incentive compatible reward sharing mechanisms.

The Failure of the Shapley Value

The Shapley value is a well-known solution concept in cooperative game theory [Shapley1953]. It divides the reward among the players in a coalition according to their marginal contributions. It has many desirable properties, such as efficiency (the reward is fully distributed to the players), symmetry (equal players receive equal rewards) and the null player property (dummy players receive no reward).

If we simply apply the Shapley value in our setting, the reward for each player i is defined as:

φi(f′) = Σ_{S ⊆ N∖{i}} (|S|! (n − |S| − 1)! / n!) (v(S ∪ {i}) − v(S)), (1)

where v and N are defined under the players' report profile f′.

Equivalently, the Shapley value of a player is her average marginal contribution over all permutations of the players. In each permutation, player i's marginal contribution is v(S ∪ {i}) − v(S), where S is the set of all players before i in the permutation.

In this paper, we assume that in the grand coalition, for each task group, the player who is assigned the task has the highest Shapley value within that task group. This induces some conditions on the time distributions, although it is not clear what the exact conditions are. Intuitively, it implies that if a player is better than the others in the same task group in the grand coalition, then she should also be better than them in sub-coalitions.

Let us consider a simple example:

Example 1.

There are two tasks t1 and t2 to be finished before deadline d = 2, and four players 1, 2, 3, 4 with N1 = {1, 2} and N2 = {3, 4}. Their execution time density functions are:

f1(1) = f3(1) = 3/4, f1(2) = f3(2) = 1/4, (2)
f2(1) = f4(1) = 1/4, f2(2) = f4(2) = 3/4. (3)

If all players report their density functions truthfully, then the allocation that maximizes the probability to finish both t1 and t2 before d is to assign t1 to player 1 and t2 to player 3. The probability is 3/4 × 3/4 = 9/16, i.e., v(N) = 9/16.

The characteristic function in Example 1 is thus: v({1, 3}) = 9/16, v({1, 4}) = 3/16, v({2, 3}) = 3/16, v({2, 4}) = 1/16, the value of a larger coalition is the maximum over the pairs it contains (e.g., v(N) = 9/16), and the value of every remaining coalition is zero.

To compute the Shapley value for player 1, we list all the permutations of the players and check player 1's marginal contribution in each permutation in the following table. The average marginal contribution of player 1 over all permutations is 47/192.

order   # permutations   S   v(S ∪ {1}) − v(S)
(1,{2,3,4}) 6 {} 0
(2,1,{3,4}) 2 {2} 0
(3,1,{2,4}) 2 {3} 9/16
(4,1,{2,3}) 2 {4} 3/16
({3,4},1,2) 2 {3,4} 9/16
({2,4},1,3) 2 {2,4} 2/16
({2,3},1,4) 2 {2,3} 6/16
({2,3,4},1) 6 {2,3,4} 6/16

Similarly, we get the Shapley value for all players:

player   Shapley value
1 47/192
2 7/192
3 47/192
4 7/192
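These numbers can be checked with a short exact computation (our own sketch; the coalition values below are the ones consistent with the marginal contributions tabulated above):

```python
from fractions import Fraction as F
from itertools import permutations

# Value of each task-1/task-2 pair; a coalition is worth its best pair,
# and a coalition without a player from each task group is worth zero.
PAIR = {frozenset({1, 3}): F(9, 16), frozenset({1, 4}): F(3, 16),
        frozenset({2, 3}): F(3, 16), frozenset({2, 4}): F(1, 16)}

def v(S):
    return max((p for pair, p in PAIR.items() if pair <= set(S)), default=F(0))

def shapley(players):
    # average marginal contribution over all player permutations
    phi = {i: F(0) for i in players}
    perms = list(permutations(players))
    for order in perms:
        before = set()
        for i in order:
            phi[i] += v(before | {i}) - v(before)
            before.add(i)
    return {i: x / len(perms) for i, x in phi.items()}

print(shapley([1, 2, 3, 4]))
# {1: Fraction(47, 192), 2: Fraction(7, 192), 3: Fraction(47, 192), 4: Fraction(7, 192)}
```

The four values sum to 9/16, the value of the grand coalition, as efficiency requires.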
Question 1.

In Example 1, only players 1 and 3 actually perform the tasks, so can we reward only players 1 and 3 according to their Shapley values?

The answer is no. For example, player 2 could misreport a density function that appears better than player 1's, in order to be selected for the task and receive a non-zero Shapley value. Therefore, the reward sharing mechanism cannot exclude players who have not been assigned a task.

Question 2.

In Example 1, can any of the players misreport to gain a higher Shapley value?

The answer is yes. For instance, player 1 can misreport f1′ such that

f1′(1) = 1, f1′(2) = 0. (4)

If the other players still report truthfully, player 1's Shapley value under this misreport is larger than the value 47/192 under the truthful report f1. This kind of misreport is available to all players. The reason is that their Shapley values depend only on what they have reported, not on what they can actually do.

In the next section, we show how to modify the Shapley value to link it to the players' true time distributions.

Truthful Shapley Value

As evident from Question 2, players can report a more promising execution time distribution to receive a higher Shapley value. This is because the Shapley value mechanism never verifies their reports. In reality, we could actually observe how much time a player has spent to accomplish her task. Therefore, we can pay them according to their execution outcomes. A similar approach has been applied in other task allocation settings by using auctions, especially VCG mechanism [Porter et al.2008, Ramchurn et al.2009, Conitzer and Vidali2014, Stein et al.2011, Zhao, Ramchurn, and Jennings2016].

Definition 3.

Given all players' execution time distribution report profile f′, for each coalition S, let π_{S,f′} be the task assignment that defines v(S). π_{S,f′}(tj) = ∅ means that tj has not been assigned to any player under coalition S with reports f′.

Next are some specific notations for the new mechanism.

  • Let v_{f′}(S) be the highest probability to finish all the tasks before the deadline under the report profile f′.

  • Let v_{f′,f}(S) be the probability to finish all the tasks before the deadline given that the task assignment is π_{S,f′} but the probability is calculated under f. That is, v_{f′,f}(S) uses f′ to determine the optimal task assignment under coalition S, but uses f to recalculate the probability without changing the task assignment. In the mechanism, f′ represents the players' reports and f represents what we observe.
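The two quantities can be sketched as follows (our own illustration with hypothetical one-task data; the helper names are ours): the assignment is chosen under the reported pmfs, while the success probability is re-evaluated under the true ones.

```python
from itertools import product

def success_prob(assignment, pmfs, deadline):
    # P(total sequential completion time <= deadline) under the given pmfs
    total = {0: 1.0}
    for player in assignment:
        nxt = {}
        for t, pr in total.items():
            for s, q in pmfs[player].items():
                nxt[t + s] = nxt.get(t + s, 0.0) + pr * q
        total = nxt
    return sum(pr for t, pr in total.items() if t <= deadline)

def optimal_assignment(task_groups, pmfs, deadline):
    # pi_{S,f'}: assignment maximizing the success probability under the reports
    return max(product(*task_groups), key=lambda a: success_prob(a, pmfs, deadline))

# One task, two candidates; player "a" overstates her speed.
reported = {"a": {1: 1.0},           "b": {1: 0.25, 2: 0.75}}
true_pmf = {"a": {1: 0.25, 2: 0.75}, "b": {1: 0.25, 2: 0.75}}
pi = optimal_assignment([["a", "b"]], reported, deadline=1)   # picks "a"
v_reported = success_prob(pi, reported, 1)   # what the reports promise
v_observed = success_prob(pi, true_pmf, 1)   # same assignment, true pmfs
```

Here the report promises success probability 1.0, but the assignment it induces only succeeds with probability 0.25 under the true distribution, which is exactly the gap execution verification exposes.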

Shapley Value with Execution Verification (SEV)   Given all players' report profile f′, for each player i who has been assigned a task under π_{N,f′}, if her realised execution time is ti, then her Shapley value is updated as:

φi^SEV(f′, ti) = Σ_{S ⊆ N∖{i}} (|S|! (n − |S| − 1)! / n!) (v_{f′}^{ti}(S ∪ {i}) − v_{f′}(S)), (5)

where v_{f′}^{ti}(S ∪ {i}) represents the realization and is defined as the probability to finish all the tasks before the deadline when the task assignment is π_{S∪{i},f′} but player i's execution time is fixed to ti. For each player i who has not been assigned any task in the assignment π_{N,f′}, her Shapley value stays the same (as we cannot observe i's execution time):

φi^SEV(f′) = φi(f′). (6)
Theorem 1.

The SEV mechanism is incentive compatible for all players who are assigned a task, but it is not incentive compatible for players who are not assigned a task.

Proof.

For each player i who has been assigned a task, i's reward varies with her execution outcome, but what matters is her expected reward. If i reports her true distribution, i.e., fi′ = fi, her expected reward equals her Shapley value φi(f′). Now if fi′ ≠ fi, can i receive a larger expected reward?

No matter what i reports, i is either assigned or not assigned the task. If i is assigned the task, we show that her expected reward is maximized when she reports fi. For the expected reward, we can look at the expected reward (marginal contribution) in each player permutation. For each permutation, let S be the set of players before i:

  • if i has a non-zero marginal contribution, i.e., v_{f′}(S ∪ {i}) − v_{f′}(S) > 0, it means that i is assigned the task in the coalition S ∪ {i}. Then i's expected marginal contribution is Σ_{ti} fi(ti) v_{f′}^{ti}(S ∪ {i}) − v_{f′}(S), where v_{f′}^{ti}(S ∪ {i}) is the probability to finish all the tasks when i's execution time is fixed to ti, without changing the task assignment. If i reports fi, this expectation coincides with her reported marginal contribution. If i reports an fi′ which dominates fi, then i is still assigned the task, but the expectation stays the same (although the reported value might be increased). If i reports an fi′ such that i is not assigned the task under the coalition S ∪ {i}, then i's marginal contribution becomes zero. Therefore, reporting fi truthfully maximizes i's expected marginal contribution in this permutation.

  • if i has a zero marginal contribution, i.e., v_{f′}(S ∪ {i}) − v_{f′}(S) = 0, it means that i is not assigned the task in the coalition S ∪ {i}. Thus, i's expected marginal contribution in this case is zero. If i reports an fi′ such that i is assigned the task under the coalition S ∪ {i}, then the expected probability would be less than v_{f′}(S). This is because i takes the task from another player and makes the allocation non-optimal. Therefore, i's expected marginal contribution in this case is negative. Hence, reporting fi also maximizes i's expected marginal contribution in this case.

Since the expected marginal contribution in each permutation is maximized when i reports truthfully, her total expected reward is also maximized in this case.

For each player i who is not assigned a task, we can easily find a counterexample where i can misreport to gain a higher reward. We can always find a setting and a permutation where i's marginal contribution is zero. For this permutation, we can change i's reported distribution such that i's marginal contribution becomes greater than zero, while i is still not assigned a task in the grand coalition N. By doing so, the mechanism cannot observe i's execution outcome and simply pays her the standard Shapley value, which is greater than what she would receive by reporting truthfully. ∎

As seen in Theorem 1, players who are not assigned a task can misreport to gain a higher reward under the SEV mechanism, because there is no verification of their reports. It is certainly not ideal, nor practical, to let every capable player attempt each task just for verification. Instead, we carry out the most profitable manipulation on behalf of the players. For each player who is not assigned a task, we treat her as being as good as the player who is assigned the task in her group, and compute her new Shapley value accordingly as her reward. This reward is the best the player could get by misreporting, and therefore there is no incentive to misreport anymore. The updated mechanism is defined as follows.

Shapley Value with Execution Verification and Bonus (SEVB)   Given all players' report profile f′, for each player i who is assigned a task under π_{N,f′}, if her realised execution time is ti, then her Shapley value is defined the same as in SEV (Equation (5)), i.e., φi^SEVB(f′, ti) = φi^SEV(f′, ti). For each player i (say i ∈ Nj) who is not assigned a task in π_{N,f′}, her Shapley value is upgraded to

φi^SEVB(f′) = φi((fw′, f−i′)), (7)

where w = π_{N,f′}(tj), i.e., w is the player who is assigned tj; that is, i is paid the Shapley value she would receive if her report were identical to w's.
Theorem 2.

The SEVB mechanism is incentive compatible.

Proof.

For each player i who is assigned a task, as proved in Theorem 1, there is no incentive to misreport when the players who are not assigned a task are paid according to Equation (6). Now that the unassigned players' rewards have been upgraded to Equation (7), we need to prove that it is not better for i to misreport such that i is no longer assigned the task in the grand coalition.

Assume that i is assigned task tj. If i misreports fi′ such that tj is assigned to another player w, then i's reward will be the upgraded Shapley value obtained by treating i's report as identical to w's report. We know that w's report is not better than i's true report (otherwise, i would not have been assigned). We can prove that i's reward when treated as identical to w is not better than her reward under her true report fi. This can be shown permutation by permutation.

  • If i has a non-zero marginal contribution in a permutation when she reports truthfully (let S be the set of players before i), then when i reports fi′, i is either assigned or not assigned the task in S ∪ {i}. If i is still assigned the task, her marginal contribution is calculated by treating her report as identical to w's, which is clearly not better than reporting truthfully. If i is not assigned the task, her marginal contribution becomes zero, which is again worse than reporting truthfully.

  • If i has a zero marginal contribution in a permutation when she reports truthfully, then the contribution stays the same if she reports fi′.

In summary, player i's reward is not increased when she misreports fi′.

For a player i who is not assigned a task, assume i belongs to task group Nj and w is assigned tj when i reports fi:

  • if i misreports an fi′ that does not dominate w's report fw′, then i is still not assigned task tj and still receives the same reward;

  • if fi′ dominates fw′, then i will be assigned tj. We prove that i's expected reward in this case is not better than reporting fi.

When fi′ dominates fw′, i is assigned tj in the grand coalition and will execute the task. For each permutation where i's marginal contribution is non-zero under fi, her marginal contribution under fi′ can be either zero or non-zero. If it is zero, then her expected reward in this permutation will be negative, because i is forced to change the optimal allocation but cannot increase the value. If it is non-zero, then her expected reward stays the same, as the reward is calculated according to her execution outcomes (her true fi).

Together, we proved that cannot receive a better reward by misreporting. ∎

Since the SEVB mechanism may pay players who did not receive a task more than their actual Shapley values, the total payment might be greater than the total reward the grand coalition can generate. Theorem 3 shows that the extra payment is bounded.

Theorem 3.

Given any execution time distribution report profile f′, the total reward distributed under the SEVB mechanism is bounded by a quantity determined by n and m, where an integer parameter is chosen to maximize the bound given n and m.

To prove Theorem 3, we first need to show some key properties of the Shapley value in our setting.

Figure 1: For proof of Lemma 1
Lemma 1.

Given that there are n players, each of whom can do one of the n tasks, and together they can finish the tasks before the deadline with probability p, assume that player 1 is assigned task t1. Also consider adding one more player, n + 1, for task t1, such that n + 1 will not be assigned t1 when the n + 1 players collaborate together. If player n + 1's Shapley value in the grand coalition is x, then the Shapley value for player 1 is p/n − nx and the Shapley value for each of the other players is p/n + x. Also, the total reward distributed under the SEVB mechanism for the n + 1 players is maximized when n + 1 is a dummy player, i.e., x = 0.

Proof.

Under the computation of the Shapley value, for each permutation of the n + 1 players, if player n + 1's marginal contribution is non-zero, then player 1 must be the last player and n + 1 the second last player in the permutation (otherwise, n + 1's marginal contribution is zero, see Figure 1). If we simply switch player n + 1 with any player before her, then under the new permutation the last player's marginal contribution is unchanged (Figure 1). Clearly, if we change the order of all the players before the last player, we get different permutations, and in each of them the last player's marginal contribution is the same. Similarly for player 1, we also get (n − 1)! different permutations where player 1's marginal contribution is the same.

Before the addition of player n + 1, for each permutation of the n players, only the last player has a marginal contribution p. Therefore, for each player, we have (n − 1)! different permutations where she is the last player with marginal contribution p. We have in total n! permutations, hence each player's Shapley value is p/n.

After adding player n + 1, for each original permutation of the n players, we can insert n + 1 in between the players, and each such insertion creates a new permutation in which the marginal contribution of the last player (last according to the original permutation) is preserved for every player other than player 1 (Figure 1). The addition of a player increases the total number of permutations from n! to (n + 1)!; however, the number of permutations in which such a player's marginal contribution is p increases from (n − 1)! to (n + 1)!/n. Therefore, the Shapley value p/n of every player except player 1 is transferred from the setting of n players to the setting of n + 1 players. In addition, as depicted in Figure 1, each such player also receives whatever player n + 1 has in terms of Shapley value: if player n + 1's Shapley value is x, then the same value x is also added to her Shapley value, giving p/n + x. Since the sum of all the players' Shapley values is p, we get that player 1's Shapley value is p/n − nx.

Under the SEVB mechanism, all players except n + 1 receive a reward equal to their Shapley value. For player n + 1, her reward under SEVB is computed by assuming that n + 1 has the same distribution as player 1. Since players 1 and n + 1 are then identical, their Shapley values are the same. Now, for each permutation, n + 1's marginal contribution is p if and only if n + 1 is the second last player and player 1 is the last player. Thus, the total number of permutations where n + 1's marginal contribution is p is (n − 1)!, leading to a Shapley value of p/(n(n + 1)) for each of players 1 and n + 1. Hence, the total reward distributed under SEVB is

(p/n − nx) + (n − 1)(p/n + x) + p/(n(n + 1)).

To simplify it, we get p − x + p/(n(n + 1)), which is maximized when x = 0. ∎

Lemma 2.

Given n players for m tasks, assume that there are n − m + 1 players for task t1 and one player for each of the other tasks, and that the players together can finish the tasks before the deadline with probability p, where t1 is allocated to player 1. Then the total reward distributed under the SEVB mechanism is maximized when all the other players for t1 are dummy players, i.e., their Shapley value under the grand coalition is zero.

Lemma 3.

Given n players for m tasks, assume that there are k identical players for task t1, l players for task t2 (all of whom except one are dummy players), and one player for each of the other tasks, and that the players together can finish the tasks before the deadline with probability p. The total reward distributed under the SEVB mechanism in this case is given in the appendix.

The proofs of Lemma 2 and 3 are given in the appendix.

Proof of Theorem 3.

From Lemma 2, we know that when there is only one player for each task except one, the worst case happens when all the extra players for that task are dummy players. The total reward distributed in this case follows from Lemma 3 (set l = 1 in Lemma 3). However, this is not necessarily the worst case in general. It is evident that simply moving a dummy player from one task group to another does not change the dummy player's reward under the SEVB mechanism.

Interestingly, moving a dummy player to another task group and making her non-dummy may increase the total reward distributed under the SEVB mechanism. Assume that initially all dummy players are for task t1, and that we move one dummy player from t1 to t2. To make her non-dummy, assume that her Shapley value becomes y (but she does not dominate the original player for t2; otherwise the value p would increase and make the two settings incomparable). Hence, the total reward for the non-dummy players is reduced by y due to the change (a property of the Shapley value). The reward for the moved player stays the same. The key difference is the reward of the other dummy players for t1. Following the proof of Lemma 3, the new Shapley value for these dummy players increases accordingly. Thus, the total reward increase after moving the player is

(8)

If the increase is positive, then the difference is maximised when y is maximised, which happens when the moved player is identical to the original player for t2. Therefore, if moving a dummy player from t1 to another task group increases the total reward, then the worst case happens when the dummy player is identical to the best player in the new group.

If we keep moving dummy players from t1 to other groups and this increases the total reward, then we can again argue that the worst case arises when each moved player is identical to the best player in her new group. Moreover, following the analysis of Lemma 3, the reward stays the same even if the dummy players moved out of t1's group do not all go to the same task group. Thus, we can keep moving dummy players to the same task and check whether this increases the reward, stopping when it does not. When this stops, we reach the worst-case setting, which is the setting given by Lemma 3. ∎

The VCG with Verification

Our coalitional game setting can also be treated as an auction setting, to which standard auction mechanisms apply. This section shows that VCG can be adapted with execution verification to satisfy incentive compatibility in our setting.

VCG with Execution Verification (VCGEV)   Given all players' report profile f′, for each player i, if i is assigned a task under π_{N,f′} and her realised execution time is ti, then i's reward is defined as:

ri^VCGEV(f′, ti) = v_{f′}^{ti}(N) − v_{f′}(N∖{i}), (9)

where v_{f′}^{ti}(N) is the probability to finish all the tasks before the deadline when the task assignment is π_{N,f′} but player i's execution time is fixed to ti; otherwise, i's reward is zero.

VCG is often applied in a non-cooperative setting, so the reward for each player is her marginal contribution given that all the other players are already in the coalition. A player's reward equals her marginal contribution in the one permutation where she is the last to join the group. Therefore, if two equally good players are in the same task group, neither of them is rewarded. On the other hand, if only one player can handle each task and the others are not capable, then each player who is assigned a task gets paid the full value v_{f′}(N). Hence, the reward distribution under VCG does not account for players' contributions as thoughtfully as the Shapley value does. Nonetheless, the VCG-based mechanism is incentive compatible.
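To illustrate, using the coalition values consistent with the table of Example 1 (our own reconstruction), each assigned player's expected VCGEV reward is her marginal contribution to the grand coalition:

```python
from fractions import Fraction as F

# Coalition values consistent with Example 1's table: a coalition is worth
# its best task-1/task-2 pair, and zero if it cannot staff both tasks.
PAIR = {frozenset({1, 3}): F(9, 16), frozenset({1, 4}): F(3, 16),
        frozenset({2, 3}): F(3, 16), frozenset({2, 4}): F(1, 16)}

def v(S):
    return max((p for pair, p in PAIR.items() if pair <= set(S)), default=F(0))

N = {1, 2, 3, 4}
expected_reward = {i: v(N) - v(N - {i}) for i in N}
print(expected_reward)
# assigned players 1 and 3 each get 9/16 - 3/16 = 3/8; players 2 and 4 get 0
```

Note the total, 3/4, exceeds v(N) = 9/16: VCG payments need not sum to the value the coalition actually generates.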

Theorem 4.

The VCGEV mechanism is incentive compatible.

Proof.

If a player  is assigned a task, then her expected reward is . Here  is the probability of finishing all the tasks without ’s participation, which is independent of . The expectation of  is , which equals  if  reports truthfully. Since , ’s reward is non-negative when she reports truthfully.

Suppose that when  misreports , we have . Since  uses ’s true execution outcome to calculate the reward, the only thing the misreport can influence is the task assignment . If assignment  indeed gave a higher expected probability of finishing all the tasks, then  would have chosen that assignment. However, we know  is optimal, which is a contradiction.

If a player is not assigned a task, then her marginal contribution is zero. If she misreports to be assigned a task, her expected reward becomes negative (following the above analysis). ∎

Theorem 5.

Given any execution time distribution report profile , the total reward distributed under the VCGEV mechanism can be as small as zero and as large as .

Proof.

For each player who is assigned a task, if there is another identical player in the same task group, then the player’s reward under the VCGEV mechanism is zero. Therefore, in the worst case, all players receive zero reward.

In the other extreme case, where there is only one player for each task, the reward for each player is , which is the maximal reward a player can get in any setting. Hence, the maximal total reward under the mechanism is . ∎

It is worth mentioning that there is another simple mechanism that is also truthful and distributes exactly the reward the players have generated: it simply shares the reward equally among all players after the whole project has been executed. More specifically, each player receives  if the project is finished before the deadline, and zero otherwise. It is easy to prove that this mechanism is IC. Its weakness, however, is that it does not pay players according to their marginal contributions/capabilities.
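A minimal sketch of this equal-split baseline, with hypothetical names. The realised outcome (whether the project met the deadline) is the only input affecting payments, which is why reports cannot influence them:

```python
def equal_share_rewards(players, total_reward, finished_by_deadline):
    """Equal split of the realised reward: every player receives
    total_reward / n if the whole project met the deadline, and zero
    otherwise. Reports never enter the computation, only execution."""
    share = total_reward / len(players) if finished_by_deadline else 0.0
    return {p: share for p in players}

print(equal_share_rewards(["a", "b", "c", "d"], 1.0, True))
# -> each player gets 0.25
print(equal_share_rewards(["a", "b", "c", "d"], 1.0, False))
# -> each player gets 0.0
```

Note that the payment is independent of marginal contribution: a dummy player and the most capable player receive identical shares, which is exactly the weakness mentioned above.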

Experiments

We evaluate the proposed mechanisms on many instances that differ in the task allocation settings, completion-time distributions, and deadlines. We generate random distributions with support size , given as an input, such that the completion times for each player lie in the range [1…] with uniform distribution.

Here we present results for settings with ,  and  tasks, and with different numbers of players. In theory, the total reward bound under the SEVB mechanism is roughly linear in the number of players , whereas the upper bound of the VCGEV mechanism is independent of . In the experiments with  tasks and  players per task group, Figure 1(a) shows the total distributed reward as a proportion of  for  random instances. The results show that VCGEV varies a lot across settings, while SEVB is fairly stable. As the theory predicts, the bound of SEVB increases as  increases. We tested  task settings with the number of players per task varying from  to . The results in Figure 1(b) show that, for  random instances, the average proportion over  actually decreases as  increases (because more players for the same task reduce their Shapley values). A similar trend holds for the VCGEV mechanism (Figure 1(c)). We also examine settings with varying numbers of players across tasks. Figure 1(d) shows the percentage over  under SEVB for  tasks with ,  and  player settings on  instances. It indicates that the more imbalanced the players are between tasks, the smaller the total reward.
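The experimental pipeline above can be sketched as a small Monte Carlo simulation. This is not the authors’ code: the function names (`random_time_distribution`, `deadline_success_prob`) and the use of random probability weights are assumptions, as the paper only specifies discrete supports over integer times in a bounded range.

```python
import random

def random_time_distribution(support_size, max_time, rng):
    """A random discrete completion-time distribution: `support_size`
    distinct integer times drawn from [1, max_time], with random
    probability weights normalised to sum to 1."""
    times = rng.sample(range(1, max_time + 1), support_size)
    weights = [rng.random() for _ in times]
    total = sum(weights)
    return {t: w / total for t, w in zip(times, weights)}

def deadline_success_prob(distributions, deadline, rng, samples=20000):
    """Monte Carlo estimate of the probability that the tasks, executed
    in sequence (one distribution per assigned player), finish by
    `deadline`."""
    hits = 0
    for _ in range(samples):
        elapsed = 0
        for dist in distributions:
            times, probs = zip(*dist.items())
            elapsed += rng.choices(times, weights=probs)[0]
        if elapsed <= deadline:
            hits += 1
    return hits / samples

rng = random.Random(0)
dists = [random_time_distribution(3, 8, rng) for _ in range(4)]
print(deadline_success_prob(dists, deadline=20, rng=rng))
```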

[Figure omitted: panels plot total distributed reward as a proportion of the baseline for (a) 4 tasks with 4 players each, (b) 2 tasks under SEVB with 2–8 players per task, (c) 2 tasks under VCGEV with 2–8 players per task, and (d) 3 tasks with 9 players in the settings [3, 3, 3], [4, 3, 2], [5, 3, 1] under SEVB.]
Figure 2: Experimental results for SEVB and VCGEV

Conclusion

We have studied a task allocation setting that merges the information revelation challenge from mechanism design with the payoff distribution challenge from cooperative game theory. The two challenges cannot be solved using existing techniques from just one of the two fields. We proposed the very first attempt to solve the two challenges together. The solution guarantees that players truthfully reveal their private information and that the rewards they receive from the coalition are fairly distributed. The cost of achieving this is that we might need to distribute more reward to the players than what they can generate. However, the extra reward is bounded, and the experiments showed that it diminishes as more players are involved. We have not yet investigated other solution concepts such as the core [Aumann and Hart1992].

Appendix A Proofs

Proof of Lemma 2.

From Lemma 1, we know that when , this lemma holds. Now if , we need to add one more player for task . Call the three players for  ,  and , where  is the player assigned task ,  is a dummy player, and  is the newly added player. Since  is a dummy player, its Shapley value is zero. If the Shapley value for  is , then following the proof of Lemma 1, the Shapley value of each player  is  and the Shapley value of  is ; these are also their rewards distributed under the SEVB mechanism. The reward for  under the SEVB mechanism is , but the reward for  is  (because  creates competition for , which decreases ’s reward). Therefore, the total reward distributed under the SEVB mechanism is

which is . This is maximised when . Following this, we can prove that for any , the lemma holds. ∎
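The permutation-based Shapley arguments in these proofs can be sanity-checked with a brute-force computation. A minimal sketch, assuming a generic characteristic function `value`; the toy game below (only players a and c are needed for the grand coalition to succeed) is my own illustration, not from the paper.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values by averaging each player's marginal
    contribution over all |N|! join orders (feasible only for
    small games, since the cost grows factorially)."""
    phi = {p: 0.0 for p in players}
    count = 0
    for order in permutations(players):
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            phi[p] += value(frozenset(coalition)) - before
        count += 1
    return {p: v / count for p, v in phi.items()}

# b is a dummy (never changes the value), so its Shapley value is 0;
# a and c split the grand-coalition value equally by symmetry.
def v(coalition):
    return 1.0 if {"a", "c"} <= coalition else 0.0

print(shapley_values(["a", "b", "c"], v))
# -> roughly {'a': 0.5, 'b': 0.0, 'c': 0.5}
```

This mirrors the proof technique: a player's value accrues only in the permutations where her arrival changes the coalition's worth, and dummy players can be inserted anywhere without effect.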

Proof of Lemma 3.

Let players for be and the non-dummy player for be . Let us compute the Shapley value for the player for task .

If , we have  non-dummy players (one for each task), so the marginal contribution of  is  only when  is ordered last among the non-dummy players in a permutation (see Figure 2(a), permutation type ; dummy players can be placed anywhere in the permutation). Hence, ’s Shapley value is .

If , for permutation type , we can insert  anywhere in the permutation without changing ’s marginal contribution. In addition, adding player  increases ’s marginal contribution for the permutations where  is not ordered last (see Figure 2(a), permutation type , where  is ordered last). Without , ’s marginal contribution under permutation type  is zero; because of , it becomes . Therefore, when , the new Shapley value of  is .

If , for permutation types  and , we can insert  anywhere without changing ’s marginal contribution, so the Shapley value under  stays  for . In addition, adding  increases ’s marginal contribution from zero to  for the permutations where  is ordered before  and  (see Figure 2(a), permutation type ). If we ignore all the dummy players, among all the permutations there are  type  permutations. Hence, ’s Shapley value is increased by  by adding .

Following the above analysis, for any , we get the Shapley value for is