1 Introduction
The increasing penetration of distributed energy resources (DER) poses challenges to distribution network operation. One of the most important topics in recent research is how to maintain the reliability of energy supply while encouraging distributed renewable generation, which is highly variable and intermittent, see Skea et al. (2007). Curtailment is applied in some networks with high renewable generation, see Jacobsen and Schroder (2012). However, it introduces inefficiency into the energy system and financially penalizes owners of renewable resources. Centralized control of DER has also been proposed in various studies, but these tend to overlook the fact that prosumers, proactive consumers with distributed energy resources who actively control their energy behavior, are independent entities who need incentives to participate in such a centralized control scheme, see Morstyn et al. (2018). Peer-to-peer (P2P) energy sharing has gained considerable attention in both industry and academia in recent years, as it is considered a key market strategy to financially encourage efficient local management of DER, see Parag and Sovacool (2016).
A key feature of a P2P sharing scheme is its ability to use local flexibility to offset generation uncertainty. Local flexibility often takes the form of energy storage (ES), which can be modeled in a similar way to other types of flexible demand, see Sajjad et al. (2016). This paper makes the common assumption that the price to export energy to the energy network is lower than the price to import, see Zhou et al. (2018); hence, for a single prosumer, the benefit of flexibility is directly reflected in their energy bill when they increase the local usage of their own generation by optimally scheduling their ES. In a P2P market, more joint benefit can be reaped from matching local flexibility with variable generation among all the participants. At the same time, however, it becomes a challenge to allocate the benefit to each participant in an efficient and fair way.
Game theory has been adopted in some recent research to examine how financial incentives affect prosumer behavior. Dynamic pricing coupled with non-cooperative game theory is one of the most popular topics, see Jia and Tong (2016), but it fails to demonstrate consistent benefit for every participant, see Han et al. (2019). Cooperative game theory has been proposed as an alternative approach, and Han et al. (2018) show that financial rewards can be fairly allocated using the Shapley value, which is based on the contribution each prosumer makes to the joint scheme.
A player’s Shapley value in a cooperative game is a weighted average of their marginal contributions to all the possible coalitions among all players, see Shapley (1971). For an $N$-player game, there are $2^N$ possible coalitions, which means that the computation of the Shapley value becomes intractable as the number of players increases. The estimation of the Shapley value has been explored in previous literature, with sampling being the main methodology. An example is the cooperative scheme described in Chapman et al. (2017), but the model is constructed as a simple game with a binary outcome representing whether a coalition of battery-owning households can overcome a hard network constraint. This scheme overlooks the contributions made by local generation or ES units that are not big enough to switch the binary outcome.
To improve the scalability of the P2P cooperative game proposed in Han et al. (2018), this paper identifies a random sampling method as a way to estimate the Shapley value. The method was proposed in Castro et al. (2009), and then modified in Castro et al. (2017) by adding a stratification step to the sampling, improving the accuracy of the estimation. This paper adapts this stratified sampling method by further creating coalitional strata for better performance in this specific application. The proposed sampling method enables the P2P game to scale up, and we are then able to analyze the impact of different DER adoption rates on prosumer profitability. Some interesting findings are shown in the case studies.
2 P2P Cooperative Game
In an $N$-player cooperative game, the grand coalition is defined as the group of all players. Any subset of the grand coalition is called a coalition. The basic framework of the P2P cooperative game proposed in Han et al. (2018) involves three main steps. Step 1 is to cooperatively manage DER within all coalitions, which requires optimally scheduling the ES units to minimize the coalitional energy cost, see Subsection 2.1. Step 2 is to quantify the value of forming each coalition, see Subsection 2.2. Step 3 is to divide the total energy cost savings from forming the grand coalition among all the players based on certain criteria, see Subsection 2.3.
2.1 Coalitional Energy Management
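We index each prosumer by $i$ and the grand coalition by $\mathcal{N} = \{1, 2, \dots, N\}$. If we consider $K$ timesteps ($k = 1, \dots, K$) with a time interval of $\Delta t$, the total energy cost of a coalition $S \subseteq \mathcal{N}$ can be written as

$$F_S(b) = \sum_{k=1}^{K} \Big\{ p^b_k \Big[ \sum_{i \in S} (q_{ik} + b_{ik}) \Big]^+ + p^s_k \Big[ \sum_{i \in S} (q_{ik} + b_{ik}) \Big]^- \Big\}$$

where subscripts $i$ and $k$ are indices for the player and the timestep respectively. The known inputs are $p^b_k$, $p^s_k$, and $q_{ik}$, which are the electricity import price (£/kWh), the electricity export price (£/kWh), and the net energy consumption (positive) or generation (negative) (kWh) without ES. The variables are $b_{ik}$: the ES charge (positive) or discharge (negative) energy (kWh). We also define the operations $[z]^+ = \max\{z, 0\}$ and $[z]^- = \min\{z, 0\}$.

With the assumption $p^b_k \ge p^s_k$, we can schedule all the ES units’ operation within coalition $S$ to minimize the coalitional energy cost $c(S)$, which is defined as

$$c(S) = \min_{b} F_S(b)$$

subject to, for all $i \in S$ and $k = 1, \dots, K$,

$$-\underline{b}_i \le b_{ik} \le \bar{b}_i \tag{1}$$

$$0 \le e_i \, SoC^0_i + \sum_{t=1}^{k} \Big( \eta^{c}_i [b_{it}]^+ + \frac{[b_{it}]^-}{\eta^{d}_i} \Big) \le e_i \tag{2}$$

$$\sum_{k=1}^{K} \Big( \eta^{c}_i [b_{ik}]^+ + \frac{[b_{ik}]^-}{\eta^{d}_i} \Big) = 0 \tag{3}$$

where (1), (2), and (3) represent the ES power constraint, energy constraint, and cycle constraint respectively. We consider that each prosumer $i$’s ES system has an energy capacity of $e_i$, a charge limit of $\bar{b}_i$ and a discharge limit of $\underline{b}_i$ over the time span of $\Delta t$, a charge efficiency of $\eta^{c}_i$ and a discharge efficiency of $\eta^{d}_i$, and an initial state of charge of $SoC^0_i$. For a prosumer who does not own an ES system, we set their energy capacity and charge/discharge limits all to zero.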
Fig. 1 demonstrates the effect of cooperative ES operation in a 16-prosumer scenario. As shown in Fig. 1(d), the cooperative ES operation tends to flatten the load, because it matches consumption with generation within the coalition to minimize the coalitional energy cost.
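To make the cost expression of this subsection concrete, the following minimal Python sketch evaluates the coalitional energy cost for a given ES schedule. It does not solve the minimization itself. The function name `coalition_cost`, the dictionary layout, and the two-timestep data are illustrative assumptions; only the two prices are taken from the case-study inputs in Section 4.

```python
from typing import Dict, List, Sequence


def coalition_cost(
    coalition: Sequence[int],
    q: Dict[int, List[float]],  # net energy per player and timestep (kWh), + = consumption
    b: Dict[int, List[float]],  # ES energy per player and timestep (kWh), + = charge
    p_buy: List[float],         # import price per timestep (GBP/kWh)
    p_sell: List[float],        # export price per timestep (GBP/kWh)
) -> float:
    """Energy cost of a coalition for a given (not necessarily optimal) ES schedule."""
    cost = 0.0
    for k in range(len(p_buy)):
        net = sum(q[i][k] + b[i][k] for i in coalition)
        # Imports are charged at p_buy; exports (negative net) earn p_sell.
        cost += p_buy[k] * max(net, 0.0) + p_sell[k] * min(net, 0.0)
    return cost


# Hypothetical data: player 0 only consumes, player 1 exports PV in the second timestep.
q = {0: [1.0, 1.0], 1: [0.5, -2.0]}
b_idle = {0: [0.0, 0.0], 1: [0.0, 0.0]}  # storage left idle
p_buy, p_sell = [0.1681, 0.1681], [0.0485, 0.0485]

separate = (coalition_cost([0], q, b_idle, p_buy, p_sell)
            + coalition_cost([1], q, b_idle, p_buy, p_sell))
together = coalition_cost([0, 1], q, b_idle, p_buy, p_sell)
print(f"separate: {separate:.5f}, together: {together:.5f}")
```

In this toy case the coalition matches 1 kWh of player 1's export with player 0's demand in the second timestep, so the joint cost is lower by the spread between the import and export prices, even with no storage action.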
2.2 Value of Coalitions
The purpose of using cooperative game theory is to establish a framework to quantify the benefit of cooperation, and then to allocate the benefit to the participants efficiently. The coalitional energy cost provides a natural metric to evaluate a coalition’s performance. Here, we define the value $v(S)$ of a coalition $S$ as the energy cost savings obtained by forming the coalition. This is given by the difference between the sum of the energy costs incurred by each prosumer in $S$ when they schedule the ES systems individually, and the minimum coalitional energy cost of $S$ when the prosumers schedule their ES systems collectively:

$$v(S) = \sum_{i \in S} c(\{i\}) - c(S)$$

By this definition, the value of the grand coalition, $v(\mathcal{N})$, is the total energy cost savings of a P2P cooperative game, which is the total amount of payoffs we can award to all the participants.
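The definition of the coalition value can be sketched in a few lines of Python. The optimizer is replaced here by a hypothetical stand-in, `toy_min_cost`, which assumes a single timestep and no storage, so that the minimum coalitional cost is simply the cost of the netted energy; the names and the net-energy figures are assumptions made for illustration.

```python
from typing import Callable, FrozenSet, Iterable

Coalition = FrozenSet[int]


def coalition_value(coalition: Iterable[int],
                    min_cost: Callable[[Coalition], float]) -> float:
    """Savings from cooperating: sum of stand-alone costs minus the coalitional cost."""
    members = frozenset(coalition)
    standalone = sum(min_cost(frozenset({i})) for i in members)
    return standalone - min_cost(members)


# Hypothetical one-timestep game without storage (kWh, + = consumption).
NET = {0: 1.0, 1: -2.0, 2: 0.5}
P_BUY, P_SELL = 0.1681, 0.0485  # GBP/kWh, from the case-study inputs


def toy_min_cost(members: Coalition) -> float:
    """Stand-in for the ES scheduling problem: cost of the coalition's netted energy."""
    net = sum(NET[i] for i in members)
    return P_BUY * max(net, 0.0) + P_SELL * min(net, 0.0)


v_single = coalition_value({0}, toy_min_cost)
v_pair = coalition_value({0, 1}, toy_min_cost)
v_grand = coalition_value({0, 1, 2}, toy_min_cost)
print(v_single, v_pair, v_grand)
```

A single prosumer has zero value by construction, and in this stand-in game the value grows with the amount of generation that is matched with consumption inside the coalition.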
2.3 Prosumer Payoffs and Shapley Value
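The final step in the cooperative game framework is the allocation of payoffs. We use the vector $x \in \mathbb{R}^N$ as the payoff allocation, whose $i$-th entry $x_i$ represents the payment to prosumer $i$. One important payoff allocation is the Shapley value, denoted as $\phi(v)$, see Shapley (1971), which represents each player’s weighted average marginal contribution to all possible coalitions within the game:

$$\phi_i(v) = \sum_{S \subseteq \mathcal{N} \setminus \{i\}} \frac{|S|! \, (N - |S| - 1)!}{N!} \, \big[ v(S \cup \{i\}) - v(S) \big] \tag{4}$$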
The Shapley value also satisfies the following axioms:

1. (Efficiency) $\sum_{i \in \mathcal{N}} \phi_i(v) = v(\mathcal{N})$. This requires the entirety of the value created by the grand coalition to be allocated to the players.

2. (Individual Rationality) $\phi_i(v) \ge v(\{i\})$ for all $i \in \mathcal{N}$. This ensures that no player is penalized for cooperating.

3. (Symmetry) If $v(S \cup \{i\}) = v(S \cup \{j\})$ for all $S \subseteq \mathcal{N} \setminus \{i, j\}$, then $\phi_i(v) = \phi_j(v)$. This means that two players should be assigned the same Shapley value if they have the same marginal contributions to all the coalitions.

4. (Dummy Axiom) If $v(S \cup \{i\}) = v(S)$ for all $S \subseteq \mathcal{N} \setminus \{i\}$, then $\phi_i(v) = 0$. Therefore, a player’s Shapley value should be zero if they add zero marginal value to every coalition.

5. (Additivity) If $u$ and $v$ are characteristic functions, then $\phi(u + v) = \phi(u) + \phi(v)$. This indicates that the Shapley value of two games played at the same time should be the sum of the two games’ Shapley values when played separately.
Axiom (1) guarantees that all the profits allocated to the prosumers add up to the total energy cost savings from the grand coalition. In our P2P cooperative game, a prosumer acting alone achieves no savings, $v(\{i\}) = 0$, so Axiom (2) requires $\phi_i(v) \ge 0$. Axioms (3) and (4) ensure the ‘fairness’ of the payoff allocation. Axiom (5) is not actively used in this paper as the P2P cooperative game is the only game discussed here.
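As an illustration of (4) and of the efficiency and symmetry axioms, the sketch below computes the Shapley value by enumerating every coalition. It is a brute-force reference for small games only. The toy value function (one timestep, no storage) and all names are assumptions made for this example, not the model used in the case studies.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet, List

NET = {0: 1.0, 1: 1.0, 2: -2.0}  # hypothetical net energy (kWh), + = consumption
P_BUY, P_SELL = 0.1681, 0.0485   # GBP/kWh


def cost(members) -> float:
    net = sum(NET[i] for i in members)
    return P_BUY * max(net, 0.0) + P_SELL * min(net, 0.0)


def value(members: FrozenSet[int]) -> float:
    """Toy coalition value: stand-alone costs minus the netted coalition cost."""
    return sum(cost({i}) for i in members) - cost(members)


def exact_shapley(players: List[int],
                  value: Callable[[FrozenSet[int]], float]) -> Dict[int, float]:
    """Shapley value by full enumeration of coalitions, following (4)."""
    n = len(players)
    phi = {}
    for i in players:
        others = [j for j in players if j != i]
        total = 0.0
        for size in range(n):
            # Weight |S|! (N - |S| - 1)! / N! for coalitions S of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


phi = exact_shapley([0, 1, 2], value)
print(phi)
```

The two identical consumers receive the same payoff, and the three payoffs add up to the value of the grand coalition. The inner loops visit $2^{N-1}$ coalitions per player, which is the cost that motivates the sampling approach.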
The Shapley value offers a way to incentivize prosumers to participate in this cooperative scheme, improving the local energy supply reliability while encouraging the efficient use of distributed renewable generation. However, the scalability of the proposed model is very limited because the Shapley value’s computational time increases exponentially with the size of the grand coalition. The following section looks into a sampling method to estimate the Shapley value to reduce the model’s computational complexity.
3 Estimation of Shapley Value
The scalability of the P2P cooperative game model is mainly limited by the sheer number of cost minimization problems that need to be solved. This number is equal to the number of possible coalitions, $2^N$, where $N$ is the number of participating prosumers. Since the Shapley value is the weighted average of a player’s marginal contributions, sampling is identified as a promising estimation technique to be applied in our P2P cooperative game.
3.1 Stratified Random Sampling
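The conventional definition of the Shapley value is expressed in (4). Weber (1977) provided an alternative definition expressed in terms of all possible orders of the players, which was then adopted by Castro et al. (2009) to develop a random sampling method to estimate the Shapley value. In this approach, $\pi(\mathcal{N})$ is defined as the set of all possible permutations of the player set $\mathcal{N}$, and $O \in \pi(\mathcal{N})$ as a permutation that assigns player $O(l)$ to position $l$. For a given $O$, the set of predecessors of player $i$ is denoted as $\mathrm{Pre}^i(O) = \{O(1), \dots, O(l-1)\}$, where $i = O(l)$; if $l = 1$, $\mathrm{Pre}^i(O) = \emptyset$. Player $i$’s marginal contribution is $x(O)_i = v(\mathrm{Pre}^i(O) \cup \{i\}) - v(\mathrm{Pre}^i(O))$. The alternative definition of the Shapley value can be written as

$$\phi_i(v) = \frac{1}{N!} \sum_{O \in \pi(\mathcal{N})} x(O)_i \tag{5}$$

Since $v(\mathrm{Pre}^i(O) \cup \{i\})$ and $v(\mathrm{Pre}^i(O))$ are equally weighted for all $O \in \pi(\mathcal{N})$ in (5), the Shapley value can be estimated using the unweighted mean of $x(O)_i$ over a set $M$ of permutations sampled at random from $\pi(\mathcal{N})$:

$$\hat{\phi}_i(v) = \frac{1}{|M|} \sum_{O \in M} x(O)_i \tag{6}$$

The alternative definition of the Shapley value is just a special case where $M = \pi(\mathcal{N})$.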
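The permutation-based estimator in (6) can be sketched as follows. This is a minimal illustration rather than the implementation used in this paper: it reuses each sampled permutation for all players, and the toy value function (one timestep, no storage), the seed, and the sample count are assumptions.

```python
import random
from typing import Callable, Dict, FrozenSet, List

NET = {0: 1.0, 1: 1.0, 2: -2.0}  # hypothetical net energy (kWh), + = consumption
P_BUY, P_SELL = 0.1681, 0.0485   # GBP/kWh


def cost(members) -> float:
    net = sum(NET[i] for i in members)
    return P_BUY * max(net, 0.0) + P_SELL * min(net, 0.0)


def value(members: FrozenSet[int]) -> float:
    return sum(cost({i}) for i in members) - cost(members)


def sampled_shapley(players: List[int],
                    value: Callable[[FrozenSet[int]], float],
                    num_samples: int,
                    seed: int = 0) -> Dict[int, float]:
    """Average marginal contribution over randomly drawn player orders."""
    rng = random.Random(seed)
    totals = {i: 0.0 for i in players}
    for _ in range(num_samples):
        order = list(players)
        rng.shuffle(order)  # one uniformly random permutation
        predecessors: FrozenSet[int] = frozenset()
        prev_value = value(predecessors)
        for i in order:
            with_i = predecessors | {i}
            v_with = value(with_i)
            totals[i] += v_with - prev_value  # marginal contribution of i
            predecessors, prev_value = with_i, v_with
    return {i: totals[i] / num_samples for i in players}


estimate = sampled_shapley([0, 1, 2], value, num_samples=2000)
print(estimate)
```

Because the marginal contributions along any single order telescope to the value of the grand coalition, this variant allocates the full value regardless of the sample size; only the split between players is subject to sampling error.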
Using (6), the Shapley value can be estimated by randomly sampling player permutations, see Castro et al. (2009). To improve the estimation accuracy, Castro et al. (2017) proposed a stratified random sampling approach that divides the population of all player permutations into sub-populations in which a given player has the same number of predecessors. This stratified random sampling method proceeds as follows.

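1. A stratum, or a stratified set of player permutations, is defined as $P_{il} = \{O \in \pi(\mathcal{N}) : O(l) = i\}$. Therefore, $P_{il}$ contains every permutation $O$ in which player $i$ is in position $l$. Player $i$’s mean marginal contribution of each stratum is

$$\phi_{il} = \frac{1}{(N-1)!} \sum_{O \in P_{il}} x(O)_i \tag{7}$$

2. A random permutation sample $M_{il}$ of size $m_{il}$ is obtained with replacement from each stratum $P_{il}$.

3. Adapted from (7), player $i$’s mean marginal contribution of the samples from each stratum is

$$\hat{\phi}_{il} = \frac{1}{m_{il}} \sum_{O \in M_{il}} x(O)_i \tag{8}$$

The estimated Shapley value can then be calculated as $\hat{\phi}_i = \frac{1}{N} \sum_{l=1}^{N} \hat{\phi}_{il}$.

We notice that in (8), $x(O)_i$ are equally weighted for all $O \in M_{il}$. Because each player set (coalition) $\mathrm{Pre}^i(O)$ appears $(l-1)! \, (N-l)!$ times for a given $P_{il}$, all these coalitions have the same probability of being sampled from $P_{il}$ into $M_{il}$. We define the coalitional stratum $\mathcal{S}_{il}$ as the set of coalitions $S \subseteq \mathcal{N} \setminus \{i\}$, and $|S| = l - 1$. We then obtain a random sample $M^{\mathcal{S}}_{il}$ of size $m_{il}$ with replacement from $\mathcal{S}_{il}$, and because the order of players does not matter in a coalition, $M^{\mathcal{S}}_{il}$ can be considered a combination sample. (8) can be rewritten as

$$\hat{\phi}_{il} = \frac{1}{m_{il}} \sum_{S \in M^{\mathcal{S}}_{il}} \big[ v(S \cup \{i\}) - v(S) \big] \tag{9}$$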
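A minimal sketch of the coalitional stratified estimator in (9) is given below, using an equal sample size for every stratum. The equal allocation, the toy value function (one timestep, no storage), and all names are assumptions for illustration; the sample allocation actually used is discussed in the next subsection.

```python
import random
from typing import Callable, Dict, FrozenSet, List

NET = {0: 1.0, 1: 1.0, 2: -2.0}  # hypothetical net energy (kWh), + = consumption
P_BUY, P_SELL = 0.1681, 0.0485   # GBP/kWh


def cost(members) -> float:
    net = sum(NET[i] for i in members)
    return P_BUY * max(net, 0.0) + P_SELL * min(net, 0.0)


def value(members: FrozenSet[int]) -> float:
    return sum(cost({i}) for i in members) - cost(members)


def stratified_shapley(players: List[int],
                       value: Callable[[FrozenSet[int]], float],
                       samples_per_stratum: int,
                       seed: int = 0) -> Dict[int, float]:
    """Estimate each player's Shapley value from coalitions sampled by size."""
    rng = random.Random(seed)
    n = len(players)
    estimate = {}
    for i in players:
        others = [j for j in players if j != i]
        stratum_means = []
        for size in range(n):  # size = l - 1 predecessors, for positions l = 1..N
            total = 0.0
            for _ in range(samples_per_stratum):
                # A uniformly random coalition of the given size that excludes i;
                # independent draws mean coalitions are sampled with replacement.
                s = frozenset(rng.sample(others, size))
                total += value(s | {i}) - value(s)
            stratum_means.append(total / samples_per_stratum)
        estimate[i] = sum(stratum_means) / n  # average over the N strata
    return estimate


estimate = stratified_shapley([0, 1, 2], value, samples_per_stratum=200)
print(estimate)
```

In this toy game the generator's marginal contribution is constant within each stratum, so its estimate carries no sampling error; this is the kind of within-stratum variance reduction that stratification is meant to exploit.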
3.2 Modified Sampling with Optimal Sample Allocation
In order to implement the stratified random sampling method, a procedure to determine the sample size of each stratum needs to be established. Castro et al. (2009) identified the true variance as a metric to allocate the samples among strata to minimize the estimation error, and proposed a two-stage Shapley value estimation algorithm with optimal sample allocation. In the first stage, 50% of the samples are evenly distributed to each stratum to obtain an initial estimated Shapley value and each stratum’s sample variance. In the second stage, the remaining 50% of the samples are optimally allocated to each stratum in proportion to their sample variances calculated in the first stage. The final estimated Shapley value is then calculated using the sampling results from both stages.
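We then recognize that $|\mathcal{S}_{il}| = \binom{N-1}{l-1}$, which means that evenly dividing the samples in the first stage could result in a sample size larger than the size of some coalitional strata: $m_{il} > |\mathcal{S}_{il}|$, especially when $l$ is close to $1$ or $N$. Applying random sampling to obtain $\hat{\phi}_{il}$ in these coalitional strata would take more time and produce less accurate results than directly calculating $\phi_{il}$:

$$\phi_{il} = \frac{1}{|\mathcal{S}_{il}|} \sum_{S \in \mathcal{S}_{il}} \big[ v(S \cup \{i\}) - v(S) \big] \tag{10}$$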
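The size argument can be checked with a short calculation. The sketch below lists the size of each coalitional stratum for one player and reports the positions whose stratum is smaller than an even first-stage share of the samples. The helper names are hypothetical, and treating the 50% first-stage budget as a per-player quantity split evenly over the $N$ strata is an assumption about how the budget is applied.

```python
from math import comb
from typing import List


def stratum_sizes(n: int) -> List[int]:
    """Number of coalitions of size l - 1 that exclude a given player, for l = 1..n."""
    return [comb(n - 1, l - 1) for l in range(1, n + 1)]


def strata_to_enumerate(n: int, samples_per_player: int) -> List[int]:
    """Positions l whose stratum is smaller than the even first-stage sample share."""
    even_share = (samples_per_player // 2) // n  # half the budget, split over n strata
    return [l for l, size in enumerate(stratum_sizes(n), start=1) if size < even_share]


print(stratum_sizes(4))
print(strata_to_enumerate(50, 250))
```

The strata sizes are symmetric and smallest at the first and last positions, which is why direct calculation pays off mainly there.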
Using (9) and (10), we modify the two-stage stratified random sampling method with optimal sample allocation to estimate the Shapley value. This modified method is detailed in Algorithm 1.
In Stage 1, in the case where the evenly distributed sample size is bigger than the stratum size, we compute the stratum’s precise mean marginal contribution $\phi_{il}$ using (10), and add the saved samples to Stage 2. This way, we can improve the accuracy of the estimation both by using the precise stratum marginal contribution, and by increasing the number of samples for optimal allocation.
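The two-stage procedure described above can be sketched as follows. This is one possible reading of the text, not a transcription of Algorithm 1: the minimum of two Stage 1 samples per stratum, the rounding of the Stage 2 allocation, the even fallback when all sample variances are zero, and the toy value function are all assumptions made for this example.

```python
import random
from itertools import combinations
from math import comb
from statistics import variance

NET = {0: 1.0, 1: 1.0, 2: -2.0, 3: 0.5, 4: -1.0, 5: 1.5}  # hypothetical (kWh)
P_BUY, P_SELL = 0.1681, 0.0485                             # GBP/kWh


def cost(members):
    net = sum(NET[i] for i in members)
    return P_BUY * max(net, 0.0) + P_SELL * min(net, 0.0)


def value(members):
    return sum(cost({i}) for i in members) - cost(members)


def marginal(value, i, members):
    s = frozenset(members)
    return value(s | {i}) - value(s)


def two_stage_shapley(players, value, samples_per_player, seed=0):
    rng = random.Random(seed)
    n = len(players)
    # Even Stage 1 share per stratum; at least 2 so a sample variance exists.
    share = max(2, (samples_per_player // 2) // n)
    estimate = {}
    for i in players:
        others = [j for j in players if j != i]
        exact, draws, used = {}, {}, 0
        # Stage 1: enumerate small strata exactly, sample the rest evenly.
        for size in range(n):
            count = comb(n - 1, size)
            if share > count:
                total = sum(marginal(value, i, s) for s in combinations(others, size))
                exact[size] = total / count
                used += count
            else:
                draws[size] = [marginal(value, i, rng.sample(others, size))
                               for _ in range(share)]
                used += share
        # Stage 2: spend what is left in proportion to the Stage 1 sample variances.
        remaining = max(samples_per_player - used, 0)
        variances = {size: variance(vals) for size, vals in draws.items()}
        total_var = sum(variances.values())
        for size, vals in draws.items():
            if total_var > 0:
                extra = int(remaining * variances[size] / total_var)
            else:
                extra = remaining // len(draws)
            vals.extend(marginal(value, i, rng.sample(others, size))
                        for _ in range(extra))
        means = dict(exact)
        means.update({size: sum(vals) / len(vals) for size, vals in draws.items()})
        estimate[i] = sum(means.values()) / n
    return estimate


small_budget = two_stage_shapley(list(NET), value, samples_per_player=60)
large_budget = two_stage_shapley(list(NET), value, samples_per_player=1000)
print(small_budget)
print(large_budget)
```

With the larger budget every stratum of this six-player toy game is small enough to be enumerated, so the result coincides with the exact Shapley value; with the smaller budget only the first and last strata are enumerated and the rest are sampled.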
4 Case Studies
In the following two case studies, we implement the proposed sampling method to estimate the Shapley value of our P2P cooperative game. In the first case study, we select a range of prosumer numbers so we can compare the computational time of the estimated Shapley value and the actual Shapley value, and evaluate the accuracy of the estimation. In the second case study, we scale up the size of the game to evaluate the payoffs to the prosumers based on their DER types.
Some of the model inputs are as follows. The domestic load data was measured in the Customer-Led Network Revolution trials (http://www.networkrevolution.co.uk/resources/projectlibrary). The model time frame is 24 hours starting from the midnight of a sunny summer day in July. The PV systems are 4 kW with a fixed 20 degree tilt, simulated in PVWatts (http://pvwatts.nrel.gov/pvwatts.php) using the London Gatwick solar data. The ES model has an energy capacity of 7 kWh, a maximum charge power of 3.5 kW, a maximum discharge power of 3.2 kW, charge and discharge efficiencies of 95%, an initial state of charge of 50%, and a state of charge range of 20–95%. The energy import price follows a UK Economy 7 residential rate structure: £0.072/kWh for midnight–7am, and £0.1681/kWh for 7am–midnight (https://www.gov.uk/government/statisticaldatasets/annualdomesticenergypricestatistics), and the energy export price is the UK feed-in tariff (https://www.gov.uk/feedintariffs/overview) fixed at £0.0485/kWh.
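For readers who want to reproduce the setup, the inputs above could be encoded as follows. The hourly resolution is an assumption, since the timestep length is not stated here, and the class and field names are hypothetical.

```python
from dataclasses import dataclass

HOURS = 24  # assumed hourly resolution over the 24-hour time frame

# Economy 7 import price (GBP/kWh): night rate from midnight to 7am, day rate after.
import_price = [0.072 if hour < 7 else 0.1681 for hour in range(HOURS)]
# Feed-in tariff export price (GBP/kWh), constant over the day.
export_price = [0.0485] * HOURS


@dataclass(frozen=True)
class StorageSpec:
    """ES parameters as listed in the model inputs."""
    capacity_kwh: float = 7.0
    max_charge_kw: float = 3.5
    max_discharge_kw: float = 3.2
    charge_efficiency: float = 0.95
    discharge_efficiency: float = 0.95
    initial_soc: float = 0.50
    min_soc: float = 0.20
    max_soc: float = 0.95


spec = StorageSpec()
print(import_price[6], import_price[7], spec)
```

The import price is never below the export price in any hour, which is the assumption the cost minimization relies on.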
4.1 Validation of Sampling-Based Shapley Estimation
In this case study, we fix the PV and ES adoption rates both at 50%, and both ownerships are randomly assigned independently of each other. In other words, each prosumer can have a PV system, or an ES system, or both, or neither. We apply a range of prosumer numbers to compare the computation time between the full Shapley value calculation and the Shapley value estimation with the proposed sampling method.
Three models are compared: 1) the full Shapley value calculation, 2) Shapley value estimation using the proposed sampling method with the larger of two sample sizes per player, and 3) Shapley value estimation using the proposed sampling method with the smaller sample size per player.
Table 1 shows the computation time of the three models, running on an Apple iMac with a 2.8 GHz Intel Core i5 processor and 16 GB of 1867 MHz DDR3 memory. We only show computation times that are under 10 hours as we consider any time above 10 hours to be impractical for this application. As predicted, the full Shapley value calculation is shown to be intractable. When the number of players exceeds 16, the sampling method significantly reduces the computation time, and with the same number of players the computation time is largely in proportion to the number of samples specified.
Table 1. Computation time of the three models.

No. players   8     12    16     20     30     50
full model    25    466   1E+4   N/A    N/A    N/A
samples/p     11    187   2E+3   6E+3   2E+4   N/A
samples/p     10    104   221    741    2E+3   2E+4
We then compare the model results for the 16-player game, the largest game for which the full model can be computed within a reasonable time. From Fig. 2 we can see that the Shapley value estimation accuracy with the larger number of samples per player is very high, whereas the estimation with the smaller number is slightly less accurate. This confirms that there is a trade-off between the computation time and the accuracy of the model when choosing the number of samples.
To understand this trade-off better as the number of prosumers increases further, we select a game of 30 prosumers and compare the estimated Shapley values obtained with the two sample sizes; the results are plotted in Fig. 3. Even though the computation time for the model with the larger sample size is about 10 times that of the model with the smaller sample size, see Table 1, the estimated Shapley values from the two models are very similar regardless of the type of resources owned by a prosumer. This gives us confidence in using a relatively low number of samples per player to estimate the Shapley value of larger P2P cooperative games.
4.2 Sampling-Based Shapley Value for Large Games
For a P2P cooperative game with 50 prosumers, we use 250 samples/player to ensure each coalitional stratum is sufficiently represented in the samples while keeping the computation within an acceptable time. First, we keep the number of PV and ES systems the same and vary their adoption rates together. Fig. 4 compares the estimated Shapley values by players’ DER ownership type, where each marker represents a prosumer’s estimated Shapley value. There are a few interesting observations. First, except for a few outliers, prosumers with the same DER ownership type are awarded similar Shapley values at any given overall DER adoption rate. Second, as the DER adoption rates change, there is a significant shift in the Shapley values: when the DER adoption rates are low, PV owners are awarded significantly higher Shapley values, likely because they provide cheaper energy to the coalitions, whereas when the DER adoption rates are high, pure consumers and ES owners are awarded higher Shapley values, likely because they absorb more local generation. Third, as the DER adoption rates increase, the average Shapley values by DER ownership type tend to converge despite the wider spread among the pure consumers and prosumers with only ES systems.
With the same 50 prosumers, we then pick out four typical prosumers with different DER ownership types, and run the P2P cooperative model under four different scenarios: 1) the PV adoption rate is fixed at 30%, and the ES adoption rate varies from 10% to 50%, 2) the PV adoption rate is fixed at 50%, and the ES adoption rate varies from 10% to 50%, 3) the ES adoption rate is fixed at 30%, and the PV adoption rate varies from 10% to 50%, and 4) PV and ES have the same adoption rate, which varies from 10% to 50%. Fig. 5 illustrates how the Shapley value changes with different DER adoption rates. Depending on the DER ownership type, the trend in the Shapley value as the DER adoption rates vary can be very different. For example, a consumer that does not own any PV or ES tends to be awarded more when the adoption rates for the PV and ES increase together, whereas a prosumer that owns both PV and ES displays the opposite trend. It is interesting to note that when the PV adoption rate is fixed, whether at 30% or 50%, varying the ES adoption rate has very little influence on the Shapley value regardless of the prosumer type. In contrast, whether the ES adoption rate is fixed at 30% or follows the PV adoption rate, varying the PV adoption rate has a significant impact on the Shapley value of all prosumer types.
It is worth noting that the main purpose of the case studies is to validate the scalability of the proposed sampling method applied in the P2P cooperative game. The specific results shown are dependent on the assumptions made about the PV, ES system specifications, and the energy prices. Further sensitivity analyses need to be conducted to generalize the results to other markets.
5 Conclusion
To improve the scalability of the P2P cooperative game (Han et al., 2018), this paper modifies a stratified random sampling method (Castro et al., 2017) to estimate the Shapley value. The maximum size of the game that can be computed in a reasonable time (under 10 hours) is thus increased from fewer than 20 players to 50 players. Through case studies, the estimation errors are shown to be very small. The proposed model is then run on a P2P cooperative game of 50 players to demonstrate patterns and trends in the Shapley value for different prosumer DER ownership types and with varying DER adoption rates. Future work includes sensitivity analyses on the PV and ES system inputs and electricity prices, and improving the sampling method to further scale up the size of the P2P cooperative game.
References
 Castro et al. (2017) Castro, J., Gómez, D., Molina, E., and Tejada, J. (2017). Improving polynomial estimation of the Shapley value by stratified random sampling with optimum allocation. Computers and Operations Research, 82, 180–188. doi:10.1016/j.cor.2017.01.019.
 Castro et al. (2009) Castro, J., Gómez, D., and Tejada, J. (2009). Polynomial calculation of the Shapley value based on sampling. Computers and Operations Research, 36(5), 1726–1730. doi:10.1016/j.cor.2008.04.004.
 Chapman et al. (2017) Chapman, A.C., Mhanna, S., and Verbič, G. (2017). Cooperative Game Theory for Nonlinear Pricing of Load-side Distribution Network Support. The 3rd IJCAI Algorithmic Game Theory Workshop.
 Han et al. (2018) Han, L., Morstyn, T., and McCulloch, M. (2018). Constructing prosumer coalitions for energy cost savings using cooperative game theory. In 2018 Power Systems Computation Conference (PSCC), 1–7. doi:10.23919/PSCC.2018.8443054.
 Han et al. (2019) Han, L., Morstyn, T., and McCulloch, M. (2019). Incentivizing prosumer coalitions with energy management using cooperative game theory. IEEE Transactions on Power Systems, 34(1), 303–313. doi:10.1109/TPWRS.2018.2858540.
 Jacobsen and Schroder (2012) Jacobsen, H.K. and Schroder, S.T. (2012). Curtailment of renewable generation: Economic optimality and incentives. Energy Policy, 49, 663–675. doi:10.1016/j.enpol.2012.07.004.
 Jia and Tong (2016) Jia, L. and Tong, L. (2016). Dynamic pricing and distributed energy management for demand response. IEEE Transactions on Smart Grid, 7(2), 1128–1136. doi:10.1109/TSG.2016.2515641.
 Morstyn et al. (2018) Morstyn, T., Hredzak, B., Aguilera, R.P., and Agelidis, V.G. (2018). Model predictive control for distributed microgrid battery energy storage systems. IEEE Transactions on Control Systems Technology, 26(3), 1107–1114. doi:10.1109/TCST.2017.2699159.
 Parag and Sovacool (2016) Parag, Y. and Sovacool, B.K. (2016). Electricity market design for the prosumer era. Nature Energy, (March), 16032. doi:10.1038/nenergy.2016.32.
 Sajjad et al. (2016) Sajjad, I.A., Chicco, G., and Napoli, R. (2016). Definitions of demand flexibility for aggregate residential loads. IEEE Transactions on Smart Grid, 7(6), 2633–2643. doi:10.1109/TSG.2016.2522961.
 Shapley (1971) Shapley, L.S. (1971). Cores of convex games. International Journal of Game Theory, 1(1), 11–26. doi:10.1007/BF01753431.
 Skea et al. (2007) Skea, J., Anderson, D., Green, T., Gross, R., Heptonstall, P., and Leach, M. (2007). Intermittent renewable generation and the cost of maintaining power system reliability. Generation, Transmission & Distribution, IET, 1(2), 324. doi:10.1049/ietgtd.
 Weber (1977) Weber, R. (1977). Probabilistic values for games. Cowles Foundation Discussion Papers 471R, Cowles Foundation for Research in Economics, Yale University.
 Zhou et al. (2018) Zhou, Y., Wu, J., and Long, C. (2018). Evaluation of peer-to-peer energy sharing mechanisms based on a multi-agent simulation framework. Applied Energy, 222(February), 993–1022. doi:10.1016/j.apenergy.2018.02.089.