## 1 Introduction

Cooperation among self-interested actors is a widespread phenomenon that bridges several otherwise disjoint disciplines Axelrod (1984); Ostrom (1990); Sigmund (2010); Bowles and Gintis (2011); Nowak and Highfield (2011). This seemingly paradoxical behavior, where individual and collective interests are in conflict, poses a major challenge for our century Pennisi (2005). Not surprisingly, an armada of scientists with different backgrounds is trying to identify the decisive factors that may explain the evolutionary success of altruistic choice Szabó and Fáth (2007); Archetti and Scheuring (2012); Fu and Wang (2008); Chen et al. (2013); Perc and Szolnoki (2010); Chen et al. (2008); Takesue et al. (2017); Wu et al. (2017); Perc et al. (2017). Although a clear and intuitive taxonomy of potential cooperation-supporting mechanisms was already given by Martin Nowak Nowak (2006), this framework is best considered an inspiring starting point for subsequent research efforts.

One of the main research paths explores the diverse consequences that different strategy updating rules may have on the evolution of competing strategies Ohtsuki and Nowak (2006); Roca et al. (2009); Zukewich et al. (2013); Hindersin and Traulsen (2015). When imitation is used, which is the most frequently applied strategy updating rule Szabó and Tőke (1998), the accompanying individual features, such as strategy learning or teaching capacity, can be a decisive factor if we assume a heterogeneous population where actors may differ from each other Szolnoki and Szabó (2007). Even among diverse actors, however, varying learning capacities and varying teaching capacities have very different consequences. While diverse learning activity plays no particular role in the cooperation level, unequal teaching activity allows network reciprocity to be augmented in a way similar to that observed previously for strongly heterogeneous interaction graphs Santos and Pacheco (2005). More precisely, a player with a large strategy teaching capacity is able to impose her strategy on her local neighborhood Szolnoki et al. (2008). In this way locally coordinated homogeneous spots emerge, which reveal the advantage of mutual cooperation. It is crucial to stress, however, that the benefit of individual strategy teaching capacity is only visible if players are heterogeneous, and disappears if players are uniform and bear identical dynamical features. In the latter case the final evolutionary outcome is independent of the particular value of the uniformly applied teaching or learning activity.

In this work we explore whether the individual dynamical features of actors, like strategy learning or strategy teaching capacities, play any role in the evolution of cooperation when there are different ways to update their strategies. For this purpose we assume that both imitation and Death-Birth (DB) updating rules are available, and we also suppose that the mentioned dynamical features of players may change individually Ohtsuki and Nowak (2006). In particular, we assume that players may have lower or higher strategy learning capacities, which may change during the evolutionary process. In an alternative setup we assume distinct strategy teaching activities and clarify whether they affect the evolutionary outcome when both imitation and Death-Birth rules are present. We stress that both updating protocols require information only about the local neighborhood, hence using one in place of the other does not cause fundamental differences. This would not hold for updating rules that use global information about the entire population. For example, when Birth-Death (BD) updating is used, a player is chosen for reproduction from the entire population with a probability proportional to fitness. Similarly, global information, namely the average level of fitness, is necessary when replicator dynamics is applied. To avoid incomparable features of updating protocols we only use the imitation and DB rules.

We note that the simultaneous use of different strategy updating rules can be introduced in two fundamentally distinct ways. According to the first scenario, which is conceptually similar to annealed randomness, a player may either use imitation or be subject to a Death-Birth process, and each updating protocol is applied with a specific probability. In the other case, which resembles quenched randomness, a player uses one of the mentioned updating rules exclusively, but the fraction of players who belong to each set is well-defined. We note that an adaptive population whose players play a skill game was studied in Javarone (2015). In the following we will show that the way we mix the updating rules can be a decisive factor for the resulting cooperation level. Interestingly, the level of strategy learning capacity is a more important factor than the strength of teaching capacity, which contrasts with our previous experience when only a single strategy updating protocol was present Szolnoki and Szabó (2007).

The remainder of this paper is organized as follows. In the next Section we describe the model and survey the possible versions of the proposed evolutionary games. Section 3 is devoted to the presentation of our observations. Finally, we summarize the main conclusions and discuss their potential implications in Section 4.

## 2 Model and Method

We consider a population of individuals who play the so-called weak prisoner’s dilemma game on a graph Nowak and May (1992). In this simplified version, which still captures the essence of a social dilemma, we only have a single parameter that characterizes the strength of the dilemma. Initially each player on site is designated as either a cooperator (C) or a defector (D) with equal probability. While mutual cooperation yields the reward to both cooperator players, mutual defection results in zero payoff to the partners. The same zero payoff goes to a cooperator who interacts with a defector, while the latter collects the temptation value, which establishes that defection is the preferred individual choice.

For simplicity we present results obtained on a square lattice with periodic boundary conditions, but we stress that qualitatively similar behavior is found on other topologies, including regular and heterogeneous random graphs Watts and Strogatz (1998); Szabó et al. (2004). For the applied interaction topology, the payoff of a player is accumulated from her pair interactions with all of her neighbors.
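The payoff accumulation described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the temptation is called `T`, the reward is set to 1 (the usual weak prisoner's dilemma convention), strategies are stored as 1 for a cooperator and 0 for a defector, and each site is assumed to have four nearest neighbors.

```python
import numpy as np

def accumulated_payoffs(strategy, T):
    """Accumulated weak prisoner's dilemma payoff of every site.

    strategy: 2D integer array, 1 = cooperator, 0 = defector (assumed encoding).
    T: temptation value (assumed symbol); reward is 1, all other payoffs are 0.
    """
    payoff = np.zeros(strategy.shape, dtype=float)
    for axis in (0, 1):
        for shift in (-1, 1):
            # np.roll wraps around the edges, i.e. periodic boundary conditions
            neighbor = np.roll(strategy, shift, axis=axis)
            # a cooperator earns 1 from a cooperating neighbor,
            # a defector earns T from a cooperating neighbor, otherwise 0
            payoff += np.where(strategy == 1, 1.0 * neighbor, T * neighbor)
    return payoff
```

For a lone defector in a sea of cooperators this gives the defector 4T, while each of its four cooperating neighbors collects 3.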

In every Monte Carlo step each player has, on average, one chance to update her strategy. During the strategy updating protocol we use the Death-Birth process with probability and the imitation rule with probability . In the former case a randomly chosen individual is removed and her neighbors compete for the empty site with a probability proportional to their fitness. In the alternative case, which happens with probability , the imitation rule is applied. Accordingly, the randomly selected player , having strategy , imitates the strategy of a neighboring player with a probability . Here denotes the accumulated payoffs of both players, while parameter quantifies the uncertainty of strategy adoptions Szabó and Tőke (1998). To obtain results comparable to previous studies we apply , but our observations remain intact over a wide range of noise values.
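A minimal sketch of one elementary step of this annealed mixture is given below. Several names are assumptions made for illustration only: `q` for the Death-Birth probability, `K` for the noise parameter, the Fermi function as the pairwise imitation probability (the form commonly used with the rule of Szabó and Tőke (1998)), and fitness taken equal to the accumulated payoff.

```python
import math
import random

def imitation_probability(payoff_x, payoff_y, K):
    """Fermi-type probability that player x adopts the strategy of neighbor y (assumed form)."""
    z = (payoff_x - payoff_y) / K
    if z > 700.0:  # guard: math.exp would overflow, the probability is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(z))

def death_birth_choice(neighbor_payoffs, rng):
    """Index of the neighbor that fills the vacated site, chosen proportionally to fitness."""
    if sum(neighbor_payoffs) <= 0.0:
        # all fitness values are zero: fall back to a uniform choice
        return rng.randrange(len(neighbor_payoffs))
    return rng.choices(range(len(neighbor_payoffs)), weights=neighbor_payoffs)[0]

def annealed_update(s_x, payoff_x, neighbor_strategies, neighbor_payoffs, q, K, rng):
    """Return the new strategy of the focal site after one elementary step."""
    if rng.random() < q:
        # Death-Birth: the focal player is removed and a neighbor takes over the site
        winner = death_birth_choice(neighbor_payoffs, rng)
        return neighbor_strategies[winner]
    # imitation: compare with one randomly selected neighbor
    j = rng.randrange(len(neighbor_strategies))
    if rng.random() < imitation_probability(payoff_x, neighbor_payoffs[j], K):
        return neighbor_strategies[j]
    return s_x
```

Both branches only read the strategies and payoffs of the focal player's neighbors, which reflects the local character of the two rules.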

It is crucial to note that the above-mentioned strategy updating rules use local selection, which is a fundamental feature when a spatially structured population is considered. In other words, we do not need global information, i.e. the exact states of all players in the whole population, when a microscopic strategy update is executed.

The above specified mixture of strategy updating rules resembles annealed randomness in statistical physics Landau and Binder (2000). We also introduce an alternative way to apply the Death-Birth and imitation rules simultaneously. In this case, which is conceptually similar to quenched randomness in solid-state physics, a specific player always uses one of these rules. The fraction of sites where Death-Birth is applied is , while the remaining portion of the population uses imitation to update their states.
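The quenched variant can be illustrated by fixing the rule of every site once, before the evolution starts. The sketch below assumes a lattice of linear size `L` and a Death-Birth fraction called `q`; the function name and the use of a fixed seed are hypothetical choices made for the example.

```python
import numpy as np

def assign_quenched_rules(L, q, seed=0):
    """Boolean L x L mask: True where a site permanently uses Death-Birth updating.

    Exactly round(q * L * L) randomly chosen sites are marked; all remaining
    sites use imitation for the whole run (quenched disorder).
    """
    rng = np.random.default_rng(seed)
    n_sites = L * L
    n_db = int(round(q * n_sites))
    uses_db = np.zeros(n_sites, dtype=bool)
    # draw the Death-Birth sites without replacement so the fraction is exact
    uses_db[rng.choice(n_sites, size=n_db, replace=False)] = True
    return uses_db.reshape(L, L)
```

In contrast to the annealed case, the mask is never redrawn, so a given site is updated by the same rule at every step.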

Since our principal interest is to clarify whether an individual dynamical feature influences the final outcome, we also introduce a trait that determines the success of the imitation process locally. For example, we can assume that players have different abilities to learn from their neighbors, which can be described by a parameter. Consequently, when a player imitates a player then we assume a modified imitation probability that is . For simplicity, we assume that two different values, and , are available in the initial state and that these individual learning capacities are also adopted during the imitation process. The key questions are whether different individual traits coexist in the stationary state and how the specific value of influences the cooperation level. We note that the value of has no significance in the classical model, where all players use the same value and the individual state can be changed via imitation only.

Evidently, individual strategy teaching capacity can also be introduced. In this case when a player imitates a player then we assume a modified imitation probability that is . Again, for simplicity, we assume that two different values, and , are available in the initial state and these individual teaching capacities are also adopted during the imitation process.
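To make the two traits concrete, the sketch below assumes that the trait enters as a multiplicative prefactor of a Fermi-type imitation probability, as in earlier work on teaching activity; the exact expression used here is not recoverable from the text, so the prefactor form and the names `w_x`, `w_y` and `K` are assumptions. The learner's own weight is used for learning capacity and the neighbor's weight for teaching capacity.

```python
import math

def fermi(payoff_x, payoff_y, K):
    """Unweighted probability that x adopts the strategy of y (assumed Fermi form)."""
    z = (payoff_x - payoff_y) / K
    if z > 700.0:  # avoid overflow in math.exp; the probability is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(z))

def learning_weighted_probability(w_x, payoff_x, payoff_y, K):
    """Imitation probability scaled by the learning capacity w_x of the imitating player x."""
    return w_x * fermi(payoff_x, payoff_y, K)

def teaching_weighted_probability(w_y, payoff_x, payoff_y, K):
    """Imitation probability scaled by the teaching capacity w_y of the imitated player y."""
    return w_y * fermi(payoff_x, payoff_y, K)

def imitate(state_y):
    """On a successful imitation the (strategy, trait) pair of y is copied as a whole."""
    strategy_y, trait_y = state_y
    return (strategy_y, trait_y)
```

Copying the trait together with the strategy is what lets the four microscopic states compete for space, instead of the traits staying frozen at their initial positions.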

To summarize the model definition, we study four fundamentally different setups. In the first case we assume cooperator and defector players who may have different learning capacities, and the microscopic updating process is executed via an annealed mixture of Death-Birth and imitation rules. In the second case we use a quenched mixture of updating rules for the same players. Thirdly, we assume players who have different teaching activities with an annealed mixture of updating rules. Finally, the players with heterogeneous teaching capacities are updated by a quenched mixture of microscopic protocols.

## 3 Results

We start with the presentation of key findings for the first two cases. These results are summarized in the phase diagrams plotted in Fig. 1. The control parameters are the temptation value and the probability that characterizes the weight of Death-Birth update in the microscopic dynamics. As we noted, here four different microscopic states compete for space during the coevolutionary process. They are cooperators with low () and high () learning capacities and defectors with low () and high () imitation skill.

It is a common feature of both phase diagrams that players with high learning activity cannot survive in the stationary states. The only exception is at high – low values where defectors die out first, leaving behind cooperator players with different learning capacities. We marked this uniform-strategy state by “” where the learning capacity of players becomes irrelevant. The system arrives at a similar uniform-strategy state at low – high parameter values where cooperators die out first and defectors with different learning skills remain alive (this state is marked by “” on the diagrams).

When the updating rules are mixed in an annealed way, shown in the left panel of Fig. 1, then the positive effect of the Death-Birth updating rule can be observed already at small values. Here the full cooperator state can be easily reached even at relatively high values. Interestingly, marks a phase where players with high learning capacities die out first, followed by players, and finally players prevail exclusively. If we increase the temptation at a fixed value then the system ends in a mixed state where and players coexist. Nevertheless, the most striking feature of the diagram is that the influence of the Death-Birth process emerges very early, even at small values, if we initially allow players with different learning capacities to be present.

The above described effect is completely missing if we mix the updating rules in a quenched way. This case is summarized in the right panel of Fig. 1. Here the positive consequence of the presence of Death-Birth rule emerges only at high values where those players who follow this protocol percolate, hence they can support each other mutually.

To gain a deeper insight into the microscopic mechanisms that govern the pattern formation we present a characteristic spatial evolution of the four competing states for both cases. In particular, we show the case of an annealed mixture of updating rules, where some representative snapshots are plotted in Fig. 2. The first panel of Fig. 2 shows a prepared initial state where we separated players with different learning capacities on different sides of the available space. More precisely, players having high learning capacities () are placed on the left side. They are defectors (dark red) and cooperators (dark blue). Initially, on the right side are those who have low, , learning capacities. Here defectors are denoted by light red and cooperators by light blue. When the evolution is launched the two subsystems evolve very differently. While players invade players efficiently, resulting in a homogeneous dark red domain, and players coexist on the right side. Simultaneously, light blue players start invading opponents, as shown in panel (b). This results in the extinction of the state, shown in panel (c), but the homogeneous domain is unstable due to the high and low value. The final stable state is shown in panel (d), where low learning activity defectors and cooperators coexist. (To monitor the whole evolution we provided an animation that can be seen at https://doi.org/10.6084/m9.figshare.6097046.) This example illustrates nicely that the individual dynamical feature, viz. learning capacity, can play a decisive role in the stationary state even if it is uniform for all players. The important condition is the presence of multiple updating rules: while cooperators cannot survive when players possess high learning capacities, they coexist with defectors if is low, even though we applied a significantly strong temptation value.

The above described mechanism cannot work for a quenched mixture of updating rules because the positive consequence of the Death-Birth rule is ineffective when the sites which use this protocol are rare. If they cannot percolate then they are surrounded by sites where only imitation is used. The latter updating rule provides only a modest cooperation level and cooperators die out at every reasonable value. It simply means that only defectors will compete for the empty place where the Death-Birth rule is used, hence the total failure of cooperation is inevitable. This situation can only change when Death-Birth places are dense enough to percolate. Above the percolation threshold Landau and Binder (2000) these sites become neighboring and their neighborhood need not follow the evolutionary trajectory dictated by pure imitation dynamics. In agreement with this argument, the right panel of Fig. 1 illustrates nicely that cooperation can only be maintained in the high regime.
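The percolation argument can be checked numerically with a simple spanning-cluster test on the mask of Death-Birth sites. The sketch below is an illustration only: it assumes nearest-neighbor connectivity, treats the lattice as periodic in the horizontal direction, and asks whether the marked sites connect the top row to the bottom row. For random site occupation on the square lattice with nearest-neighbor connectivity the threshold is known to be roughly 0.593; whether this is exactly the relevant threshold for the present model is an assumption.

```python
from collections import deque
import numpy as np

def has_spanning_cluster(occupied):
    """True if occupied sites connect the top row to the bottom row.

    occupied: 2D boolean array (e.g. the mask of Death-Birth sites).
    Connectivity is nearest-neighbor; columns wrap around periodically.
    """
    n_rows, n_cols = occupied.shape
    seen = np.zeros(occupied.shape, dtype=bool)
    # start a breadth-first search from every occupied site of the top row
    queue = deque((0, c) for c in range(n_cols) if occupied[0, c])
    for r, c in queue:
        seen[r, c] = True
    while queue:
        r, c = queue.popleft()
        if r == n_rows - 1:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, (c + dc) % n_cols
            if 0 <= nr < n_rows and occupied[nr, nc] and not seen[nr, nc]:
                seen[nr, nc] = True
                queue.append((nr, nc))
    return False
```

Averaging the result over many random masks at a given density gives an estimate of the spanning probability, which rises sharply near the threshold on large lattices.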

Put differently, the quenched mixture of updating rules does not provide any synergistic effect and practically we have two “subsystems” where either the imitation or the Death-Birth-type updating rule is functioning. (We should not forget that the latter functions properly only at high values.) Since we practically have “subsystems” using a single updating rule, the individual dynamical feature becomes unimportant again. As we already mentioned in the introduction, in the traditional model, where only imitation is used, the actual value of plays no role if all actors possess the same trait Szolnoki and Szabó (2007). According to this argument we should obtain the same evolutionary outcome independently of the applied value if a quenched mixture of updating rules is applied. This conjecture is nicely confirmed in Fig. 3, where we plotted the cooperation level as a function of the lower teaching activity both for annealed (left) and quenched (right) randomness. As we previously stressed, in the case of annealed randomness this dynamical feature has an important role in the final outcome, and by changing only this parameter we can span from a full defection to a full cooperation state. Furthermore, this sensitivity to the value remains valid independently of the applied value. But these features, as we argued above, disappear for the quenched mixing case. In the latter case only the value of counts. This is illustrated nicely in the right panel, where we obtained a higher cooperation level at a higher temptation value for larger .

In the rest of this paper we discuss the cases when heterogeneous strategy teaching activity is assumed. This dynamical feature proved to be relevant in previous studies Szolnoki and Szabó (2007); Szolnoki et al. (2008); Szolnoki and Perc (2008). Now, similarly to the learning cases, four different microscopic states compete for space during the coevolutionary process. They are cooperators with low () and high () teaching capacities and defectors with low () and high () convincing skill. The key observations are summarized in Fig. 4 for both cases.

In the case of an annealed mixture of updating rules, shown in the left panel, we can see that the previously reported spread of the full cooperator state on the plane is limited. There is still a full cooperator state in the high – low corner, which is composed of and players, but the majority of the parameter space is dominated by the full defector phase. Between them, in the mixed phase, cooperators and defectors with high teaching ability coexist.

Based on our previous experience with learning activities, the case of quenched mixture, plotted in the right panel of Fig. 4, is less surprising. As previously, we can detect nonzero cooperation level only at high and low values.

To better understand the microscopic mechanisms responsible for these outcomes we present a series of snapshots obtained from an evolutionary process of annealed mixing in Fig. 5. As in the previous demonstration, we again use a prepared initial state where players with high teaching activity are distributed randomly on the left side of panel (a). They are (dark red) and (dark blue) players. On the right side of the starting panel players with low teaching capacities are present. They are (light blue) and (light red) players. Starting the evolution from these random states we can observe that each cooperator state coexists with its own defector partner at these parameter values. Interestingly, the combination of provides a larger cooperation level than the combination of states. This is visible in panel (b), where the former domain is almost blue while the latter is mostly red. This is the manifestation of the dynamic-sensitive cooperation we already reported when suppressed learning activities were used.

Rather unexpectedly, however, the solution with lower general cooperation level is more stable and gradually invades the other solution. Technically, it happens via the invasion of state which beats players who will die out first, as illustrated in panel (c). The remaining spots are invaded by players and the system finally evolves into a state where and players coexist, giving a modest cooperation level. (The whole evolution can be followed in the animation we provided at https://doi.org/10.6084/m9.figshare.6097223.)

The above presented pattern formation explains why we obtain significantly less average cooperation in the left panel of Fig. 4 compared to the left panel of Fig. 1. Although a higher cooperation level would be available by using players with lower teaching ability, they are vulnerable against those who have higher teaching activity. This results in the latter group’s victory with a moderate cooperation level.

The comparison of the right panels of Fig. 1 and Fig. 4 illustrates nicely our previous conclusion drawn from Fig. 3. More precisely, the mentioned phase diagrams for quenched randomness are practically identical, highlighting that the individual values of dynamical features have no relevant importance for the final outcome when updating rules are mixed in a quenched way. In the latter case the only decisive factor is the fraction of those sites where the Death-Birth rule is applied: if this portion is above the percolation threshold then cooperators can survive at not too large values. Otherwise, when this portion is below the threshold, the system behavior is practically identical to the classic model where imitation is used by all players Szabó et al. (2005).

## 4 Discussion

In this work we have explored the possible impact of multiple strategy updating rules on the cooperation level when players with different individual dynamical traits are present. The latter can be strategy learning or teaching capacities which determine the success of a microscopic imitation process. In a uniform system, where all players possess the same trait, these dynamical features have no relevant impact on the stationary state that is obtained as the final destination of an evolutionary process. Our work highlighted that this picture is inaccurate when different updating rules are present because the cooperation level may depend sensitively on the dynamical details. Interestingly, the individual learning skill of players is more important than the strategy teaching capacity, and by varying the value of the former parameter we can reach a full cooperator state even where payoff values would otherwise dictate a full defector state.

We also pointed out that the way the updating rules are mixed is important. When a player can apply both the imitation and the Death-Birth rule via an annealed-like mixing then the above-mentioned synergy can be observed. But this phenomenon is completely missing when the simultaneous presence of updating rules is realized in a way where different players use different rules permanently. The latter mix resembles quenched randomness, where we observe the sum of simpler subsystems each using a single updating rule. As a consequence, the system behavior is very similar to that seen in traditional single-rule models. We stress that our observations about structured populations are robust and remain valid for non-regular interaction graphs as well.

It is worth noting that the simultaneous application of different updating rules within a single system is a recently opened direction Danku et al. (2018); Amaral and Javarone (2018) which may help to settle the long-standing debate over which rule is evolutionarily viable and which one captures the key element of a realistic system Rand et al. (2011); Gracia-Lázaro et al. (2012); Rand et al. (2014). Our work indicates that several microscopic details can be important, while other previously decisive conditions become irrelevant, when multiple rules govern the evolution. We hope that our work will be a stimulating step for future research along this path.

This research was supported by the Hungarian National Research Fund (Grant K-120785).

## References

- Axelrod (1984) R. Axelrod, The Evolution of Cooperation, Basic Books, New York, 1984.
- Ostrom (1990) E. Ostrom, Governing the commons: The Evolution of Institutions for Collective Action, Cambridge University Press, Cambridge, U.K., 1990.
- Sigmund (2010) K. Sigmund, The Calculus of Selfishness, Princeton University Press, Princeton, NJ, 2010.
- Bowles and Gintis (2011) S. Bowles, H. Gintis, A Cooperative Species: Human Reciprocity and Its Evolution, Princeton University Press, Princeton, NJ, 2011.
- Nowak and Highfield (2011) M. A. Nowak, R. Highfield, SuperCooperators: Altruism, Evolution, and Why We Need Each Other to Succeed, Free Press, New York, 2011.
- Pennisi (2005) E. Pennisi, How did cooperative behavior evolve, Science 309 (2005) 93–93.
- Szabó and Fáth (2007) G. Szabó, G. Fáth, Evolutionary games on graphs, Phys. Rep. 446 (2007) 97–216.
- Archetti and Scheuring (2012) M. Archetti, I. Scheuring, Review: Game Theory of Public Goods in One-Shot Social Dilemmas without Assortment, J. Theor. Biol. 299 (2012) 9–20.
- Fu and Wang (2008) F. Fu, L. Wang, Coevolutionary dynamics of opinions and networks: From diversity to uniformity, Phys. Rev. E 78 (2008) 016104.
- Chen et al. (2013) X. Chen, T. Gross, U. Dieckmann, Shared rewarding overcomes defection traps in generalized volunteer’s dilemmas, J. Theor. Biol. 335 (2013) 13–21.
- Perc and Szolnoki (2010) M. Perc, A. Szolnoki, Coevolutionary games – a mini review, BioSystems 99 (2010) 109–125.
- Chen et al. (2008) X.-J. Chen, F. Fu, L. Wang, Interaction stochasticity supports cooperation in spatial Prisoner’s dilemma, Phys. Rev. E 78 (2008) 051120.
- Takesue et al. (2017) H. Takesue, A. Ozawa, S. Morikawa, Evolution of favoritism and group fairness in a co-evolving three-person ultimatum game, EPL 118 (2017) 48002.
- Wu et al. (2017) T. Wu, L. Wang, F. Fu, Coevolutionary dynamics of phenotypic diversity and contingent cooperation, PLoS Comput. Biol. 13 (2017) e1005363.
- Perc et al. (2017) M. Perc, J. J. Jordan, D. G. Rand, Z. Wang, S. Boccaletti, A. Szolnoki, Statistical physics of human cooperation, Phys. Rep. 687 (2017) 1–51.
- Nowak (2006) M. A. Nowak, Five Rules for the Evolution of Cooperation, Science 314 (2006) 1560–1563.
- Ohtsuki and Nowak (2006) H. Ohtsuki, M. A. Nowak, The replicator equation on graphs, J. Theor. Biol. 243 (2006) 86–97.
- Javarone (2015) M. A. Javarone, Poker as a skill game: rational versus irrational behaviors, J. Stat. Mech. 2015 (2015) P03018.
- Roca et al. (2009) C. P. Roca, J. A. Cuesta, A. Sánchez, Evolutionary game theory: Temporal and spatial effects beyond replicator dynamics, Phys. Life Rev. 6 (2009) 208–249.
- Zukewich et al. (2013) J. Zukewich, V. Kurella, M. Doebeli, C. Hauert, Consolidating Birth-Death and Death-Birth Processes in Structured Populations, PLoS ONE 8 (2013) e54639.
- Hindersin and Traulsen (2015) L. Hindersin, A. Traulsen, Most Undirected Random Graphs Are Amplifiers of Selection for Birth-Death Dynamics, but Suppressors of Selection for Death-Birth Dynamics, PLoS Comput. Biol. 11 (2015) e1004437.
- Szabó and Tőke (1998) G. Szabó, C. Tőke, Evolutionary prisoner’s dilemma game on a square lattice, Phys. Rev. E 58 (1998) 69–73.
- Szolnoki and Szabó (2007) A. Szolnoki, G. Szabó, Cooperation enhanced by inhomogeneous activity of teaching for evolutionary Prisoner’s Dilemma games, EPL 77 (2007) 30004.
- Santos and Pacheco (2005) F. C. Santos, J. M. Pacheco, Scale-free networks provide a unifying framework for the emergence of cooperation, Phys. Rev. Lett. 95 (2005) 098104.
- Szolnoki et al. (2008) A. Szolnoki, M. Perc, G. Szabó, Diversity of reproduction rate support cooperation in the prisoner’s dilemma game in complex networks, Eur. Phys. J. B 61 (2008) 505–509.
- Nowak and May (1992) M. A. Nowak, R. M. May, Evolutionary Games and Spatial Chaos, Nature 359 (1992) 826–829.
- Watts and Strogatz (1998) D. J. Watts, S. H. Strogatz, Collective dynamics of ’small world’ networks, Nature 393 (1998) 440–442.
- Szabó et al. (2004) G. Szabó, A. Szolnoki, R. Izsák, Rock-scissors-paper game on regular small-world networks, J. Phys. A: Math. Gen. 37 (2004) 2599–2609.
- Landau and Binder (2000) D. Landau, K. Binder, A Guide to Monte Carlo Simulations in Statistical Physics, Cambridge University Press, Cambridge, 2000.
- () https://doi.org/10.6084/m9.figshare.6097046
- Szolnoki and Perc (2008) A. Szolnoki, M. Perc, Coevolution of teaching activity promotes cooperation, New J. Phys. 10 (2008) 043036.
- () https://doi.org/10.6084/m9.figshare.6097223
- Szabó et al. (2005) G. Szabó, J. Vukov, A. Szolnoki, Phase diagrams for an evolutionary prisoner’s dilemma game on two-dimensional lattices, Phys. Rev. E 72 (2005) 047107.
- Danku et al. (2018) Z. Danku, Z. Wang, A. Szolnoki, Imitate or innovate: Competition of strategy updating attitudes in spatial social dilemma games, EPL 121 (2018) 18002.
- Amaral and Javarone (2018) M. A. Amaral, M. A. Javarone, Heterogeneous update mechanisms in evolutionary games: mixing innovative and imitative dynamics, Phys. Rev. E 97 (2018) 042305.
- Rand et al. (2011) D. G. Rand, S. Arbesman, N. A. Christakis, Dynamic social networks promote cooperation in experiments with humans, Proc. Natl. Acad. Sci. USA 108 (2011) 19193–19198.
- Gracia-Lázaro et al. (2012) C. Gracia-Lázaro, J. Cuesta, A. Sánchez, Y. Moreno, Human behavior in Prisoner’s Dilemma experiments suppresses network reciprocity, Sci. Rep. 2 (2012) 325.
- Rand et al. (2014) D. G. Rand, M. A. Nowak, J. H. Fowler, N. A. Christakis, Static network structure can stabilize human cooperation, Proc. Natl. Acad. Sci. USA 111 (2014) 17093–17098.
