
# Evolution of Preferences in Multiple Populations

We study the evolution of preferences and the behavioral outcomes in an n-population setting. Each player has subjective preferences over potential outcomes, and chooses a best response based on his preferences and the information about the opponents' preferences. However, players' actual fitnesses are defined by material payoff functions. Players can observe their opponents' preferences with some fixed probability p. We derive necessary and sufficient conditions for stability for p=1 and p=0. We also check the robustness of these results against small perturbations in p for the case of pure-strategy outcomes.

10/30/2018


## 1. Introduction

The indirect evolutionary approach is a model for studying the evolution of preferences. In this setting, players choose strategies to maximize their subjective preferences rather than playing pre-programmed strategies, but they receive the actual fitnesses defined by material payoff functions which may be distinct from their preferences. Eventually, evolutionary selection is driven by differences in fitness values. We can think that evolutionary processes shape behavior through the effects on players’ preferences.

This evolutionary approach can be used to explain how behavior appearing inconsistent with material self-interest, such as altruism, vengeance, punishment, fairness, and reciprocity, may be evolutionarily stable. [Footnote 1: See, for example, Güth and Yaari (1992), Güth (1995), Bester and Güth (1998), Huck and Oechssler (1999), and Ostrom (2000).] In the indirect evolutionary approach literature, almost every concept of static stability is built on a symmetric two-player game played by a single population of players without identifying their positions. [Footnote 2: Although von Widekind (2008, p. 61) gives a definition of stability for two-population models, he only shows some illustrative examples rather than a general study.] However, it is a common phenomenon that players know exactly what their roles are in strategic interactions. For example, they may be males and females, buyers and sellers, employers and employees, or parents and their children. In this paper, we investigate the case of separate populations within the framework of the indirect evolutionary approach: players drawn from different populations may have different action sets and different material payoff functions; every player knows his position and has personal preferences over potential outcomes.

In a standard evolutionary game theoretic model where players are programmed to adopt some strategies, there are two quite different ways of extending the definition of an evolutionarily stable strategy (ESS) from a single-population setting to a multi-population setting. [Footnote 3: Maynard Smith and Price (1973) introduced the concept of an ESS for a symmetric two-player game.] The multi-population stability criterion suggested by Taylor (1979) is based on average fitness values aggregated over all player positions. Such a criterion may be particularly appropriate for coevolutionary games. Cressman (1992) introduces a seemingly weaker criterion for multi-population evolutionary stability: the stability is ensured if new entrants earn less in at least one population. Indeed, it can be shown that both criteria are equivalent to the following one: for any $n$-player game, a strategy profile is evolutionarily stable if and only if it is a strict Nash equilibrium. [Footnote 4: Selten (1980) applies the ESS concept to asymmetric two-player games, for which each individual is randomly assigned a player role. It turns out that a strategy in the symmetrized game is an ESS if and only if the associated strategy pair is a strict Nash equilibrium of the asymmetric game. Thus, those two-species definitions followed from Taylor (1979) and Cressman (1992) are all equivalent to this role-conditioned single-population definition; see Swinkels (1992) and Weibull (1995, p. 167).] Therefore, the extension of the ESS concept to multiple populations is quite restrictive. However, unlike those of evolutionarily stable strategy profiles, we show that the properties of multi-population stability underlying the indirect evolutionary approach with complete information will depend on which stability concept is adopted, and on how many populations are considered. Some of the arguments we discuss in this paper will help us understand how multiple populations interact with one another in such an environment.

Dekel et al. (2007) use the indirect evolutionary approach to study endogenous preferences in a single-population setting. This study offers two methodological contributions to the work on the evolution of preferences, namely that all possible preferences are allowed, and that various degrees of observability are considered. [Footnote 5: Samuelson (2001) regards the indirect evolutionary approach as incomplete, since only a few possible preferences are considered for applications in some special games, and those new results always rely on the assumption that preferences are perfectly observable.] To extend the static stability criterion based on the indirect evolutionary approach for multi-population interactions, we apply the concept of a two-species ESS introduced in Cressman (1992) to the model of Dekel et al. (2007). Remarkably, the objective game, whose entries represent the actual fitnesses, may be either symmetric or asymmetric when focusing on the multi-population cases. Even if the objective game is symmetric, the stable outcomes may still be different because, unlike interactions in a single-population setting, here a preference type never plays against himself.

We suppose that there are $n$ large populations, which may be polymorphic, meaning that not all individuals in a population have the same preferences. In any match of players drawn from the separate populations, one for each player position, an equilibrium is played to maximize their preferences based on the information about the opponents. If one preference type receives the highest average fitness in the population, this type will prevail, that is, the population evolves. Naturally, the stability criterion is developed for a configuration, which consists of a distribution of preferences in populations and an equilibrium determining what strategies should be adopted. If a configuration is resistant to invasion by rare mutants, it should have the following characteristics: after the introduction of a mutant profile, no incumbent is wiped out, and the post-entry equilibrium behavior gets arbitrarily close to the pre-entry one if the population shares of the mutants are sufficiently small.

Our multi-population stability is defined for any degree of observability, as in Dekel et al. (2007), with which equilibrium behavior is definitely determined. We begin by studying two extreme cases: in one each player can observe the opponents’ preference types, and in the other each player knows only the distribution of opponents’ types. We then consider intermediate cases to investigate the robustness of the results of the two extreme case studies. Because all possible preferences are considered and a lot of preference relations may induce the same best-response correspondence, we are interested in the evolutionarily viable outcome rather than the emergence of one particular preference type.

Under the assumption that preferences are observable, the key feature of the indirect evolutionary approach is that players can adjust their strategies according to specific opponents. Since we allow for all possible preferences to compete, such adjustments made based on preferences can lead to Pareto improvements in fitness outcomes: an inefficient outcome will be destabilized by entrants having the “secret handshake” flavour, which refers to the appropriate mutants playing the inefficient outcome when matched against the incumbents and attaining a more efficient outcome when matched against themselves. [Footnote 6: Robson (1990) demonstrates that any inefficient ESS can be destroyed by the so-called “secret handshake” mutant.] Therefore, it is not hard to see that under perfect observability, a configuration is stable only if an equilibrium outcome is Pareto efficient, rather than a Nash equilibrium, in the objective game. This result also indicates that an individual endowed with materialist preferences, which coincide with fitness maximization, may have no evolutionary advantage. [Footnote 7: Heifetz et al. (2007) show a similar result established for almost every game with continuous strategy spaces: under any payoff-monotonic selection dynamics, the population does not converge to material payoff-maximizing behavior.]

In the single-population model established by Dekel et al. (2007), if a configuration is stable under complete information, the fitness an incumbent receives in each of his matches is efficient; moreover, the efficient fitness is obviously unique for a symmetric objective game. [Footnote 8: Efficiency of a strategy in a symmetric two-player game means that no other strategy yields a strictly higher fitness when played against itself.] [Footnote 9: When preferences are observable, the tendency towards efficient strategy is a general property of single-population models based on the indirect evolutionary approach; see also Possajennikov (2005) and von Widekind (2008).]

In our multi-population setting, although the forms of Pareto efficiency may not be unique and the populations are allowed to be polymorphic, the uniqueness of the fitness vector can still be ensured for a stable configuration, in the sense that all equilibrium outcomes adopted by matched incumbents correspond to the same Pareto-efficient fitness vector. An efficient form of a stable configuration for an objective game is well defined, and it is determined by an initial preference distribution.

When the number of populations is equal to two, we obtain a simple sufficient condition for stability under complete information: a Pareto-efficient strict Nash equilibrium of the objective game is stable. This result would also lead us to see that another concept of multi-population stability, such as in Taylor (1979), can achieve different stability properties. However, if the number of populations increases, then mutants may have opportunities to take evolutionary advantages by applying various correlated deviations regardless of the incumbents’ responses, and so the stability may be difficult to attain. We present several examples of three-player games which are particularly useful in helping us understand how multiple populations interact with one another to destroy Pareto-efficient strict Nash equilibria, even though these are Pareto-efficient strong Nash equilibria. Compared to the results of studies of evolutionarily stable strategies in asymmetric games, once again, the characteristic of the indirect evolutionary approach causes a dramatic change in the existence of stable multi-population configurations.

We then study the case of unobservable preferences, where players know only the distribution of opponents’ preferences in every population; the interactions among players can be described as an $n$-player Bayesian game. Here the stability criterion is consistent with the concept used for perfect observability. In contrast to that of Dekel et al. (2007), our criterion rejects the incumbents’ post-entry strategies that are too far from the originals. We give an example to show how the difference between the two stability criteria affects the determination of the stability of a configuration.

Since players cannot adjust their strategies according to specific opponents in this case, a non-Nash outcome will be destabilized by entrants adopting material payoff-maximizing behavior. It is also easy to see that under incomplete information, individuals endowed with materialist preferences have evolutionary advantages. Thus, whether such a materialist configuration is stable will depend on whether the incumbents’ post-entry strategies are nearly unchanged. We show that a strategy profile can be supported by stable materialist preferences if it is a strict Nash equilibrium or a completely mixed Nash equilibrium or the unique Nash equilibrium of the objective game. The indirect evolutionary approach with unobservable preferences can be viewed as a refinement of the Nash equilibrium concept different from the notion of a neutrally stable strategy (NSS), which was introduced in Maynard Smith (1982).

Finally, we consider the case in which players observe their opponents’ preferences with probability $p$, and know only the distribution of opponents’ preferences with probability $1-p$. The stability criterion defined in this intermediate case is such that the criteria under perfect observability and no observability can be regarded as its two limits. This makes it possible to check the robustness of the preceding stability results against small perturbations in the degrees of observability for the case of pure-strategy outcomes.

In the single-population model of Dekel et al. (2007), efficiency is a necessary condition for pure-strategy outcomes to be stable when observability is almost perfect. However, we provide a counterexample illustrating that the necessity result in our multi-population setting with perfect observability is not robust. Unlike efficiency defined for the single-population model, a Pareto improvement is not a change leaving everyone strictly better off. Therefore, a Pareto-dominated outcome in our model may not be destabilized if preferences are not perfectly observed. Instead of Pareto efficiency, we show that weak Pareto efficiency is a necessary condition for pure-strategy outcomes to be stable under almost perfect observability. This result reveals that materialist preferences still may have no evolutionary advantage even if preferences are observed with noise. [Footnote 10: In games with continuous action sets, Heifetz et al. (2007) also show, by means of a specific example, that payoff-maximizing behavior need not prevail when preferences are imperfectly observed.] In contrast, the necessity result under no observability is robust: a pure-strategy outcome is stable under almost no observability only if it is a Nash equilibrium of the objective game.

Regarding the sufficient conditions for stability, we show that when the number of populations is equal to two, a Pareto-efficient strict Nash equilibrium of the objective game remains stable for all degrees of observability. However, the sufficiency result under no observability is not robust. We provide an example of a prisoner’s dilemma situation in which the unique Nash equilibrium, also a strict Nash equilibrium, is not stable for any positive probability of observing preferences. The use of entrants’ cooperative strategies in this example indicates that efficiency would play a role in preference evolution as long as preferences are not completely unobservable.

## 2. The Model

### Objective Games

Suppose that $G$ is an $n$-player game with the player set $N = \{1, \dots, n\}$. For each $i \in N$, denote by $A_i$ the finite set of actions available to player $i$, and define $A = \prod_{i \in N} A_i$. Let $\pi_i \colon A \to \mathbb{R}$ be the material payoff function for player $i$. When an action profile $a \in A$ is played, we interpret the material payoff $\pi_i(a)$ as the reproductive fitness received by player $i$, which determines the evolutionary success. Thus we also call $\pi_i$ the fitness function for player $i$. We write the set of mixed strategies of player $i$ as $\Delta(A_i)$, and denote the set of correlated strategies by $\Delta(A)$. Each material payoff function $\pi_i$ can be extended by linearity to a continuous function defined on the set $\prod_{i \in N} \Delta(A_i)$, or defined on the set $\Delta(A)$.

By combining the material payoff functions, we obtain the vector-valued function $\pi \colon A \to \mathbb{R}^n$ that assigns to each action profile $a \in A$ the $n$-tuple $(\pi_1(a), \dots, \pi_n(a))$ of fitness values. This fitness function can also be extended to the set $\prod_{i \in N} \Delta(A_i)$, or to the set $\Delta(A)$, through the extensions of $\pi_1, \dots, \pi_n$. In the indirect evolutionary approach, behavior of players is determined independently of the material payoff functions, although players’ actual fitnesses are defined by them. We call this game an objective game.
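The extension of a material payoff function by linearity can be illustrated with a minimal Python sketch. The dictionary-based representation, the function names `payoff_mixed` and `payoff_correlated`, and the numerical payoffs are assumptions chosen for illustration; they are not taken from the paper.

```python
from itertools import product


def payoff_mixed(payoff, strategies):
    """Expected payoff when players randomize independently.

    payoff     : dict mapping an action profile (tuple) to a fitness value
    strategies : list of dicts, strategies[i][action] = probability
    """
    total = 0.0
    for profile in product(*[s.keys() for s in strategies]):
        prob = 1.0
        for s, action in zip(strategies, profile):
            prob *= s[action]
        total += prob * payoff[profile]
    return total


def payoff_correlated(payoff, dist):
    """Expected payoff under a correlated strategy (a distribution on profiles)."""
    return sum(p * payoff[profile] for profile, p in dist.items())


# Hypothetical fitness values for one player in a 2x2 game.
pi_1 = {("U", "L"): 2.0, ("U", "R"): 0.0, ("D", "L"): 0.0, ("D", "R"): 1.0}

uniform = [{"U": 0.5, "D": 0.5}, {"L": 0.5, "R": 0.5}]
print(payoff_mixed(pi_1, uniform))

diagonal = {("U", "L"): 0.5, ("D", "R"): 0.5}
print(payoff_correlated(pi_1, diagonal))
```

The correlated strategy in the last lines cannot be produced by independent randomization, which is why the two extensions are kept as separate functions.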

### Subjective Games

In contrast to an objective game, a subjective game describes the strategic interactions among the players. There are $n$ separate populations, and the number of individuals in each population is infinite. In every game round, players are drawn independently from the populations, one from each population randomly. Let such a game be repeated infinitely many times independently; then it is plausible that a player will not take into account the effect of his current behavior on the opponents’ future behavior.

Each player in the $i$-th population chooses an optimal strategy from the set $\Delta(A_i)$ based on his own preferences and the information about his opponents’ preferences. We assume that the subjective preferences of player $i$ can be represented by a von Neumann–Morgenstern utility function which may be different from the material payoff function $\pi_i$. However, after each round of play, the actual fitness received by player $i$ is $\pi_i(\sigma)$ if a strategy profile $\sigma$ is chosen by matched players. We consider the set of all von Neumann–Morgenstern utility functions on $A$. We refer to a utility function either as a preference type or as a type; we identify it with a group of players who have such preferences and make the same decisions. In real life, different individuals having the same preference relation may adopt distinct strategies when they are indifferent among some alternatives. In such a case, any two of these preferences can be represented by different utility functions congruent modulo a positive affine transformation.

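Assume that there are only a finite number of preference types in each population. We consider the set of all possible joint distributions of $n$ independent random variables defined on the same sample space with finite support. Let $\mu$ be such a distribution. Then the support of $\mu$ can be written as $\operatorname{supp}\mu = \prod_{i \in N} \operatorname{supp}\mu_i$, where $\mu_i$ is the marginal distribution over all types of player $i$. For a given preference profile $\theta \in \operatorname{supp}\mu$ and for $i \in N$, the conditional probability $\mu(\theta_{-i} \mid \theta_i)$ is equal to $\mu_{-i}(\theta_{-i})$. For notational simplicity, objects that collect all players other than $i$ are abbreviated with the subscript $-i$, as in $\mu_{-i}$ and $\operatorname{supp}\mu_{-i}$. Similar abbreviations are used for a nonempty proper subset $J$ of $N$.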

To enable a comparison with the single-population setting introduced in Dekel et al. (2007), hereafter DEY, information about opponents’ types is described as follows. For every $i \in N$, player $i$ observes the opponents’ preferences with probability $p$, and knows only the joint distribution over the opponents’ preferences with probability $1-p$. [Footnote 11: Here, partial observability is used to model the noise in the cases of perfect observability and no observability. For simplicity, we ignore the possibility that an individual has complete information about the preferences of some of the opponents and has incomplete information about the preferences of the others. We emphasize that the difference in the two noise settings does not affect our results.] The degree of observability $p$ is an exogenous parameter indicating the level of observation, and is common knowledge among all players. But the two realizations, called perfect observability and no observability, are private and independent across players. In every round, players choose best responses to expected actions of others under a given degree of observability, which can be described as an $n$-player Bayesian game. Such a game is called a subjective game; the pair consisting of the objective game and this subjective game is referred to as an environment.

### The Stability Concept

According to the principle of the “survival of the fittest”, only preference types earning the highest average fitness will survive. Thus, a necessary condition for an environment to be stable is that for any $i \in N$, all incumbents in the $i$-th population, which constitute the set $\operatorname{supp}\mu_i$, should receive the same average fitness. On the other hand, it is necessary to verify whether the incumbents are immune to the competition from new entrants. Because mutations are rare events, it is assumed that at the same time, there will be at most one mutant type arising in each of the populations. Let $J$ be any nonempty subset of $N$. A mutant sub-profile for $J$, denoted by $\tilde\theta_J$, refers to a $|J|$-tuple of preference types $(\tilde\theta_j)_{j \in J}$ in which each $\tilde\theta_j$ is taken from the complement of $\operatorname{supp}\mu_j$. [Footnote 12: It indicates that all preference relations satisfying the von Neumann–Morgenstern axioms are allowed to compete, and that mutants are distinguishable from the incumbents in the post-entry populations, although they may have the same preference relation.] The vector of the population shares of the mutant types is often denoted simply by $\varepsilon$, and its norm is written $\|\varepsilon\|$. [Footnote 13: Since each population is assumed to be infinite, the population share of a mutant type can take on any positive value, no matter how small it may be.] After the mutants have entered, the resulting populations can be characterized by the post-entry distribution $\tilde\mu^\varepsilon$:

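$$
\tilde\mu^\varepsilon_i =
\begin{cases}
(1-\varepsilon_i)\,\mu_i + \varepsilon_i\,\delta_{\tilde\theta_i} & \text{if } i \in J,\\
\mu_i & \text{if } i \in N \setminus J,
\end{cases}
$$

where $\delta_{\tilde\theta_i}$ is the degenerate probability (Dirac measure) concentrated at $\tilde\theta_i$.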
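The post-entry distribution is a population-by-population mixture, which a short Python sketch can make concrete. The data layout (one dictionary of type shares per population, string labels for types, populations indexed from 0) and the function name `post_entry_distribution` are assumptions made for illustration only.

```python
def post_entry_distribution(mu, mutants, eps):
    """Return the post-entry marginal distributions after mutants enter.

    mu      : list of dicts, mu[i][type_label] = share of that type in population i
    mutants : dict {i: mutant_label} for the populations i in the set J
    eps     : dict {i: mutant share}, with 0 < eps[i] < 1 for each i in J
    """
    post = []
    for i, mu_i in enumerate(mu):
        if i in mutants:
            if mutants[i] in mu_i:
                # Mutants are assumed to be distinguishable from incumbents.
                raise ValueError("mutant label must differ from incumbent labels")
            # Incumbent shares are scaled by (1 - eps_i); the mutant gets mass eps_i.
            new_mu_i = {t: (1 - eps[i]) * share for t, share in mu_i.items()}
            new_mu_i[mutants[i]] = eps[i]
            post.append(new_mu_i)
        else:
            # Populations outside J are left unchanged.
            post.append(dict(mu_i))
    return post


# Hypothetical example: a mutant "m" enters population 0 with share 0.1.
mu = [{"a": 0.5, "b": 0.5}, {"c": 1.0}]
post = post_entry_distribution(mu, mutants={0: "m"}, eps={0: 0.1})
print(post)
```

Each post-entry marginal still sums to one, so the result is again a product of marginal distributions, as the model requires.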

In a single-population evolutionary model, the static stability criterion generally requires that mutants entering the population are eventually driven to extinction. For multi-population settings, it is natural to extend the stability definition in terms of the notion of a mutant sub-profile, which would be regarded as a unit of mutation. We say that a mutant sub-profile is driven out if one of these mutant types will become extinct. In other words, a multi-population stability criterion can be fulfilled if for any given mutant sub-profile, there are mutants earning a lower average fitness than the incumbents in at least one population. The reason for this is that in the models based on the indirect evolutionary approach, interactions among mutants may look as if a sub-profile of mutants cooperate with one another such that some of the mutants take fitness advantages at the expense of the other mutants. Those fitness advantages will soon disappear when the latter go extinct. This concept is consistent with that of the multi-population ESS formulated by Cressman (1992), one of the two most popular static stability criteria for multi-population interactions. [Footnote 14: There are two popular ways of extending the definition of an ESS to a multi-population setting. One is due to Taylor (1979), and the other is due to Cressman (1992); see also Swinkels (1992), Weibull (1995), and Sandholm (2010).]

Because there are no restrictions on the preference relations of entrants and the best-response correspondences for different preferences may coincide, we allow that mutants may survive in a post-entry environment to coexist with the incumbents, but will not spread. [Footnote 15: This idea is consistent with the notion of a neutrally stable strategy.] Our stability criterion is defined to identify when a joint preference distribution and an adopted Bayesian–Nash equilibrium can form a stable configuration. Besides requiring that all incumbents in the same population earn the same average fitness, a stable multi-population configuration should satisfy the following after a rare mutant sub-profile appears:

1. the behavioral outcomes remain unchanged or nearly unchanged;

2. the mutant sub-profile is driven out, or the incumbents can coexist with the mutants in every population.

Along these lines, although the failure of a mutant sub-profile is determined solely by one mutant type among them, such a multi-population extension satisfies the appropriate stability conditions as introduced in DEY: no single mutant sub-profile can destabilize the configuration, including the behavioral outcomes and the distribution of preferences (see Remarks 3.2.1 and 3.3.1).

We shall formally define stability criteria for various degrees of observability in the following sections; these definitions follow the same principle of multi-population stability. We develop necessary and sufficient conditions for stability beginning with the two extreme cases, $p = 1$ (perfect observability) and $p = 0$ (no observability). Next, we consider intermediate cases, $p \in (0,1)$ (partial observability), such that the two extreme cases can be regarded as its two limiting cases. We check whether or not the preceding results, under $p = 1$ and $p = 0$, are robust against small perturbations in $p$ for the case of pure-strategy outcomes.

## 3. Perfect Observability

In this section we discuss the case where the degree of observability is equal to one, that is, players’ preferences are common knowledge. Therefore, the subjective game in each round of play can be seen as an $n$-player strategic game.

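For a given probability distribution $\mu$, a strategy adopted by player $i$ under perfect observability is a function $b_i \colon \operatorname{supp}\mu \to \Delta(A_i)$. [Footnote 16: It is admitted that players whose preference types are congruent modulo a positive affine transformation may adopt distinct strategies.] The vector-valued function $b \colon \operatorname{supp}\mu \to \prod_{i \in N} \Delta(A_i)$ defined by $b(\theta) = (b_1(\theta), \dots, b_n(\theta))$ is an equilibrium in the subjective game if for each $\theta \in \operatorname{supp}\mu$, the strategy profile $b(\theta)$ is a Nash equilibrium of the associated strategic game played by the $n$-tuple $\theta$, that is, for each $i \in N$,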

$$
b_i(\theta) \in \operatorname*{arg\,max}_{\sigma_i \in \Delta(A_i)} \theta_i\bigl(\sigma_i, b_{-i}(\theta)\bigr),
$$

where $b_{-i}(\theta)$ denotes the strategies of the players other than $i$ in the profile $b(\theta)$. Consider the set of all such equilibria in the subjective game. A pair $(\mu, b)$ consisting of a preference distribution $\mu$ and an equilibrium $b$ is called a configuration. We define the aggregate outcome of a configuration $(\mu, b)$ as the probability distribution $\varphi_{\mu,b}$ over the set of pure-strategy profiles:

$$
\varphi_{\mu,b}(a_1, \dots, a_n) = \sum_{\theta \in \operatorname{supp}\mu} \mu(\theta) \prod_{i \in N} b_i(\theta)(a_i)
$$

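for every $(a_1, \dots, a_n) \in A$, where $b_i(\theta)(a_i)$ is the probability assigned by $b_i(\theta)$ to $a_i$. The aggregate outcome of a configuration can be regarded as a correlated strategy belonging to $\Delta(A)$. A strategy profile $\sigma$ is called an aggregate outcome if the induced correlated strategy $\varphi_\sigma$ is the aggregate outcome of some configuration, where $\varphi_\sigma$ is defined by $\varphi_\sigma(a_1, \dots, a_n) = \prod_{i \in N} \sigma_i(a_i)$ for all $(a_1, \dots, a_n) \in A$.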
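The aggregate outcome is a weighted sum, over matched type profiles, of the product distributions those profiles play. A minimal Python sketch follows; the dictionary layout, the type and action labels, and the function name `aggregate_outcome` are illustrative assumptions.

```python
from itertools import product


def aggregate_outcome(mu, b, actions):
    """Probability of each pure-strategy profile under a configuration.

    mu      : dict mapping a type profile (tuple) to its probability
    b       : dict mapping a type profile to a tuple of mixed strategies,
              each a dict {action: probability}
    actions : list of action lists, one per player
    """
    phi = {a: 0.0 for a in product(*actions)}
    for theta, weight in mu.items():
        strategies = b[theta]
        for a in phi:
            prob = weight
            for i, a_i in enumerate(a):
                prob *= strategies[i].get(a_i, 0.0)
            phi[a] += prob
    return phi


# Hypothetical polymorphic example: population 2 has two equally likely types
# that play different pure actions against the single type of population 1.
mu = {("s", "t"): 0.5, ("s", "u"): 0.5}
b = {
    ("s", "t"): ({"U": 1.0}, {"L": 1.0}),
    ("s", "u"): ({"U": 1.0}, {"R": 1.0}),
}
phi = aggregate_outcome(mu, b, actions=[["U", "D"], ["L", "R"]])
print(phi)
```

In this example the aggregate outcome puts weight one half on each of two action profiles even though every matched pair plays a pure profile, which shows how polymorphism alone can spread the aggregate outcome over several profiles.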

By applying the law of large numbers to our setting, the average fitness of a type $\theta_i \in \operatorname{supp}\mu_i$ with respect to $(\mu, b)$ for $i \in N$ is given by

$$
\Pi_{\theta_i}(\mu; b) = \sum_{\theta'_{-i} \in \operatorname{supp}\mu_{-i}} \mu_{-i}(\theta'_{-i})\, \pi_i\bigl(b(\theta_i, \theta'_{-i})\bigr),
$$

on which the evolution of preferences depends. [Footnote 17: The equation for average fitness indicates that the preference distribution is unchanged in the process of learning to play an equilibrium. To justify this representation, we assume as in most related literature that the evolution of preferences is infinitely slower than the process of learning, which is supported by Selten (1991, p. 21).] For a configuration to be evolutionarily stable, it is necessary to let every incumbent in the same population earn the same average fitness.
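The average fitness formula can be read as a weighted average of expected material payoffs over the opponents' type profiles. The sketch below computes it under assumed data structures; the names `expected_payoff` and `average_fitness`, the 0-based player index, and the example numbers are hypothetical.

```python
from itertools import product


def expected_payoff(payoff_i, strategies):
    """Expected material payoff of one player under independent mixing."""
    total = 0.0
    for combo in product(*[list(s.items()) for s in strategies]):
        prob = 1.0
        for _, p in combo:
            prob *= p
        total += prob * payoff_i[tuple(action for action, _ in combo)]
    return total


def average_fitness(i, theta_i, mu_minus_i, b, payoff_i):
    """Average fitness of type theta_i in population i (0-based index).

    mu_minus_i : dict mapping a tuple of opponents' types to its probability
    b          : dict mapping a full type profile to a tuple of mixed strategies
    payoff_i   : dict mapping an action profile to player i's material payoff
    """
    total = 0.0
    for theta_rest, weight in mu_minus_i.items():
        # Re-insert theta_i at position i to obtain the full type profile.
        profile = theta_rest[:i] + (theta_i,) + theta_rest[i:]
        total += weight * expected_payoff(payoff_i, b[profile])
    return total


# Hypothetical data: type "x" in population 0 meets type "y" or "z".
b = {
    ("x", "y"): ({"U": 1.0}, {"L": 1.0}),
    ("x", "z"): ({"U": 1.0}, {"R": 1.0}),
}
pi_0 = {("U", "L"): 4.0, ("U", "R"): 0.0}
fitness = average_fitness(0, "x", {("y",): 0.25, ("z",): 0.75}, b, pi_0)
print(fitness)
```

Note that fitness is evaluated with the material payoff function, not with the utility function of the type, which is the defining feature of the indirect evolutionary approach.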

###### Definition 3.1.

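A configuration $(\mu, b)$ is said to be balanced if for each $i \in N$, the equality $\Pi_{\theta_i}(\mu; b) = \Pi_{\theta'_i}(\mu; b)$ holds for every $\theta_i, \theta'_i \in \operatorname{supp}\mu_i$.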

Under complete information, in order to satisfy the condition that rare mutants cannot cause the behavioral outcomes to move far away, we assume that when incumbents are matched against one another, they continue to play the pre-entry equilibrium, called the focal property. However, after an entry, it seems implausible that we can say which equilibrium will be played when not all matched players are incumbents. Thus, in such matches, there are no restrictions on the set of equilibria from which they can choose.

###### Definition 3.2.

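Let $(\mu, b)$ be a configuration with perfect observability, and suppose that a mutant sub-profile $\tilde\theta_J$ with population share vector $\varepsilon$ is introduced. In the post-entry subjective game, a post-entry equilibrium $\tilde b$ is focal relative to $b$ if $\tilde b(\theta) = b(\theta)$ for every $\theta \in \operatorname{supp}\mu$. The set of all focal equilibria relative to $b$ in the post-entry game is called a focal set.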

###### Remark 3.2.1.

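When preferences are observable, the focal set is always nonempty regardless of how the population share vector is composed. In addition, the desired property that the post-entry aggregate outcome converges to the pre-entry aggregate outcome as $\|\varepsilon\| \to 0$ holds naturally for all focal equilibria.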

Now the stability criterion for a perfectly observable environment can be defined.

###### Definition 3.3.

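Under perfect observability, a configuration $(\mu, b)$ is said to be stable if it is balanced, and if for any nonempty subset $J \subseteq N$ and any mutant sub-profile $\tilde\theta_J$, there exists some invasion barrier $\bar\varepsilon > 0$ such that for every $\varepsilon$ with $\|\varepsilon\| < \bar\varepsilon$ and every focal equilibrium $\tilde b$ relative to $b$, either Condition 1 or Condition 2 is satisfied:

1. $\Pi_{\theta_j}(\tilde\mu^\varepsilon; \tilde b) > \Pi_{\tilde\theta_j}(\tilde\mu^\varepsilon; \tilde b)$ for some $j \in J$ and for every $\theta_j \in \operatorname{supp}\mu_j$;

2. $\Pi_{\theta_i}(\tilde\mu^\varepsilon; \tilde b) = \Pi_{\theta'_i}(\tilde\mu^\varepsilon; \tilde b)$ for every $i \in N$ and for every $\theta_i, \theta'_i \in \operatorname{supp}\tilde\mu^\varepsilon_i$.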

A strategy profile is stable if it is the aggregate outcome of some stable configuration.
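For a fixed mutant sub-profile, share vector, and focal equilibrium, the two conditions reduce to comparisons of post-entry average fitnesses: Condition 1 asks for some population in which every incumbent strictly beats the mutant, and Condition 2 asks that all types within each post-entry population earn the same. The Python sketch below checks both on precomputed fitness values; the function names, the input layout, and the tolerance `tol` are assumptions for illustration, and the sketch does not perform the quantification over all mutants and all sufficiently small shares that the definition requires.

```python
def condition_1(incumbent_fitness, mutant_fitness):
    """True if, in some population with a mutant, every incumbent earns
    strictly more than that mutant.

    incumbent_fitness : dict {population: list of incumbents' average fitnesses}
    mutant_fitness    : dict {population in J: mutant's average fitness}
    """
    return any(
        all(f > m for f in incumbent_fitness[j])
        for j, m in mutant_fitness.items()
    )


def condition_2(incumbent_fitness, mutant_fitness, tol=1e-9):
    """True if all types in every post-entry population earn the same
    average fitness (up to the numerical tolerance tol)."""
    for i, values in incumbent_fitness.items():
        everyone = list(values)
        if i in mutant_fitness:
            everyone.append(mutant_fitness[i])
        if max(everyone) - min(everyone) > tol:
            return False
    return True


# Hypothetical fitness values: the mutant in population 0 does worse than
# both incumbents there, so the sub-profile is driven out.
incumbents = {0: [2.0, 2.0], 1: [3.0]}
mutants = {0: 1.5, 1: 3.5}
print(condition_1(incumbents, mutants), condition_2(incumbents, mutants))
```

In the example the mutant in population 1 earns more than the incumbent there, yet Condition 1 still holds because of population 0. This is the sense in which the failure of a sub-profile is decided by a single population.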

Condition 1 describes the case where every incumbent earns a strictly higher average fitness than the mutant type in some population, and thus this mutant sub-profile fails to invade. Condition 2 describes the case where the post-entry configuration is balanced; the incumbents and the mutants can continue to coexist in every population. Let us for a moment follow the notion of evolutionary stability introduced in Taylor (1979), where the fitness comparison between incumbents and mutants should be done in the aggregate. Under such a view, the new entrants could be represented as a profile $\hat\theta = (\hat\theta_1, \dots, \hat\theta_n)$ of mutant types, one for each population, with a common population share, and Condition 1 for a polymorphic configuration could be replaced by the condition that

$$
\sum_{i \in N} \Pi_{\theta_i}(\tilde\mu^\varepsilon; \tilde b) > \sum_{i \in N} \Pi_{\hat\theta_i}(\tilde\mu^\varepsilon; \tilde b)
$$

holds for every incumbent profile $\theta \in \operatorname{supp}\mu$. Unlike extending the definition of an ESS to a multi-population setting, such a stability criterion is indeed stronger than ours. The key difference is the ability to adjust one’s strategies according to the opponents; see the discussion after Theorem 3.10.

The following remarks describe the basic characteristics of our multi-population stability criterion, which is defined separately for each information assumption.

###### Remark 3.3.1.

If the multi-population stability criterion is reached, then no incumbent would be wiped out, although Condition 1 can be determined just by examining one of the populations for each mutant sub-profile. The intuition behind this result is as follows. Let a mutant sub-profile $\tilde\theta_J$ be introduced into a stable configuration $(\mu, b)$, and suppose that Condition 1 holds. Then those mutants with the lowest average fitness in their own populations will be wiped out. Such a trend can make the sub-profile $\tilde\theta_J$ converge to a smaller sub-profile. Meanwhile, the population shares of the incumbents may be slightly perturbed during this process. One could therefore regard the post-entry environment at this time as the perturbed configuration in which the new sub-profile tries its luck. Recall that the stable configuration is defined for any mutant sub-profile. Thus, after the entry of the smaller sub-profile, either Condition 1 or Condition 2 would be satisfied, provided that the population shares of these remaining mutants are sufficiently small and that the perturbations in the population shares of the incumbents do not affect the order of their average fitness values. Such a repeated process would lead to the desired goal.

###### Remark 3.3.2.

In our stability criterion, the invasion barrier seems to depend on the mutant sub-profile. In fact, the existence of such an invasion barrier is equivalent to the existence of a uniform invasion barrier. Consider an indifferent mutant sub-profile for a nonempty subset $J$ of $N$. [Footnote 18: A preference type is said to be indifferent if it is a constant utility function; a mutant sub-profile is said to be indifferent if all its preference types are indifferent.] Since an indifferent type is indifferent among all actions, all available actions will be dominant for him. Hence, by the condition that all possible focal equilibria are admitted in a post-entry environment, the barrier which works for the indifferent mutant sub-profile is certainly a uniform invasion barrier against all mutant sub-profiles for the subset $J$. Thus, since the number of all subsets of $N$ is finite, we have a uniform invasion barrier that can work for all potential mutant sub-profiles. This also indicates that it is sufficient to check indifferent mutant sub-profiles, rather than all mutant sub-profiles, in order to test the stability of a configuration.

Consider the Battle of the Sexes as introduced in Dawkins (1976), which refers to the male–female conflict over parental care of offspring, and which is one of the simplest asymmetric games without an ESS. The two female strategies are coy and fast; the two male strategies are faithful and philandering. Coy females insist on a long courtship before mating, whereas fast ones do not. All females take care of their offspring. Faithful males tolerate a long courtship, and also care for the offspring. Philanderers refuse to engage in a long courtship, and do not care for their offspring. The value of the offspring to each parent is . The cost of raising the offspring is , which can be borne by one parent only, or shared equally between both parents. The cost of a long courtship is to each participant. Let us consider . Then, as discussed in van Damme (1991, p. 243), there is no strict Nash equilibrium, and so there is no ESS in such a case. However, if the Battle of the Sexes is modeled by means of our multi-population setting, then an evolutionarily stable outcome can exist in this game, as we will see below.

###### Example 3.4.

Let the fitness assignment be characterized by the Battle of the Sexes game in which , , and ; besides, the first population consists of all males, and the second population consists of all females.

Let preferences be observable. Suppose that all males have preferences such that they are indifferent between being faithful and being philandering if a female is fast. On the other hand, suppose that females’ preferences prompt them to be coy if a male is a philanderer, and to be fast if a male is faithful. Then the pair is a Nash equilibrium for such males and females, and this strategy pair can be an evolutionarily stable outcome.

To see this, let be the configuration constructed as described above. It is obvious that if mutants appear in only one of the two populations, their average fitnesses cannot exceed those of the incumbents. Consider two types of mutants and entering the first and second populations with population shares and , respectively. Let be the chosen focal equilibrium, and suppose that for and for , we have where . If , then the post-entry average fitnesses of and satisfy and , respectively. It follows that whenever . Clearly if and , then the post-entry average fitness of is equal to , and it is strictly greater than that of . Finally if and , then the fitness to each individual in each match is except in . When the mutants and are matched together, the Pareto efficiency of the strategy pair implies that the conditions affording mutants an evolutionary advantage in the first population can result in fitness loss for mutants in the second population. Therefore, we can conclude that the strategy pair is stable, for which a uniform invasion barrier can be chosen as .

We list the definitions concerning Pareto efficiency.

###### Definition 3.5.

Let be a finite strategic game, and let and be strategy profiles belonging to . The strategy profile strongly Pareto dominates the strategy profile if for all . The strategy profile is weakly Pareto efficient if there does not exist another strategy profile that strongly Pareto dominates .

The strategy profile Pareto dominates the strategy profile if for all and for some . The strategy profile is Pareto efficient if there does not exist another strategy profile that Pareto dominates .
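The two dominance relations in Definition 3.5 can be checked mechanically on payoff vectors. The following is a minimal sketch, not part of the original analysis: it compares pure strategy profiles only, whereas the definition ranges over all strategy profiles, so a profile reported here as efficient could in principle still be dominated by a mixed profile. The payoff table follows the verbal description of the Battle of the Sexes above, with the additional assumption that a philanderer and a coy female do not mate and both receive zero. The numerical values `value`, `rearing_cost`, and `courtship_cost` are hypothetical and are not taken from the paper.

```python
def strongly_pareto_dominates(u, v):
    """True if payoff vector u is strictly larger than v for every player."""
    return all(a > b for a, b in zip(u, v))


def pareto_dominates(u, v):
    """True if u is at least as large as v for every player and strictly larger for some."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_efficient_pure(payoffs):
    """Return the pure profiles that no other pure profile Pareto dominates.

    payoffs maps each pure profile to its payoff vector. Mixed profiles are
    ignored, which is a simplification relative to Definition 3.5.
    """
    return [
        p for p, u in payoffs.items()
        if not any(pareto_dominates(v, u) for q, v in payoffs.items() if q != p)
    ]


# Hypothetical illustrative numbers (assumptions, not values from the paper).
value, rearing_cost, courtship_cost = 15, 20, 3

# Profiles are (male action, female action); payoffs are (male, female).
payoffs = {
    ("faithful", "coy"): (value - rearing_cost / 2 - courtship_cost,) * 2,
    ("faithful", "fast"): (value - rearing_cost / 2,) * 2,
    ("philanderer", "coy"): (0, 0),  # assumed: no mating takes place
    ("philanderer", "fast"): (value, value - rearing_cost),
}

efficient = pareto_efficient_pure(payoffs)
```

With these assumed numbers, `efficient` contains `("faithful", "fast")` and `("philanderer", "fast")`: neither is dominated by another pure profile, while the two profiles involving a coy female are both dominated by `("faithful", "fast")`.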

In Example 3.4, the stable outcome is a Pareto-efficient strategy profile in the Battle of the Sexes game. Our first result will show that this is also true in the general case: if a configuration is stable under perfect observability, then any equilibrium outcome adopted by matched incumbents must be Pareto efficient with respect to the fitness function . The reason is simple. If the outcome is not Pareto efficient, the mutants having the “secret handshake” flavour can destroy this inefficient outcome. These mutants behave based on their own preferences; they maintain the pre-entry outcome when matched against the incumbents, and achieve a more efficient outcome when matched against themselves. Accordingly, the observability of preferences plays a key role in obtaining the “stable only if efficient” result.

It is convenient to use the following notation for our multi-population case. Suppose that we are given two -tuples and . For any subset , a new -tuple can be constructed by letting

$$(z_T, x_{-T})_i = \begin{cases} z_i & \text{if } i \in T, \\ x_i & \text{if } i \in N \setminus T. \end{cases}$$

If or , then refers to or , respectively.
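The splicing notation can be read as a simple operation on tuples. The sketch below is illustrative only; the function name `splice` and the use of 0-based population indices are assumptions made for the example and do not appear in the paper.

```python
def splice(z, x, T):
    """Build the tuple that takes z[i] for i in T and x[i] for every other index.

    z and x are equal-length tuples; T is a set of 0-based indices
    (the paper indexes populations from 1; 0-based is an assumption here).
    """
    if len(z) != len(x):
        raise ValueError("z and x must have the same length")
    return tuple(z[i] if i in T else x[i] for i in range(len(x)))


z = ("z1", "z2", "z3")
x = ("x1", "x2", "x3")

mixed = splice(z, x, {0, 2})        # entries 0 and 2 from z, entry 1 from x
all_x = splice(z, x, set())         # empty T leaves x unchanged
all_z = splice(z, x, {0, 1, 2})     # full index set returns z
```

The two boundary cases, an empty index set and the full index set, return one of the original tuples unchanged.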

###### Theorem 3.6.

Let be a stable configuration in . Then for each , the equilibrium outcome is Pareto efficient with respect to .

###### Proof.

Suppose that there exists such that is not Pareto efficient, that is, there exists such that for all and for some . Let an indifferent mutant profile be introduced with its population share vector . Let the focal equilibrium be chosen to satisfy (1) ; (2) for any proper subset and any , we have . (Footnote 19: In most cases, indifferent types are used instead of potential entrants that are well adapted to the environments, without specifying their preferences explicitly. We will frequently use this convenient device throughout the paper.) Then, for every , the difference between the average fitnesses of and is

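$$\Pi_{\tilde{\theta}^0_i}(\tilde{\mu}^{\varepsilon}; \tilde{b}) - \Pi_{\bar{\theta}_i}(\tilde{\mu}^{\varepsilon}; \tilde{b}) = \tilde{\mu}^{\varepsilon}_{-i}(\tilde{\theta}^0_{-i}) \left[ \pi_i(\sigma) - \pi_i(b(\bar{\theta})) \right].$$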

Thus, for any vector , we have for every , and for some . This means that the configuration is not stable. ∎

A single-population model underlying the indirect evolutionary approach, as in DEY, always shows that when preferences are observed, efficiency is a necessary condition for stability, in the sense that the fitness each incumbent receives in each interaction is efficient. Of course, the concept of efficiency specially defined for symmetric games is distinct from the concept of Pareto efficiency. The efficient fitness used in a single-population model could not be meaningfully applied to a model with separate populations.

Unlike strategic interactions in a single-population setting, here an individual will only meet opponents coming from the other populations. Thus the same symmetric objective game considered in different population settings could induce quite different stable outcomes, as we will see in the next example.

###### Example 3.7.

Let the following anti-coordination game denote the fitness assignment, where , , and . Suppose that preferences are observable, and that each player  has the same action set .

The efficient strategy in this symmetric objective game is such that with the efficient fitness . When considered in the single-population model introduced by DEY, the unique efficient strategy profile is DEY-stable if and only if the equality holds. Therefore, if the DEY-stable outcome exists, the efficient fitness is , which is strictly less than and .

In the case where , the Pareto-efficient strategy profiles in this objective game are and , and so Theorem 3.6 implies that the strategy profile cannot be stable in the sense of multi-population stability. The reason for the difference between the single- and two-population settings is that when the interaction takes place between two mutants from separate populations, they can choose a suitable strategy profile, or , to gain evolutionary advantages. However, this cannot happen in a single-population setting, where the only mutant type a mutant can encounter is his own. Theorem 3.10 will guarantee that the Pareto-efficient strict Nash equilibria and can be stable if the two-population setting is applied to this symmetric game.

Theorem 3.6 says that configurations in our multi-population model tend towards Pareto efficiency whenever preferences are observable. Unlike the efficient fitness for a symmetric game, Pareto-efficient fitness vectors are generally not unique for an arbitrary game. Nevertheless, we can show that if a configuration is stable under perfect observability, the incumbents in the same population always earn the same fitness in each of their interactions, no matter who their opponents are; the fitness vector for any tuple of matched individuals is unique, no matter who their members are.

###### Lemma 3.8.

Let be a stable configuration in . Then the equality holds for every .

###### Proof.

We will prove the lemma by proving its equivalent statement: If is a stable configuration in , then for any nonempty subset and any , the equality

$$\pi_i(b(\theta'_S, \theta_{-S})) = \pi_i(b(\theta''_S, \theta_{-S}))$$

is valid for all and all .

We begin with the case that satisfies . Suppose that there exist for some such that for some . Let be an indifferent type entering the -th population with a population share . Let be the adopted equilibrium satisfying and for any . Then, by comparing the average fitness of with that of , we have

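$$\Pi_{\tilde{\theta}^0_j}(\tilde{\mu}^{\varepsilon}; \tilde{b}) - \Pi_{\theta''_j}(\tilde{\mu}^{\varepsilon}; \tilde{b}) = \mu_{-j}(\bar{\theta}_{-j}) \left[ \pi_j(b(\theta'_j, \bar{\theta}_{-j})) - \pi_j(b(\theta''_j, \bar{\theta}_{-j})) \right] > 0$$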

for any . This means that is not stable in .

Let , and suppose as an inductive hypothesis that the equality holds for a subset whenever the number, say , of satisfies . Now let be a subset of with . Suppose that there exist and such that for some . Consider an indifferent mutant sub-profile with its population share vector , where denotes . The focal equilibrium can be chosen to satisfy: for any and any , we have if , and otherwise .

By our inductive hypothesis, it is not hard to see that for each and each . To verify this, compare fitness values received by and when matched against the same opponents. For example, if the opponents are , , and , where denotes , then the properties of imply that the fitness of is , and that the fitness of is . Of course, the two fitness values are equal by our inductive hypothesis. On the other hand, the difference between the average fitnesses of and can be obtained as

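$$\Pi_{\theta'_j}(\tilde{\mu}^{\varepsilon}; \tilde{b}) - \Pi_{\theta''_j}(\tilde{\mu}^{\varepsilon}; \tilde{b}) = \mu_{-j}(\tilde{\theta}^0_{T_{-j}}, \bar{\theta}_{-T}) \left[ \pi_j(b(\theta'_T, \bar{\theta}_{-T})) - \pi_j(b(\theta''_T, \bar{\theta}_{-T})) \right] > 0$$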

regardless of the population share vector . Thus, is not stable in , as desired. ∎

When preferences are observable, Theorem 3.6 and Lemma 3.8 imply that although these separate populations may be polymorphic, a stable configuration induces a unique fitness vector lying on the Pareto frontier of the noncooperative payoff region of the objective game. (Footnote 20: The noncooperative payoff region of an -player game refers to the -dimensional range .) The average fitness of an incumbent is equal to the fitness value that all incumbents in the same population can earn in each of their matches. Besides, the fitness vector corresponding to a stable aggregate outcome just consists of the fitness values obtained from any matching of the incumbents.

###### Theorem 3.9.

Let be a stable configuration in , and let be the aggregate outcome of . Then for each ,

$$\Pi_{\bar{\theta}_i}(\mu; b) = \pi_i(b(\theta)) = \pi_i(\varphi_{\mu,b})$$

for any and any .

###### Proof.

Since is stable, by Lemma 3.8, we let for . Then for each and any , the equality is obvious. On the other hand, for any , we have

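$$\pi_i(\varphi_{\mu,b}) = \sum_{a \in A} \varphi_{\mu,b}(a)\, \pi_i(a) = \sum_{\theta \in \operatorname{supp} \mu} \mu(\theta) \sum_{a \in A} \Big[ \prod_{s \in N} b_s(\theta)(a_s) \Big] \pi_i(a) = v^*_i$$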

since for all . ∎
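The computation in this proof can be traced numerically. The sketch below is a hedged illustration, assuming that the joint share of a type profile is the product of the per-population shares and that each equilibrium strategy is given as a dictionary from actions to probabilities. The function names, the toy game, the type labels, and the shares are all hypothetical and are not taken from the paper.

```python
from itertools import product
from math import prod


def aggregate_outcome(mu, b, actions):
    """Distribution over action profiles induced by type shares mu and equilibria b.

    mu:      list of dicts, one per population, mapping type -> population share
    b:       function mapping a type profile to a tuple of mixed strategies,
             each a dict action -> probability
    actions: list of action lists, one per population
    """
    phi = {a: 0.0 for a in product(*actions)}
    for theta in product(*(m.keys() for m in mu)):
        weight = prod(m[t] for m, t in zip(mu, theta))  # assumed product measure
        strategies = b(theta)
        for a in phi:
            phi[a] += weight * prod(s.get(a_s, 0.0) for s, a_s in zip(strategies, a))
    return phi


def expected_fitness(phi, pi, i):
    """Fitness of population i under the aggregate outcome phi."""
    return sum(p * pi[a][i] for a, p in phi.items())


# Hypothetical two-population example: two profiles share the fitness vector (2, 1).
actions = [["A", "B"], ["A", "B"]]
pi = {("A", "A"): (2, 1), ("A", "B"): (0, 0), ("B", "A"): (0, 0), ("B", "B"): (2, 1)}
mu = [{"t1": 0.5, "t2": 0.5}, {"s": 1.0}]


def b(theta):
    # Type t1 coordinates on A with its opponent, type t2 coordinates on B.
    action = "A" if theta[0] == "t1" else "B"
    return ({action: 1.0}, {action: 1.0})


phi = aggregate_outcome(mu, b, actions)
fitness = [expected_fitness(phi, pi, i) for i in range(2)]
```

In this toy configuration the population is polymorphic and the matched pairs play different action profiles, yet every match yields the same fitness vector, so the fitness under the aggregate outcome coincides with the fitness in each individual match, as the theorem states for stable configurations.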

For any symmetric two-player game considered in a single-population setting, the efficient fitness is certainly uniquely determined for all stable configurations. However, for an arbitrary objective game, stable configurations in our multi-population model may correspond to different Pareto-efficient fitness vectors; the efficient types of stable aggregate outcomes would be determined by initial distributions of preferences. As in Example 3.7, the strategy profiles and with different efficient types could serve as two stable aggregate outcomes supported, respectively, by different preference distributions. This can be confirmed after studying the sufficient condition for stability.

In the single-population model of DEY with complete information, it is shown that efficient strict Nash equilibria of a symmetric two-player game are stable; see also Possajennikov (2005). In the following theorem, we give a sufficient condition for the two-population setting: in an arbitrary two-player game, every Pareto-efficient strict Nash equilibrium can be an evolutionarily stable outcome. At first glance, it seems easy to understand. Suppose that each strategy in a strict Nash equilibrium of an objective game is supported by preferences for which the strategy is strictly dominant. Then any mutant type with a small population share will be wiped out if the mutants adopt any other strategy when matched against the incumbents. On the other hand, Pareto efficiency implies that when two mutants from separate populations are matched, the fitness of one mutant type cannot be improved without worsening the fitness of the other. All this seems quite straightforward. However, unlike in the single-population setting, it is difficult to find a uniform invasion barrier valid for all focal equilibria in the two-population setting. (Footnote 21: Note that perfect observability is a limiting case of partial observability, which will be studied in Section 5. When preferences are unobservable, each individual knows the joint distribution over the types before deciding which strategy will be adopted. This implies throughout the paper that the stability criterion can only be defined by taking a uniform invasion barrier against all focal equilibria, rather than an invasion barrier depending on a given focal equilibrium.)

###### Theorem 3.10.

Let be a two-player game and let be Pareto efficient with respect to . If is a strict Nash equilibrium of , then is stable in for some .

###### Proof.

Let be a strict Nash equilibrium of , and suppose that it is not a stable strategy profile. We shall show that is not Pareto efficient with respect to . To see this, consider a monomorphic configuration where each -th population consists of for which is the strictly dominant strategy. Then is the aggregate outcome of , and hence this configuration cannot be stable under our assumptions on . This means that there exists a mutant sub-profile for some such that for every , these mutants, with some satisfying , can adopt an equilibrium to gain evolutionary advantages over the incumbents, that is, for all , and for some .

In the case where , it is clear that mutants have no evolutionary advantage since is a Nash equilibrium. Let , and suppose that is a mutant pair having an evolutionary advantage. For a given and for each , the post-entry average fitness of the incumbent type is

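$$\Pi_{\theta^*_i}(\tilde{\mu}^{\varepsilon}; \tilde{b}) = (1 - \varepsilon_{-i})\, \pi_i(a^*_1, a^*_2) + \varepsilon_{-i}\, \pi_i(a^*_i, \tilde{b}_{-i}(\theta^*_i, \tilde{\theta}_{-i})).$$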

On the other hand, the post-entry average fitness of the mutant type is

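$$\Pi_{\tilde{\theta}_i}(\tilde{\mu}^{\varepsilon}; \tilde{b}) = (1 - \varepsilon_{-i})\, \pi_i(\tilde{b}_i(\tilde{\theta}_i, \theta^*_{-i}), a^*_{-i}) + \varepsilon_{-i}\, \pi_i(\tilde{b}(\tilde{\theta}_1, \tilde{\theta}_2)).$$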
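The two post-entry average fitnesses can be evaluated directly for a concrete game. The following sketch makes several simplifying assumptions that are not in the paper: all equilibrium play is in pure strategies, populations are indexed 0 and 1, and the coordination game and the mutant share of 0.1 are hypothetical numbers chosen for illustration.

```python
def profile(i, own, other):
    """Order a two-player action profile as (population 0's action, population 1's action)."""
    return (own, other) if i == 0 else (other, own)


def incumbent_fitness(pi, i, eps_opp, a_star, mutant_opp_vs_incumbent):
    """Average fitness of the incumbent in population i.

    eps_opp is the mutant share in the opposing population; the incumbent
    always plays a_star[i], its strictly dominant action.
    """
    vs_incumbent = pi[a_star][i]
    vs_mutant = pi[profile(i, a_star[i], mutant_opp_vs_incumbent)][i]
    return (1 - eps_opp) * vs_incumbent + eps_opp * vs_mutant


def mutant_fitness(pi, i, eps_opp, a_star, mutant_vs_incumbent, mutant_pair):
    """Average fitness of the mutant in population i.

    mutant_vs_incumbent is the mutant's action against an incumbent opponent;
    mutant_pair is the action profile played when two mutants meet.
    """
    vs_incumbent = pi[profile(i, mutant_vs_incumbent, a_star[1 - i])][i]
    vs_mutant = pi[mutant_pair][i]
    return (1 - eps_opp) * vs_incumbent + eps_opp * vs_mutant


# Hypothetical game in which (A, A) is a Pareto-efficient strict Nash equilibrium.
pi = {("A", "A"): (3, 3), ("A", "B"): (0, 0), ("B", "A"): (0, 0), ("B", "B"): (1, 1)}
a_star = ("A", "A")
eps = 0.1

# Mutants mimic the incumbents against incumbents and play (B, B) among themselves.
inc = [incumbent_fitness(pi, i, eps, a_star, "A") for i in range(2)]
mut = [mutant_fitness(pi, i, eps, a_star, "A", ("B", "B")) for i in range(2)]
```

In this example the mutants lose nothing against the incumbents, but because no profile improves on the efficient equilibrium for either population, their matches against each other lower their average fitness below that of the incumbents in both populations.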

Using the assumption that the mutant pair has an evolutionary advantage, we gradually reduce to , and then the sequence of the norms of the corresponding population share vectors converges to . We can choose a sequence from the corresponding focal equilibria such that one of the following three cases occurs. To complete the proof, we will show that is Pareto dominated in each of these cases.

Case 1: for each and each . Since has an evolutionary advantage, it follows that each Pareto dominates .

Case 2: and for fixed and for every . Without loss of generality, suppose that . Let , where and for all . Since has an evolutionary advantage and is a strict Nash equilibrium, we have

 εt21−εt2≥ζt1[π1(a∗1,a∗2)−π1(σt1,a∗2)]π1(˜bt(˜θ1,˜θ2))−π1(a∗1,a