Strong Amplifiers of Natural Selection: Proofs

02/07/2018 ∙ by Andreas Pavlogiannis, et al.

We consider the modified Moran process on graphs to study the spread of genetic and cultural mutations on structured populations. An initial mutant arises either spontaneously (aka uniform initialization), or during reproduction (aka temperature initialization) in a population of n individuals, and has a fixed fitness advantage r>1 over the residents of the population. The fixation probability is the probability that the mutant takes over the entire population. Graphs that ensure fixation probability of 1 in the limit of infinite populations are called strong amplifiers. Previously, only a few examples of strong amplifiers were known for uniform initialization, whereas no strong amplifiers were known for temperature initialization. In this work, we study necessary and sufficient conditions for strong amplification, and prove negative and positive results. We show that for temperature initialization, graphs that are unweighted and/or self-loop-free have fixation probability upper-bounded by 1-1/f(r), where f(r) is a function linear in r. Similarly, we show that for uniform initialization, bounded-degree graphs that are unweighted and/or self-loop-free have fixation probability upper-bounded by 1-1/g(r,c), where c is the degree bound and g(r,c) a function linear in r. Our main positive result complements these negative results, and is as follows: every family of undirected graphs with (i) self-loops and (ii) diameter bounded by n^(1-ϵ), for some fixed ϵ>0, can be assigned weights that make it a strong amplifier, both for uniform and temperature initialization.


1 Introduction

The Moran process. Evolutionary dynamics study the change of populations over time under the effect of natural selection and random drift [28]. The Moran process [27] is an elegant stochastic model for the rigorous study of how mutations spread in a population. Initially, a population of n individuals, called the residents, exists in a homogeneous state, and a random individual becomes a mutant. The mutants are associated with a fitness advantage r>1, whereas the residents have fitness normalized to 1. The Moran process is a discrete-time stochastic process, described as follows. In every step, a single individual is chosen for reproduction with probability proportional to its fitness. This individual produces a single offspring (a copy of itself), which replaces another individual chosen uniformly at random from the population. The main quantity of interest is the fixation probability, defined as the probability that the single invading mutant will eventually take over the population. As typically the fitness advantage is small (i.e., r=1+ϵ, for some small ϵ>0) and n is large, we study the fixation probability in the limit of large populations, i.e., n→∞. It is known that in this limit the fixation probability equals 1-1/r.
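For reference, the standard closed-form expression behind this limit (a textbook fact about the well-mixed Moran process, stated here for convenience in LaTeX) is:

    \[
      \rho \;=\; \frac{1 - 1/r}{1 - 1/r^{\,n}}
      \qquad\text{and hence}\qquad
      \lim_{n \to \infty} \rho \;=\; 1 - \frac{1}{r}
      \quad\text{for } r > 1 .
    \]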

The Moran process on graphs. The standard Moran process takes place on well-mixed populations where the reproducing individual can replace any other in the population. However, natural populations have spatial structure, where each individual has a specific set of neighbors, and mutation spread must respect this structure. Evolutionary graph theory represents spatial structure as a (generally weighted, directed) graph, where each individual occupies a vertex of the graph, and edges define interactions between neighbors [22]. The Moran process on graphs is similar to the standard Moran process, with the exception that the offspring replaces a neighbor of the reproducing individual. The well-mixed population is represented by the complete graph on n vertices. If the graph is strongly connected, the Moran process is guaranteed to reach a homogeneous state where mutants either fixate or go extinct.

Mutant initialization. The asymmetry introduced by the population structure makes the fixation probability depend on the placement of the initial mutant. In uniform initialization, the initial mutant arises spontaneously, i.e., uniformly at random on each vertex. In temperature initialization, the initial mutant arises during reproduction, i.e., on each vertex with probability proportional to the rate at which the vertex is replaced by offspring from its neighbors. Hence our interest is in the fixation probability of a weighted graph of n vertices under either uniform or temperature initialization.

Amplifiers of selection. Population structure affects the fixation probability of mutants. An infinite family of graphs is amplifying for a given initialization scheme if its fixation probability exceeds 1-1/r, the large-population fixation probability of the well-mixed population. Intuitively, the fitness advantage of mutants is being “amplified” by the structure compared to the well-mixed population. Strong amplifying families have fixation probability that tends to 1 as n→∞, and hence ensure the fixation of mutants. On the other hand, bounded amplifiers have fixation probability upper-bounded by 1-1/f(r), where f is a function linear in r, and hence provide limited amplification at best.

Existing results. The Moran process on graphs was introduced in [22], where several amplifying and strongly amplifying families were presented. Under uniform initialization, the canonical example is the family of undirected Star graphs, with fixation probability approaching 1-1/r^2, making it a quadratic uniform amplifier [22, 5, 26]. Among directed graphs, strongly amplifying families are known to exist: (i) Superstars and Metafunnels were already introduced in [22], where their strong amplifying properties were outlined, and (ii) more recently, the family of Megastars was rigorously proved to be a strong amplifying family [13]. Megastars were subsequently shown to be optimal (up to logarithmic factors) with respect to the rate at which the fixation probability converges to 1 as a function of n [15]. Among undirected graphs, the family of Stars was the best amplifying family known for a long time, and the existence of strong amplifiers was open. Recently, undirected strong amplifiers were presented independently in [15] and [14].

Under temperature initialization, the landscape is much sparser. None of the uniform amplifiers mentioned in the previous paragraph is a temperature amplifier. It turns out that on all those structures the mutants go extinct with high probability when the initial placement is according to temperature. Recently, the Looping Star family was introduced in [1] and was shown to be a quadratic amplifier under both initialization schemes. Crucially, Looping Stars contain self-loops and weights. To our knowledge, no other temperature amplifier has been known.

Our contributions. In this work, we study necessary and sufficient conditions for strong amplifiers, and prove negative and positive results.

  1. Our negative results are as follows. For temperature initialization, we show that graphs which are unweighted and/or self-loop-free have fixation probability upper-bounded by 1-1/f(r), where f(r) is a function linear in r. Hence, without both weights and self-loops, there are only bounded temperature amplifiers. Similarly, we show that for uniform initialization, bounded-degree graphs that are unweighted and/or self-loop-free have fixation probability upper-bounded by 1-1/g(r,c), where c is the degree bound and g(r,c) is a function linear in r. Hence, without both weights and self-loops, bounded-degree graph families are only bounded uniform amplifiers.

  2. Our positive result complements these negative results and is as follows. We show that every family of undirected graphs with (i) self-loops and (ii) diameter bounded by n^(1-ϵ), for some fixed ϵ>0, can be assigned weights that make the family a strong amplifier, both for uniform and temperature initialization. Moreover, the weight construction requires only polynomial time.

Our proof techniques rely on the analysis of Markov chains, the Cauchy-Schwarz inequality, concentration bounds, stochastic domination and coupling arguments. The weight construction in our positive result is straightforward; however, proving the amplification properties of the resulting structure is more involved.

1.1 Other Related Work

Strong amplifiers were already introduced in [22]; however, it was later shown that the fixation probability on Superstars is weaker than originally stated, and hence the heuristic argument for strong amplification cannot be made formal [7]. In [13], it was shown that the fixation probability on Superstars as appeared in [22] is indeed too optimistic, by proving an upper bound on the rate at which the probability can tend to 1 as a function of n. A revised analysis of Superstars appeared in [17]. The work of [30] introduced the Metastars as a family of unweighted undirected graphs with better amplification properties than Stars, for specific values of the fitness advantage r. Other aspects of the Moran process on graphs have also been studied in the literature. In [24], the authors studied undirected suppressors of selection, which are graphs that suppress the selective advantage of mutants, as opposed to amplifying it. Recently, a family of strong suppressors was presented [14]. The work of [25] studies selective amplifiers, a notion that characterizes the number of initial vertices that guarantee mutant fixation. Randomly structured populations were shown to have no effect on fixation probability in [2]. Besides the fixation probability, the absorption time of the Moran process is crucial for characterizing the rate of evolution [11] and has been studied on various graphs [9]. Finally, computational aspects of computing the fixation probability on graphs were studied in [8], where the problem was shown to admit a fully polynomial randomized approximation scheme, later improved in [6].

2 Organization

The organization of this document is as follows: before presenting our proofs, we give a detailed description of our model and results in Section 3. We then present the formal notation (Section 4), the proofs of our negative results (Section 5) and the proofs of our positive results (Section 6).

3 Model and Summary of Results

3.1 Model

The birth-death Moran process. The Moran process considers a population of n individuals, which undergoes reproduction and death, and each individual is either a resident or a mutant [27]. The residents and the mutants have constant fitness 1 and r, respectively. The Moran process is a discrete-time stochastic process defined as follows: in the initial step, a single mutant is introduced into a homogeneous resident population. At each step, an individual is chosen randomly for reproduction with probability proportional to its fitness; another individual is chosen uniformly at random for death and is replaced by a new individual of the same type as the reproducing individual. Eventually, this Markovian process ends when all individuals become of one of the two types. The probability of the event that all individuals become mutants is called the fixation probability.

The Moran process on graphs. In general, the Moran process takes place on a population structure, which is represented as a graph. The vertices of the graph represent individuals and edges represent interactions between individuals [22, 28]. Formally, let G_n = (V_n, E_n, W_n) be a weighted, directed graph, where V_n is the vertex set of size n, E_n is the Boolean edge matrix, and W_n is a stochastic weight matrix. An edge is a pair of vertices (u, v), indicated by E_n[u, v] = 1, and denotes that there is an interaction from u to v (whereas we have E_n[u, v] = 0 if there is no interaction from u to v). The stochastic weight matrix assigns weights to interactions, i.e., W_n[u, v] is positive iff E_n[u, v] = 1, and for every vertex the weights of its outgoing edges sum to 1. For a vertex v, the vertices that have an incoming (resp., outgoing) edge to (resp., from) v are its in-neighbors (resp., out-neighbors). Similarly to the standard Moran process, at each step an individual is chosen randomly for reproduction with probability proportional to its fitness. An edge originating from the reproducing vertex is selected randomly with probability equal to its weight. The terminal vertex of the chosen edge takes on the type of the vertex at the origin of the edge. In other words, the stochastic matrix W_n represents the choice probabilities of the edges. We only consider graphs which are connected, i.e., every pair of vertices is connected by a path. This is a sufficient condition to ensure that in the long run, the Moran process reaches a homogeneous state (i.e., the population consists entirely of individuals of a single type). See Figure 1 for an illustration. The well-mixed population is represented by a complete graph where all edges have equal weight.
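To make the dynamics concrete, here is a minimal sketch (illustrative only, not code from the paper) of a single birth-death step on a weighted graph; it assumes the graph is given as a row-stochastic weight matrix W (a list of lists) and the current mutant set as a Python set of vertex indices.

    import random

    def moran_step(W, mutants, r):
        # One birth-death step of the Moran process on a weighted graph.
        # W[u][v] is the weight of edge (u, v); each row of W sums to 1.
        # mutants: set of vertices currently occupied by mutants; r: mutant fitness.
        n = len(W)
        # Birth: choose a vertex with probability proportional to its fitness.
        fitness = [r if u in mutants else 1.0 for u in range(n)]
        u = random.choices(range(n), weights=fitness, k=1)[0]
        # Death: choose an out-neighbor of u with probability equal to the edge weight;
        # the offspring of u replaces the individual occupying v.
        v = random.choices(range(n), weights=W[u], k=1)[0]
        if u in mutants:
            mutants.add(v)
        else:
            mutants.discard(v)
        return mutants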

Figure 1: Illustration of one step of the Moran process on a weighted graph with self-loops. Residents are depicted as red vertices, and mutants as blue vertices. As a concrete example, we consider that the relative fitness of the mutants is r. In Figure 1(A), each individual is chosen for reproduction with probability proportional to its fitness (r for a mutant, 1 for a resident), normalized by the total fitness of the population. The mutant reproduces along an edge, and the edge is chosen randomly proportionally to the edge weight. Figure 1(B) shows that different reproduction events might lead to the same outcome.

Classification of graphs. We consider the following classification of graphs:

  1. Directed vs undirected graphs. A graph is called undirected if for all u, v we have E_n[u, v] = E_n[v, u]. In other words, there is an edge from u to v iff there is an edge from v to u, which represents a symmetric interaction. If a graph is not undirected, then it is called a directed graph.

  2. Self-loop free graphs. A graph is called a self-loop free graph iff for all u we have E_n[u, u] = 0.

  3. Weighted vs unweighted graphs. A graph is called an unweighted graph if for all u, v with E_n[u, v] = 1 the weight W_n[u, v] equals 1 divided by the number of outgoing edges of u.

    In other words, in unweighted graphs for every vertex the outgoing edges are chosen uniformly at random. Note that for unweighted graphs the weight matrix is not relevant, and the graph can be specified simply by the structure (V_n, E_n). In the sequel, we will represent unweighted graphs as G_n = (V_n, E_n).

  4. Bounded degree graphs. The degree of a graph is the maximum in-degree or out-degree over its vertices. For a family of graphs we say that the family has bounded degree if there exists a constant c such that the degree of all graphs in the family is at most c. (A small sketch of these four checks in code follows below.)
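The four structural properties above are straightforward to check programmatically; the following sketch (illustrative, reusing the hypothetical E/W matrix representation from Section 3.1) mirrors the definitions:

    def is_undirected(E):
        n = len(E)
        return all(E[u][v] == E[v][u] for u in range(n) for v in range(n))

    def is_self_loop_free(E):
        return all(E[u][u] == 0 for u in range(len(E)))

    def is_unweighted(E, W, tol=1e-9):
        # Unweighted: every vertex spreads its weight uniformly over its outgoing edges.
        n = len(E)
        for u in range(n):
            out = [v for v in range(n) if E[u][v]]
            if any(abs(W[u][v] - 1.0 / len(out)) > tol for v in out):
                return False
        return True

    def degree(E):
        # Maximum in-degree or out-degree of the graph.
        n = len(E)
        max_out = max(sum(E[u][v] for v in range(n)) for u in range(n))
        max_in = max(sum(E[u][v] for u in range(n)) for v in range(n))
        return max(max_in, max_out)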

Initialization of the mutant. The fixation probability is affected by many different factors [29]. In a well-mixed population, the fixation probability depends on the population size and the relative fitness advantage of mutants [23, 28]. For the Moran process on graphs, the fixation probability also depends on the population structure, which breaks the symmetry and homogeneity of the well-mixed population [21, 20, 10, 22, 5, 12, 31, 16]. Finally, for general population structures, the fixation probability typically depends on the initial location of the mutant [3, 4], unlike the well-mixed population where the probability of the mutant fixing is independent of where the mutant arises [23, 28]. There are two standard ways mutants may arise in a population [22, 1]. First, mutants may arise spontaneously and with equal probability at any vertex of the population structure. In this case we consider that the mutant arises at any vertex uniformly at random and we call this uniform initialization. Second, mutants may be introduced through reproduction, and thus arise at a vertex with rate proportional to the incoming edge weights of the vertex. We call this temperature initialization. In general, uniform and temperature initialization result in different fixation probabilities.

Amplifiers, quadratic amplifiers, and strong amplifiers. Depending on the initialization, a population structure can distort fitness differences [22, 28, 5], where the well-mixed population serves as a canonical point of comparison. Intuitively, amplifiers of selection exaggerate variations in fitness by increasing (respectively decreasing) the chance of fitter (respectively weaker) mutants fixing compared to their chance of fixing in the well-mixed population. In a well-mixed population of size n, the fixation probability is

(1 - 1/r) / (1 - 1/r^n).

Thus, in the limit of large population (i.e., as n → ∞) the fixation probability in a well-mixed population is 1 - 1/r. We focus on two particular classes of amplifiers that are of special interest. A family of graphs is a quadratic amplifier if in the limit of large population the fixation probability is 1 - 1/r^2. Thus, a mutant with a 10% fitness advantage over the resident has approximately the same chance of fixing in quadratic amplifiers as a mutant with a 21% fitness advantage in the well-mixed population. A family of graphs is an arbitrarily strong amplifier (hereinafter called simply a strong amplifier) if for any constant r > 1 the fixation probability approaches 1 in the limit of large population sizes, whereas when r < 1, the fixation probability approaches 0. There is a much finer classification of amplifiers presented in [1]. We focus on quadratic amplifiers, which are the most well-known among polynomial amplifiers, and strong amplifiers, which represent the strongest form of amplification.
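The 10% versus 21% comparison is just the following arithmetic, spelled out here for clarity:

    \[
      1 - \frac{1}{1.1^{2}} \;=\; 1 - \frac{1}{1.21} \;\approx\; 0.174
      \qquad\text{versus}\qquad
      1 - \frac{1}{1.21} \;\approx\; 0.174 ,
    \]

so a quadratic amplifier with r = 1.1 matches the well-mixed population with r = 1.21.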

Amplifiers tend to have fixation times longer than those of the well-mixed population. Therefore they are especially useful in situations where the rate-limiting step is the discovery and evaluation of marginally advantageous mutants. An interesting direction for future work would be to consider amplifiers together with the time-scale of evolutionary trajectories.

Existing results. We summarize the main existing results in terms of uniform and temperature initialization.

  1. Uniform initialization. First, consider the family of Star graphs, which consist of one central vertex and n-1 leaf vertices, with each leaf being connected to and from the central vertex. Star graphs are unweighted, undirected, self-loop free graphs, whose degree is linear in the population size. Under uniform initialization, the family of Star graphs is a quadratic amplifier [22, 28]. A generalization of Star graphs, called Superstars [22, 28, 17, 8], are known to be strong amplifiers under uniform initialization [13]. The Superstar family consists of unweighted, self-loop free, but directed graphs where the degree is linear in the population size. Another family of directed graphs with strong amplification properties, called Megastars, was recently introduced in [13]. The Megastars are stronger amplifiers than the Superstars: the rate at which their fixation probability approaches 1 is asymptotically optimal (ignoring logarithmic factors), and is faster than the corresponding rate for Superstars. In the limit of n → ∞, both families approach fixation probability 1.

  2. Temperature initialization. While the family of Star graphs is a quadratic amplifier under uniform initialization, it is not even an amplifier under temperature initialization [1]. It was shown in [1] that by adding self-loops and weights to the edges of the Star graph, a graph family, namely the family of Looping Stars, can be constructed, which is a quadratic amplifier simultaneously under temperature and uniform initialization. Note that in contrast to Star graphs, the Looping Star graphs are weighted and also have self-loops.

Open questions. Despite several important existing results on amplifiers of selection, several basic questions have remained open:

  1. Question 1. Does there exist a family of self-loop free graphs (weighted or unweighted) that is a quadratic amplifier under temperature initialization?

  2. Question 2. Does there exist a family of unweighted graphs (with or without self-loops) that is a quadratic amplifier under temperature initialization?

  3. Question 3. Does there exist a family of bounded degree self-loop free (weighted or unweighted) graphs that is a strong amplifier under uniform initialization?

  4. Question 4. Does there exist a family of bounded degree unweighted graphs (with or without self-loops) that is a strong amplifier under uniform initialization?

  5. Question 5. Does there exist a family of graphs that is a strong amplifier under temperature initialization? More generally, does there exist a family of graphs that is a strong amplifier both under temperature and uniform initialization?

To summarize, the open questions ask for (i) the existence of quadratic amplifiers under temperature initialization without the use of self-loops or weights (Questions 1 and 2); (ii) the existence of strong amplifiers under uniform initialization without the use of self-loops or weights, and while the degree of the graph is small (Questions 3 and 4); and (iii) the existence of strong amplifiers under temperature initialization (Question 5). While the analogues of Questions 1 and 2 have positive answers under uniform initialization, they have remained open under temperature initialization. Questions 3 and 4 are similar to Questions 1 and 2, but focus on uniform rather than temperature initialization and ask for strong rather than quadratic amplification. The restriction to graphs of bounded degree is natural: large degree means that some individuals must have a lot of interactions, whereas graphs of bounded degree represent simple structures. Question 5 was mentioned as an open problem in [1]. Note that under temperature initialization, even the existence of a cubic amplifier, that achieves fixation probability at least 1 - 1/r^3 in the limit of large population, has been open [1].

3.2 Results

In this work we present several negative as well as positive results that answer the open questions (Questions 1-5) mentioned above. We first present our negative results.

Negative results. Our main negative results are as follows:

  1. Our first result (Theorem 1) shows that for any self-loop free weighted graph, for any r > 1, under temperature initialization the fixation probability is at most 1 - 1/f(r), for a function f(r) linear in r. The implication of the above result is that it answers Question 1 in the negative.

  2. Our second result (Theorem 2) shows that for any unweighted graph (with or without self-loops), for any r > 1, under temperature initialization the fixation probability is at most 1 - 1/f(r), for a function f(r) linear in r. The implication of the above result is that it answers Question 2 in the negative.

  3. Our third result (Theorem 3) shows that for any bounded-degree self-loop free graph (possibly weighted), for any r > 1, under uniform initialization the fixation probability is at most 1 - 1/g(r, c), where c is the bound on the degree and g(r, c) is a function linear in r. The implication of the above result is that it answers Question 3 in the negative.

  4. Our fourth result (Theorem 4) shows that for any unweighted, bounded-degree graph (with or without self-loops), for any r > 1, under uniform initialization the fixation probability is at most 1 - 1/g(r, c), where c is the bound on the degree and g(r, c) is a function linear in r. The implication of the above result is that it answers Question 4 in the negative.

Significance of the negative results. We now discuss the significance of the above results.

  1. The first two negative results show that in order to obtain quadratic amplifiers under temperature initialization, self-loops and weights are inevitable, complementing the existing results of [1]. More importantly, it shows a sharp contrast between temperature and uniform initialization: while self-loop free, unweighted graphs (namely, Star graphs) are quadratic amplifiers under uniform initialization, no such graph families are quadratic amplifiers under temperature initialization.

  2. The third and fourth results show that without using self-loops and weights, bounded degree graphs cannot be made strong amplifiers even under uniform initialization. See also Remark 2.

Positive result. Our main positive result shows the following:

  1. For any constant ϵ > 0, consider any connected unweighted graph of n vertices with self-loops which has diameter at most n^(1-ϵ). The diameter of a connected graph is the maximum, among all pairs of vertices, of the length of the shortest path between that pair. We establish (Theorem 5) that there is a stochastic weight matrix such that for any r > 1 the fixation probability on the resulting weighted graph, both under uniform and temperature initialization, is at least 1 - o(1), where the o(1) term vanishes as n grows. An immediate consequence of our result is the following: for any family of connected unweighted graphs with self-loops such that the diameter of each graph is at most n^(1-ϵ), for a constant ϵ > 0, one can construct stochastic weight matrices such that the resulting family of weighted graphs is a strong amplifier simultaneously under uniform and temperature initialization. Thus we answer Question 5 in the affirmative.

Significance of the positive result. We highlight some important aspects of the results established in this work.

  1. First, note that for the fixation probability of the Moran process on graphs to be well defined, a necessary and sufficient condition is that the graph is connected. A uniformly chosen random connected unweighted graph of n vertices has diameter bounded by a constant, with high probability. Hence, within the family of connected, unweighted graphs, the family of graphs of diameter at most n^(1-ϵ), for any constant ϵ > 0, has probability measure 1. Our results establish a strong dichotomy: (a) the negative results state that without self-loops and/or without weights, no family of graphs can be a quadratic amplifier (even more so a strong amplifier) even for only temperature initialization; and (b) in contrast, for almost all families of connected graphs with self-loops, there exist weight functions such that the resulting family of weighted graphs is a strong amplifier both under temperature and uniform initialization.

  2. Second, with the use of self-loops and weights, even simple graph structures, such as Star graphs, Grids, and well-mixed structures (i.e., complete graphs) can be made strong amplifiers.

  3. Third, our positive result is constructive, rather than existential. In other words, we not only show the existence of strong amplifiers, but present a construction of them.

Our results are summarized in Table 1.

Remark 1.

Edges with zero weight. Note that edges can be effectively removed by being assigned zero weight (however, no weight assignment can create edges that do not exist). Therefore, when our construction works for some graph, it also works for a graph that contains some additional edges. In particular, our construction easily works for complete graphs. The construction can also be extended to a scenario in which we insist that each edge is assigned a positive (non-zero) weight.

                  Temperature              Uniform
              Loops     No Loops      Loops      No Loops
Weights         ✓          ✗            ✓           ✗*
No Weights      ✗          ✗            ✗*          ✗*

Table 1: Summary of our results on the existence of strong amplifiers for different initialization schemes (temperature initialization or uniform initialization) and graph families (presence or absence of loops and/or weights). The "✓" symbol marks that for the given choice of initialization scheme and graph family, almost all graphs admit a weight function that makes them strong amplifiers. The "✗" symbol marks that for the given choice of initialization scheme and graph family, no strong amplifiers exist (under any weight function). The asterisk signifies that the negative results under uniform initialization only hold for bounded-degree graphs.

4 Preliminaries: Formal Notation

4.1 The Moran Process on Weighted Structured Populations

We consider a population of n individuals on a graph G_n. Each individual of the population is either a resident or a mutant. Mutants are associated with a reproductive rate (or fitness) r, whereas the reproductive rate of residents is normalized to 1. Typically we consider the case where r > 1, i.e., mutants are advantageous, whereas when r < 1 we call the mutants disadvantageous. We now introduce the formal notation related to the process.

Configuration. A configuration of G_n is a subset S ⊆ V_n which specifies the vertices of G_n that are occupied by mutants; the remaining vertices are occupied by residents. We denote by F(S) = (n - k) + r·k the total fitness of the population in configuration S, where k = |S| is the number of mutants in S.

The Moran process. The birth-death Moran process on G_n is a discrete-time Markovian random process. We denote by X_t the random variable for the configuration at time step t, and by F(X_t) and |X_t| the total fitness and the number of mutants of the corresponding configuration, respectively. The probability distribution for the next configuration X_{t+1} at time t+1 is determined by the following two events in succession:

Birth:

One individual is chosen at random to reproduce, with probability proportional to its fitness. That is, the probability to reproduce is r/F(X_t) for a mutant, and 1/F(X_t) for a resident. Let u be the vertex occupied by the reproducing individual.

Death:

A neighboring vertex v is chosen randomly with probability W_n[u, v]. The individual occupying v dies, and the reproducing individual places a copy of itself on v. Hence, if u ∈ X_t, then X_{t+1} = X_t ∪ {v}, otherwise X_{t+1} = X_t \ {v}.

The above process is known as the birth-death Moran process, where the death event is conditioned on the birth event, and the dying individual is a neighbor of the reproducing one.

Probability measure. Given a graph G_n and the fitness r, the birth-death Moran process defines a probability measure on sequences of configurations, parameterized by the initial configuration; when the graph and fitness are clear from the context, we drop them from the notation.

Fixation event. The fixation event represents that all vertices eventually become mutants, i.e., X_t = V_n for some t. In particular, we are interested in the fixation probability in G_n for mutant fitness r when the initial mutant is placed on a given vertex v; we refer to this quantity as the fixation probability from v.

4.2 Initialization and Fixation Probabilities

We will consider three types of initialization, namely, (a) uniform initialization, where the mutant arises at vertices with uniform probability, (b) temperature initialization, where the mutant arises at vertices proportional to the temperature, and (c) convex combination of the above two.

Temperature. For a weighted graph G_n, the temperature of a vertex v, denoted T(v), is the sum of the incoming weights, i.e., T(v) = Σ_u W_n[u, v]. Note that the temperatures of all vertices sum to n, and a graph is isothermal iff all vertices have the same temperature.

Fixation probabilities. We now define the fixation probabilities under different initialization.

  1. Uniform initialization. The fixation probability under uniform initialization is the average, over all vertices v, of the fixation probability from v, i.e., the initial mutant is placed on each vertex with probability 1/n.

  2. Temperature initialization. The fixation probability under temperature initialization is the weighted average of the fixation probabilities from the individual vertices, where the initial mutant is placed on vertex v with probability T(v)/n.

  3. Convex initialization. In δ-convex initialization, where δ ∈ [0, 1], the initial mutant arises with probability δ via uniform initialization, and with probability 1 - δ via temperature initialization. The fixation probability is then the corresponding convex combination of the fixation probabilities under uniform and temperature initialization. (A sketch of the three sampling schemes in code follows below.)
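Since the three schemes differ only in how the initial vertex is sampled, they can be sketched as follows (illustrative only; it assumes the row-stochastic matrix W from Section 3, so that the column sums of W are the vertex temperatures):

    import random

    def temperatures(W):
        # Temperature of each vertex: sum of its incoming weights (column sums of W).
        n = len(W)
        return [sum(W[u][v] for u in range(n)) for v in range(n)]

    def initial_mutant(W, scheme="uniform", delta=0.5):
        # Sample the vertex of the initial mutant under uniform, temperature,
        # or delta-convex initialization.
        n = len(W)
        if scheme == "convex":
            scheme = "uniform" if random.random() < delta else "temperature"
        if scheme == "uniform":
            return random.randrange(n)
        return random.choices(range(n), weights=temperatures(W), k=1)[0]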

4.3 Strong Amplifier Graph Families

A family of graphs is an infinite sequence of weighted graphs of increasing population size n.

  • Strong amplifiers. A family of graphs is a strong uniform amplifier (resp., strong temperature amplifier, strong convex amplifier) if for every fixed r > 1 (and, in the convex case, every fixed δ ∈ (0, 1)) the fixation probability under uniform (resp., temperature, δ-convex) initialization tends to 1 as n → ∞, whereas for every fixed r < 1 it tends to 0.

Intuitively, strong amplifiers ensure (a) fixation of advantageous mutants with probability 1 and (b) extinction of disadvantageous mutants with probability 1, in the limit of large populations. In other words, strong amplifiers represent the strongest form of amplifiers possible.
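On small graphs these definitions can be explored by direct simulation; the sketch below (purely illustrative, reusing the hypothetical helpers moran_step and initial_mutant from the earlier sketches) estimates a fixation probability by running the process to absorption:

    def estimate_fixation(W, r, scheme="uniform", trials=10_000):
        # Monte-Carlo estimate of the fixation probability on the weighted graph W
        # for mutant fitness r, under the given initialization scheme.
        n = len(W)
        fixations = 0
        for _ in range(trials):
            mutants = {initial_mutant(W, scheme)}
            # Connectedness guarantees the process eventually becomes homogeneous.
            while 0 < len(mutants) < n:
                moran_step(W, mutants, r)
            fixations += (len(mutants) == n)
        return fixations / trials

    # Example: complete graph on 4 vertices without self-loops, uniform weights 1/3.
    # W = [[0.0 if u == v else 1/3 for v in range(4)] for u in range(4)]
    # print(estimate_fixation(W, r=2.0, scheme="temperature"))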

5 Negative Results

In the current section we present our negative results, which show the nonexistence of strong amplifiers in the absence of either self-loops or weights. In our proofs, we consider a weighted graph G_n = (V_n, E_n, W_n), and for notational simplicity we drop the subscripts from vertices, edges and weights, i.e., we write G = (V, E, W). We also consider that G is connected. Throughout this section we will use a technical lemma, which we present below. Given a configuration with one mutant, let p+ and p- be the probabilities that in the next configuration the mutants increase and go extinct, respectively. The following lemma bounds the fixation probability as a function of p+ and p-.

Lemma 1.

Consider a vertex v and the initial configuration {v} where the initial mutant arises at vertex v. Let p+ (resp., p-) be the probability that the number of mutants increases (resp., decreases) in a single step from {v}. Then the fixation probability from {v} is at most p+ / (p+ + p-).

Proof.

We upper-bound the fixation probability starting from {v} by the probability that a configuration with two mutants is reached. Note that to reach fixation the Moran process must first reach a configuration with at least two mutants. We now analyze the probability to reach at least two mutants. This is represented by a three-state one-dimensional random walk, where two states are absorbing: one absorbing state represents a configuration with two mutants, the other absorbing state represents the extinction of the mutants, and in each step the walk moves towards the former with probability p+ and towards the latter with probability p-. See Figure 2 for an illustration. Using the formulas for absorption probabilities in one-dimensional three-state Markov chains (see, e.g., [18] and [28, Section 6.3]), the probability that a configuration with two mutants is reached is p+ / (p+ + p-).

Hence it follows that the fixation probability from {v} is at most p+ / (p+ + p-). ∎

Figure 2: Illustration of the Markov chain of Lemma 1.
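The absorption probability invoked above is a standard gambler's-ruin computation, included here for clarity: from the middle state the walk eventually moves, and conditioned on moving it steps towards "two mutants" with probability p+ / (p+ + p-), so

    \[
      \Pr[\text{two mutants are reached before extinction}]
      \;=\; \frac{p^{+}}{p^{+} + p^{-}} ,
      \qquad\text{and hence}\qquad
      \Pr[\text{fixation from } \{v\}] \;\le\; \frac{p^{+}}{p^{+} + p^{-}} .
    \]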

5.1 Negative Result 1

We now prove our negative result 1.

Theorem 1.

For all self-loop free graphs G and for every r > 1, the fixation probability under temperature initialization is at most 1 - 1/f(r), for a function f(r) linear in r.

Proof.

Since G is self-loop free, for all v we have W[v, v] = 0, and hence every vertex sends all of its outgoing weight to other vertices. Consider the case where the initial mutant is placed on vertex v, i.e., the initial configuration is {v}. For this configuration, we have the following:

Thus . Hence by Lemma 1 we have

Summing over all , we obtain

since . Using the Cauchy-Schwarz inequality, we obtain

and thus the bound above becomes

as desired. ∎

We thus arrive at the following corollary.

Corollary 1.

There exists no self-loop free family of graphs which is a strong temperature amplifier.

5.2 Negative Result 2

We now prove our negative result 2.

Theorem 2.

For all unweighted graphs G and for every r > 1, the fixation probability under temperature initialization is at most 1 - 1/f(r), for a function f(r) linear in r.

Proof.

For every vertex , let

We establish two inequalities related to . Since is unweighted, we have

For a vertex , let if has a self-loop and otherwise. Since is connected, each vertex has at least one neighbor other than itself. Thus for every vertex with we have that . Hence

(2)

Similarly to the proof of Theorem 1, the fixation probability given that a mutant is initially placed on vertex is at most

Summing over all , we obtain

since and .

Using the Cauchy-Schwarz inequality we get

where . Note that the function is increasing in for and any . Since , the right-hand side is minimized for , that is

Thus the bound above becomes

as desired. ∎

We thus arrive at the following corollary.

Corollary 2.

There exists no unweighted family of graphs which is a strong temperature amplifier.

5.3 Negative Result 3

We now prove our negative result 3.

Theorem 3.

For all self-loop free graphs G with degree at most c, and for every r > 1, the fixation probability under uniform initialization is at most 1 - 1/g(r, c), for a function g(r, c) linear in r.

Proof.

Let and . For a vertex , denote by . Observe that since , every vertex has an outgoing edge of weight at least , and thus for all . Let . Intuitively, the set contains “hot” vertices, since each vertex is replaced frequently (with rate at least ) by at least one neighbor .

Bound on size of the set. We first obtain a bound on the size of the set. Consider a vertex and a vertex (i.e., ). For every vertex such that we can count and, to avoid multiple counting, we consider for each count a contribution of , which is at least due to the degree bound. Hence we have

where the last inequality follows from the fact that for all . Hence the probability that the initial mutant is a vertex of this set is at least , according to the uniform initialization.

Bound on probability. Consider that the initial mutant is a vertex of this set. For any configuration , we have the following:

Thus . Hence by Lemma 1 we have

Finally, we have

The desired result follows. ∎

We thus arrive at the following corollary.

Corollary 3.

There exists no self-loop free, bounded-degree family of graphs which is a strong uniform amplifier.

5.4 Negative Result 4

We now prove our negative result 4.

Theorem 4.

For all unweighted graphs G with degree at most c, and for every r > 1, the fixation probability under uniform initialization is at most 1 - 1/g(r, c), for a function g(r, c) linear in r.

Proof.

Let and consider that for some . For any configuration , we have the following:

Thus . By Lemma 1 we have

Finally, we have

The desired result follows. ∎

We thus arrive at the following corollary.

Corollary 4.

There exists no unweighted, bounded-degree family of graphs which is a strong uniform amplifier.

Remark 2.

Theorems 3 and 4 establish the nonexistence of strong amplification with bounded-degree graphs. A relevant result can be found in [24], which establishes an upper bound on the fixation probability of mutants under uniform initialization on unweighted, undirected graphs. If the bounded-degree restriction is relaxed to bounded average degree, then recent results show that strong amplifiers (called sparse incubators) exist [15].

6 Positive Result

In the previous section we showed that self-loops and weights are necessary for the existence of strong amplifiers. In this section we present our positive result, namely that every family of undirected graphs with self-loops and whose diameter is not “too large” can be made a strong amplifier by using appropriate weight functions. Our result relies on several novel conceptual steps; therefore, the proof is structured in three parts.

  1. First, we introduce some formal notation that will help with the exposition of the ideas that follow.

  2. Second, we describe an algorithm which takes as input an undirected graph G_n = (V_n, E_n) of n vertices, and constructs a weight matrix W_n to obtain the weighted graph G_n = (V_n, E_n, W_n).

  3. Lastly, we prove that the resulting family of weighted graphs is a strong amplifier both for uniform and temperature initialization.

Before presenting the details we introduce some notation to be used in this section.

6.1 Undirected Graphs and Notation

We first present some additional notation required for the exposition of the results of this section.

Undirected graphs. Our input is an unweighted undirected graph with self-loops. For ease of notation, we drop the subscript n and refer to the graph G = (V, E) instead. Since G is undirected, for all vertices u, v we have E[u, v] = E[v, u], and we consider the set of neighbors of a vertex u, i.e., the vertices v with E[u, v] = 1. Moreover, since G has self-loops, every vertex is a neighbor of itself. Also we consider that G is connected, i.e., for every pair of vertices u, v, there is a path from u to v.

Symmetric weight function. So far we have used a stochastic weight matrix W, where the outgoing weights of every vertex sum to 1. In this section, we will consider a weight function w that assigns a non-negative weight w(u, v) to every edge (u, v), and given a vertex u we denote by w(u) the total weight of the edges incident to u. Our construction will not only assign weights, but also ensure symmetry. In other words, we construct symmetric weights such that for all u, v we have w(u, v) = w(v, u). Given such a weight function w, the corresponding stochastic weight matrix is defined as W[u, v] = w(u, v) / w(u) for all pairs of vertices u, v. Given an unweighted graph G and a weight function w, we consider the corresponding weighted graph defined by this stochastic matrix.
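The passage from a symmetric weight function to the stochastic weight matrix is a plain row normalization; a minimal sketch (illustrative, assuming w is stored as a symmetric matrix with positive row sums):

    def stochastic_from_symmetric(w):
        # Normalize a symmetric weight matrix w (w[u][v] == w[v][u]) into a
        # row-stochastic matrix W with W[u][v] = w[u][v] / w(u),
        # where w(u) is the total weight of the edges of u.
        n = len(w)
        W = [[0.0] * n for _ in range(n)]
        for u in range(n):
            total = sum(w[u])
            for v in range(n):
                W[u][v] = w[u][v] / total
        return W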

Vertex-induced subgraphs. Given a set of vertices S ⊆ V, we consider the subgraph of G induced by S, with the weight function defined as follows: edges between vertices of S keep their weight, and the weights of the edges of a vertex u ∈ S to vertices that do not belong to S are added to the self-loop weight of u. Since the sum of all weights incident to a vertex does not change, the total weight w(u) of every vertex u ∈ S is the same in G and in the induced subgraph. The temperature of a vertex in the induced subgraph is defined analogously, with respect to the induced weights.
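The folding of outside weights into self-loops can be sketched as follows (illustrative only, using the same symmetric-matrix representation of w; S is the vertex set of the induced subgraph):

    def induced_weights(w, S):
        # Weight function of the subgraph induced by the vertex set S:
        # edges inside S keep their weight, and the weight of every edge from
        # u in S to a vertex outside S is added to the self-loop weight of u,
        # so the total weight of each vertex of S is preserved.
        S = sorted(S)
        idx = {v: i for i, v in enumerate(S)}
        n = len(w)
        w_S = [[0.0] * len(S) for _ in range(len(S))]
        for u in S:
            for v in range(n):
                if v in idx:
                    w_S[idx[u]][idx[v]] += w[u][v]
                else:
                    w_S[idx[u]][idx[u]] += w[u][v]
        return w_S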

6.2 Algorithm for Weight Assignment on

We start with the construction of the weight function w on G. Since we consider arbitrary input graphs, w is constructed by an algorithm. The time complexity of the algorithm is polynomial in n. Since our focus is on the properties of the resulting weighted graph, we do not explicitly analyze the time complexity.

Steps of the construction. Consider a connected graph G with diameter at most n^(1-ϵ), where ϵ > 0 is a constant independent of n. We construct a weight function w such that, with high probability, an initial mutant arising under uniform or temperature initialization eventually fixates. The weight assignment consists of the following conceptual steps.

  1. Spanning tree construction and partition. First, we construct a spanning tree of G rooted at some arbitrary vertex. In words, a spanning tree of an undirected graph is a connected subgraph that is a tree and includes all of the vertices of the graph. Then we partition the tree into a number of component trees of appropriate sizes.

  2. Hub construction. Second, we construct the hub of G, which consists of the vertices that are roots of the component trees, together with all vertices on the paths that connect each such root to the root of the spanning tree. All vertices that do not belong to the hub belong to the branches of G.

  3. Weight assignment. Finally, we assign weights to the edges of , such that the following properties hold:

    1. The hub is an isothermal graph, and evolves exponentially faster than the branches.

    2. All edges between vertices in different branches are effectively cut out (by being assigned weight 0).

In the following we describe the above steps formally.

Spanning tree construction and partition. Given the graph G, we first construct a spanning tree using the standard breadth-first-search (BFS) algorithm. Let T be such a spanning tree of G, rooted at some arbitrary vertex. We now construct the partitioning as follows: we choose an appropriate size threshold and pick a set of vertices U such that

  1. U is not too large, and

  2. the removal of U splits T into component trees, each of size comparable to the threshold.

The set U is constructed by a simple bottom-up traversal of T in which we keep track of the size of the subtree marked by the current vertex, excluding the vertices already assigned below earlier members of U. Once this size reaches the threshold, we add the current vertex to U and proceed as before. Since every time we add a vertex to U at least the threshold number of vertices is accounted for, it follows that U is not too large. Additionally, the subtree rooted in every child of a vertex of U has size below the threshold, otherwise that child would have been chosen to be included in U instead. (A sketch of this traversal in code follows below.)
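The traversal just described can be sketched as follows (an illustrative reading of the construction, not the paper's exact procedure; in particular, the size threshold is left as a parameter because its exact value is not reproduced in this text):

    def partition_spanning_tree(children, root, threshold):
        # Bottom-up traversal of a rooted spanning tree; children[v] lists the
        # children of v. Whenever the not-yet-assigned part of the subtree below
        # the current vertex reaches `threshold` vertices, the current vertex is
        # added to the separator set U and that part becomes one component tree.
        U = []

        def unassigned_size(v):
            size = 1 + sum(unassigned_size(c) for c in children[v])
            if size >= threshold and v != root:
                U.append(v)
                return 0
            return size

        unassigned_size(root)
        return U

Since at least `threshold` vertices are accounted for every time a vertex joins U, the set U has at most n/threshold elements, matching the kind of size bound used above.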

Hub construction. Given the set of vertices U constructed during the spanning tree partitioning, we construct the set of vertices called the hub, as follows:

  1. We choose an appropriate constant that controls the size of the hub.

  2. For every vertex u ∈ U, we add to the hub every vertex that lies on the unique simple path in T between the root and u (including both endpoints). Since U is small and each such path has length at most the diameter of T, the number of vertices added in this way is suitably small.

  3. We add extra vertices to the hub, such that in the end the vertices of the hub form a connected subtree of T (rooted at the root of T). This is done by repeatedly choosing a vertex already in the hub together with a neighbor of it in T that is not yet in the hub, and adding that neighbor, until the hub contains the required number of vertices.

Branches. The hub defines a number of trees, where each tree is rooted at a vertex adjacent to the hub. We will refer to these trees as branches (see Figure 3).

Proposition 1.

Note that by construction, we have for every , and , and .

Figure 3: Illustration of the hub and the branches .

Notation. To make the exposition of the ideas clear, we rely on the following notation.

  1. Parent and ancestors . Given a vertex , we denote by the parent of in and by the set of ancestors of .

  2. Children and descendants . Given a vertex that is not a leaf in , we denote by the children of in that do not belong to the hub , and by the set of descendants of in that do not belong to the hub .

Frontier, distance, and branches. We present few notions required for the weight assignment:

  1. Frontier . Given the hub , the frontier of is the set of vertices defined as

    In words, contains all vertices of that have a neighbor not in .

  2. Distance function . For every vertex , we define its distance to be the length of the shortest path in to some vertex (e.g., if , we have (i) , and (ii) for every we have ).

  3. Values and . For every vertex , we define i.e., is the number of neighbors of that belong to the hub (excluding itself). Let

Weight assignment. We are now ready to define the weight function .

  1. For every edge such that and and and are not neighbors in , we assign .

  2. For every vertex we assign .

  3. For every vertex we assign .

  4. For every vertex we assign .

  5. For every edge such that and we assign .

  6. For every remaining edge such that we assign