Fixation Maximization in the Positional Moran Process

The Moran process is a classic stochastic process that models invasion dynamics on graphs. A single "mutant" (e.g., a new opinion, strain, social trait, etc.) invades a population of residents spread over the nodes of a graph. The mutant fitness advantage δ ≥ 0 determines how aggressively mutants propagate to their neighbors. The quantity of interest is the fixation probability, i.e., the probability that the initial mutant eventually takes over the whole population. However, in realistic settings, the invading mutant has an advantage only in certain locations. For example, a bacterial mutation allowing for lactose metabolism confers an advantage only in places where dairy products are present. In this paper we introduce the positional Moran process, a natural generalization in which the mutant fitness advantage is only realized on specific nodes called active nodes. The associated optimization problem is fixation maximization: given a budget k, choose a set of k active nodes that maximizes the fixation probability of the invading mutant. We show that the problem is NP-hard, while the optimization function is not submodular, thus indicating strong computational hardness. Then we focus on two natural limits. In the limit of δ → ∞ (strong selection), although the problem remains NP-hard, the optimization function becomes submodular and thus admits a constant-factor approximation using a simple greedy algorithm. In the limit of δ → 0 (weak selection), we show that in O(m^ω) time we can obtain a tight approximation, where m is the number of edges and ω is the matrix-multiplication exponent. Finally, we present an experimental evaluation of the new algorithms together with some proposed heuristics.

1 Introduction

Several real-world phenomena are described in terms of an invasion process that occurs in a network and follows some propagation model. The problems considered in such settings aim to optimize the success of the invasion, quantified by some metric. For example, social media campaigns serving purposes of marketing, political advocacy, and social welfare are modeled in terms of influence spread, whereby the goal is to maximize the expected influence spread of a message as a function of a set of initial adopters Kempe, Kleinberg, and Tardos (2003); Tang, Xiao, and Shi (2014); Borgs et al. (2014); Tang, Shi, and Xiao (2015); Zhang et al. (2020); see Li et al. (2018) for a survey. Similarly, in rumor propagation Demers et al. (1987), the goal is to minimize the number of rounds until a rumor spreads to the whole population Fountoulakis, Panagiotou, and Sauerwald (2012).

A related propagation model is that of the Moran process, introduced as a model of genetic evolution Moran (1958), and the variants thereof, such as the discrete Voter model Clifford and Sudbury (1973); Liggett (1985); Antal, Redner, and Sood (2006); Talamali et al. (2021). A single mutant (e.g., a new opinion, strain, social trait, etc.) invades a population of residents scattered across the nodes of the graph. In contrast to influence and rumor propagation models, the Moran process accounts for active resistance to the invasion: a node carrying the mutant trait may not only forward it to its neighbors, but also lose it by receiving the resident trait from its neighbors. This model exemplifies settings in which individuals may switch opinions several times, or in which sub-species compete in a gene pool. A key parameter in this process is the mutant fitness advantage, a real number $\delta \geq 0$ that expresses the intensity by which mutants propagate to their neighbors compared to residents. Large $\delta$ favors mutants, while $\delta = 0$ renders mutants and residents indistinguishable.

Success in the Moran process is quantified in terms of the fixation probability, i.e., the probability that a randomly occurring mutation takes over the whole population. As the focal model in evolutionary graph theory Lieberman, Hauert, and Nowak (2005), the process has spawned extensive work on identifying amplifiers, i.e., graph structures that enhance the fixation probability Monk, Green, and Paulin (2014); Giakkoupis (2016); Galanis et al. (2017); Pavlogiannis et al. (2018); Tkadlec et al. (2021). Still, in the real world, the probability that a mutation spreads to its neighborhood is often affected by its position: the ability to metabolise a certain sugar is advantageous to bacteria only when that sugar is present in their surroundings; likewise, people are likely to spread a viewpoint more enthusiastically if it is supported by experience from their local environment (country, city, neighborhood). By default, the Moran process does not account for such positional effects. While optimization questions have been studied in the Moran and related models Even-Dar and Shapira (2007), the natural setting of positional advantage remains unexplored. In this setting, the question arises: to which positions (i.e., nodes) should we confer an advantage so as to maximize the fixation probability?

Our contributions. We define a positional variant of the Moran process, in which the mutant fitness advantage is only realized on a subset of nodes called active nodes, and an associated optimization problem, fixation maximization (FM): given a graph $G$, a mutant fitness advantage $\delta$ and a budget $k$, choose a set $S$ of $k$ nodes to activate that maximizes the fixation probability $\mathrm{fp}(G^S, \delta)$ of mutants. We also study the problems $\mathrm{FM}^\infty$ and $\mathrm{FM}^0$ that correspond to FM at the natural limits $\delta \to \infty$ and $\delta \to 0$, respectively. On the negative side, we show that FM and $\mathrm{FM}^\infty$ are NP-hard (Section 3.1). One common way to circumvent NP-hardness is to prove that the optimization function is submodular; however, we show that $\mathrm{fp}(G^S, \delta)$ is not submodular when $\delta$ is finite (Section 3.2). On the positive side, we first show that the fixation probability $\mathrm{fp}(G^S, \delta)$ on undirected graphs admits an FPRAS, that is, it can be approximated to arbitrary precision in polynomial time (Section 4.1). We then show that, in contrast to finite $\delta$, the function $\mathrm{fp}^\infty(G^S)$ is submodular on undirected graphs and thus admits a polynomial-time constant-factor approximation (Section 4.2). Lastly, regarding the limit $\delta \to 0$, we show that $\mathrm{FM}^0$ can be solved in polynomial time on any graph (Section 4.3). Overall, while FM is hard in general, we obtain tractability in both natural limits $\delta \to \infty$ and $\delta \to 0$.

Due to space constraints, some proofs are deferred to Appendix A.

2 Preliminaries

We extend the standard Moran process to its positional variant and define our fixation maximization problem on it.

2.1 The Positional Moran Process

Structured populations. Standard evolutionary graph theory Nowak (2006) describes a population structure by a directed weighted graph (network) $G = (V, E, w)$ of $|V| = n$ nodes, where $w$ is a weight function over the edge set $E$ that defines a probability distribution on each node $u \in V$, i.e., for each $u \in V$, $\sum_{(u,v) \in E} w(u, v) = 1$. We require that $G$ is strongly connected when $E$ is projected to the support of the probability distribution of each node. Nodes represent sites (locations) and each site is occupied by a single agent. Agents are of two types (mutants and residents), and agents of the same type are indistinguishable from each other. For ease of presentation, we may refer to a node and its occupant agent interchangeably. In special cases, we will consider simple undirected graphs, meaning that (i) $E$ is symmetric and (ii) $w(u, \cdot)$ is uniform for every node $u$.

Active node sets and fitness. Mutants and residents are differentiated in fitness, which in turn depends on a set $S \subseteq V$ of active nodes. Consider that mutants occupy the nodes in $X \subseteq V$. The fitness of the agent at node $u \in V$ is:

$$f_{X,S}(u) = \begin{cases} 1 + \delta, & \text{if } u \in X \cap S \\ 1, & \text{otherwise} \end{cases} \tag{1}$$

where $\delta \geq 0$ is the mutant fitness advantage. Intuitively, fitness expresses the capacity of a type to spread in the network. The resident fitness is normalized to $1$, while $\delta$ measures the relative competitive advantage conferred by the mutation. However, the mutant fitness advantage is not realized uniformly in the graph, but only at active nodes. The total fitness of the population is $F(X) = \sum_{u \in V} f_{X,S}(u)$.

The positional Moran process. We introduce the positional Moran process as a discrete-time stochastic process $(X_t)_{t \geq 0}$, where each $X_t \subseteq V$ is a random variable denoting the set of nodes of $G$ occupied by mutants. Initially, the whole population consists of residents; at time $t = 0$, a single mutant appears uniformly at random. That is, $X_0$ is given by $\Pr[X_0 = \{u\}] = 1/n$ for each $u \in V$. In each subsequent step, we obtain $X_{t+1}$ from $X_t$ by executing the following two stochastic events in succession.

  1. Birth event: A single agent is selected for reproduction with probability proportional to its fitness, i.e., an agent on node $u$ is chosen with probability $f_{X_t,S}(u) / F(X_t)$.

  2. Death event: A neighbor $v$ of $u$ is chosen with probability $w(u, v)$; its agent is replaced by a copy of the one at $u$.

If $u$ is a mutant and $v$ a resident, then mutants spread to $v$, hence $X_{t+1} = X_t \cup \{v\}$; likewise, if $u$ is a resident and $v$ a mutant, then $X_{t+1} = X_t \setminus \{v\}$; otherwise $u$ and $v$ are of the same type, and thus $X_{t+1} = X_t$.
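The two-event update can be made concrete with a minimal Monte Carlo sketch. It assumes the graph is given as adjacency lists with matching edge-weight lists; the names `simulate_fixation` and `estimate_fp`, the `runs` parameter, and the use of Python's `random` module are illustrative choices, not part of the paper.

```python
import random

def simulate_fixation(neighbors, weights, active, delta, rng):
    """Run one trajectory; return True on fixation, False on extinction.

    neighbors[u] -- list of out-neighbors of node u
    weights[u]   -- matching list of edge weights w(u, v), summing to 1
    active       -- set of active nodes
    delta        -- mutant fitness advantage (>= 0)
    """
    n = len(neighbors)
    mutants = {rng.randrange(n)}  # one uniformly random initial mutant
    while 0 < len(mutants) < n:
        # Birth event: node u is chosen with probability fitness(u) / total.
        fitness = [1.0 + delta if (u in mutants and u in active) else 1.0
                   for u in range(n)]
        u = rng.choices(range(n), weights=fitness)[0]
        # Death event: neighbor v of u is chosen with probability w(u, v).
        v = rng.choices(neighbors[u], weights=weights[u])[0]
        if u in mutants:
            mutants.add(v)       # mutant offspring replaces the agent at v
        else:
            mutants.discard(v)   # resident offspring replaces the agent at v
    return len(mutants) == n

def estimate_fp(neighbors, weights, active, delta, runs=2000, seed=0):
    """Fraction of simulated trajectories that end in mutant fixation."""
    rng = random.Random(seed)
    hits = sum(simulate_fixation(neighbors, weights, active, delta, rng)
               for _ in range(runs))
    return hits / runs
```

On a complete graph with every node active, the estimate should approach the classical well-mixed Moran value; with no active nodes it should approach $1/n$.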

Related Moran processes. Our positional Moran process generalizes the standard Moran process on graphs Nowak (2006), which forms the special case of $S = V$ (that is, the mutant fitness advantage holds on the whole graph). Conversely, it can be viewed as a special case of the Moran process with two environments Kaveh, McAvoy, and Nowak (2019). It is different from models in which an agent's fitness depends on which agents occupy the neighboring nodes (so-called frequency-dependent fitness) Huang and Traulsen (2010).

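Fixation probability. If there exists $t \geq 0$ such that $X_t = V$, we say that the mutants fixate in $G$, otherwise we say they go extinct. As $G$ is strongly connected, for any initial mutant set $X_0$, the process reaches a homogeneous state almost surely, i.e., $\lim_{t \to \infty} \Pr[X_t \in \{\emptyset, V\}] = 1$. The fixation probability (FP) of such a set $X_0$ is $\mathrm{fp}(G^S, \delta, X_0) = \Pr[\exists t \geq 0 : X_t = V]$. Thus, the FP of the mutants in $G$ with active node set $S$ and fitness advantage $\delta$ is:

$$\mathrm{fp}(G^S, \delta) = \frac{1}{n} \sum_{u \in V} \mathrm{fp}(G^S, \delta, \{u\}) \tag{2}$$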

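For any fixed $G$ and $S$, the function $\mathrm{fp}(G^S, \delta)$ is continuous in $\delta$ and infinitely differentiable: the underlying Markov chain is finite and its transition probabilities are rational functions in $\delta$, so $\mathrm{fp}(G^S, \delta)$ too is a rational function in $\delta$, and since the process is defined for any $\delta \in [0, \infty)$, $\mathrm{fp}(G^S, \delta)$ has no points of discontinuity for $\delta \in [0, \infty)$.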

Computing the fixation probability. In the neutral setting where $\delta = 0$, we have $\mathrm{fp}(G^S, \delta) = 1/n$ for any graph $G$ Broom et al. (2010). When $\delta > 0$, the complexity of computing the fixation probability is unknown even when $S = V$; it is open whether the function can be computed in polynomial time. For undirected $G$ and $S = V$, the problem has a fully polynomial randomized approximation scheme (FPRAS) Díaz et al. (2014); Chatterjee, Ibsen-Jensen, and Nowak (2017); Ann Goldberg, Lapinskas, and Richerby (2020), as the expected absorption time of the process is polynomial and thus admits fast Monte-Carlo simulations. We also rely on Monte-Carlo simulations for computing $\mathrm{fp}(G^S, \delta)$, and show that the problem admits an FPRAS for any $S \subseteq V$ when $G$ is undirected.

Figure 1: Steps in the positional Moran process. Blue (red) nodes are occupied by mutants (residents). Yellow rings indicate active nodes that realize the mutant fitness advantage.

2.2 Fixation Maximization via Node Activation

We study the optimization problem of fixation maximization in the positional Moran process. We focus on three variants based on different regimes of the mutant fitness advantage $\delta$.

Fixation maximization. Consider a graph $G = (V, E, w)$, a mutant fitness advantage $\delta$ and a budget $k$. The problem Fixation Maximization for the Positional Moran Process, $\mathrm{FM}(G, \delta, k)$, calls to determine an active-node set of size $k$ that maximizes the fixation probability:

$$S^* = \arg\max_{S \subseteq V,\, |S| = k} \mathrm{fp}(G^S, \delta) \tag{3}$$

As we will argue, $\mathrm{fp}(G^S, \delta)$ is monotonically increasing in $S$, hence the condition $|S| = k$ can be replaced by $|S| \leq k$. The associated decision problem is: Given a budget $k$ and a threshold $p \in [0, 1]$, determine whether there exists an active-node set $S$ of size $|S| = k$ such that $\mathrm{fp}(G^S, \delta) \geq p$. As an illustration, consider a cycle graph and two strategies: one that activates $k$ "spaced" nodes, and another that activates $k$ "contiguous" nodes. As Fig. 2 shows, depending on $k$ and $\delta$, either strategy could be better.
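For very small graphs, the optimization problem can be solved by exhaustive search over all size-$k$ subsets, which gives a baseline against which heuristics can be checked. The sketch below assumes a hypothetical `oracle` callable that returns (an estimate of) the fixation probability for a given active set; the names `brute_force_fm` and `oracle` are illustrative.

```python
from itertools import combinations

def brute_force_fm(nodes, k, oracle):
    """Return a size-k active set maximizing oracle(S) by exhaustive search.

    The search visits every k-subset of the nodes, so its cost grows
    exponentially with the graph size; it is only meant for tiny instances.
    """
    best = max(combinations(nodes, k), key=lambda S: oracle(frozenset(S)))
    return set(best)
```

Any fixation-probability estimator can be plugged in as `oracle`, for instance a Monte Carlo estimate for a fixed graph and fitness advantage.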

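The regime of strong selection. Next, we consider FM in the limit $\delta \to \infty$. The corresponding fixation probability is:

$$\mathrm{fp}^\infty(G^S) = \lim_{\delta \to \infty} \mathrm{fp}(G^S, \delta) \tag{4}$$

As we show later, $\mathrm{fp}(G^S, \delta)$ is monotonically increasing in $\delta$, and since it is also bounded by 1, the above limit exists. The corresponding optimization problem $\mathrm{FM}^\infty(G, k)$ asks for the active node set $S^\infty$ that maximizes $\mathrm{fp}^\infty(G^S)$:

$$S^\infty = \arg\max_{S \subseteq V,\, |S| = k} \mathrm{fp}^\infty(G^S) \tag{5}$$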

Figure 2: FP differences on a cycle graph under the spaced and contiguous activation strategies.

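The regime of weak selection. Further, we consider the problem in the limit $\delta \to 0$. For small positive $\delta$, different sets $S$ generally give different fixation probabilities $\mathrm{fp}(G^S, \delta)$, see Fig. 3. For fixed $S$, we can view $\mathrm{fp}(G^S, \delta)$ as a function of $\delta$. By the Taylor expansion of $\mathrm{fp}(G^S, \delta)$ around $\delta = 0$, and given that $\mathrm{fp}(G^S, 0) = 1/n$, we have

$$\mathrm{fp}(G^S, \delta) = \frac{1}{n} + \delta \cdot \mathrm{fp}'(G^S, 0) + O(\delta^2) \tag{6}$$

where $\mathrm{fp}'(G^S, 0) = \frac{d}{d\delta}\Big|_{\delta = 0} \mathrm{fp}(G^S, \delta)$. Since the higher-order terms $O(\delta^2)$ tend to $0$ faster than $\delta$ as $\delta \to 0$, on a sufficiently small neighborhood of $\delta = 0$, maximizing $\mathrm{fp}(G^S, \delta)$ reduces to maximizing the derivative $\mathrm{fp}'(G^S, 0)$ (up to higher-order terms). The corresponding optimization problem $\mathrm{FM}^0(G, k)$ asks for the set $S^0$ that maximizes $\mathrm{fp}'(G^S, 0)$:

$$S^0 = \arg\max_{S \subseteq V,\, |S| = k} \mathrm{fp}'(G^S, 0) \tag{7}$$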

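As $\mathrm{fp}(G^S, 0) = 1/n$ for all $S$, our choice of $S^0$ maximizes the gain of fixation probability, i.e., the difference $\Delta\mathrm{fp}(G^S, \delta) = \mathrm{fp}(G^S, \delta) - \mathrm{fp}(G^S, 0)$. The set $S^0$ that maximizes $\mathrm{fp}'(G^S, 0)$ guarantees a gain with relative approximation factor tending to $1$ as $\delta \to 0$, i.e., we have the following (simple) lemma.

Lemma 1.

$\lim_{\delta \to 0} \frac{\Delta\mathrm{fp}(G^{S^0}, \delta)}{\Delta\mathrm{fp}(G^{S^*}, \delta)} = 1$, where $S^*$ is an optimal active set for fitness advantage $\delta$.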

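Figure 3: A fixed graph $G$ on 7 nodes and the fixation probability $\mathrm{fp}(G^S, \delta)$ for all six non-equivalent active sets $S$ of a fixed size; $\mathrm{fp}(G^S, \delta)$ is roughly linear in $\delta$ when $\delta$ is small.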

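Monotonicity. Enlarging the set $S$ or the advantage $\delta$ does not decrease the fixation probability. The proof generalizes the monotonicity of the standard Moran process Díaz et al. (2016).

Lemma 2 (Monotonicity).

Consider a graph $G = (V, E, w)$, two subsets $S \subseteq S' \subseteq V$ and two real numbers $0 \leq \delta \leq \delta'$. Then $\mathrm{fp}(G^S, \delta) \leq \mathrm{fp}(G^{S'}, \delta')$.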

3 Negative Results

We start by presenting hardness results for FM and $\mathrm{FM}^\infty$.

3.1 Hardness of FM and $\mathrm{FM}^\infty$

We first show NP-hardness for FM and $\mathrm{FM}^\infty$. This hardness persists even given an oracle that computes the fixation probability $\mathrm{fp}(G^S, \delta)$ (or $\mathrm{fp}^\infty(G^S)$) for a graph $G$ and a set of active nodes $S$. Given a set of mutants $X$, let

$$\mathrm{fp}^\infty(G^S, X) = \lim_{\delta \to \infty} \mathrm{fp}(G^S, \delta, X) \tag{8}$$

be the fixation probability of a set of mutants $X$ under strong selection, i.e., in the limit of large fitness advantage. The following lemma states that for undirected graphs, a single active mutant with infinite fitness guarantees fixation.

Lemma 3.

Let $G$ be an undirected graph, $S$ a set of active nodes, and $X$ a set of mutants. If $X \cap S \neq \emptyset$ then $\mathrm{fp}^\infty(G^S, X) = 1$.

The intuition behind Lemma 3 is as follows: consider a node $u \in X \cap S$. Then, as $\delta \to \infty$, node $u$ is chosen for reproduction at a much higher rate than any resident node. Hence, in a typical evolutionary trajectory, the neighbors of node $u$ are occupied by mutants for most of the time and thus the individual at node $u$ is effectively guarded from any threat of becoming a resident. In contrast, the mutants are reasonably likely to spread to any other part of the graph. In combination, this allows us to argue that mutants fixate with high probability as $\delta \to \infty$. Notably, this argument relies on $G$ being undirected. As a simple counterexample, consider $G$ being a directed triangle $a \to b \to c \to a$ and $X = S = \{a\}$. Then, for the mutants to spread to $c$, a mutant on $b$ must reproduce. Since $b \notin S$, a mutant on $b$ is as likely to reproduce in each round as a resident on $c$. If the latter happens prior to the former, node $a$ becomes resident, and, with no active mutants left, we have a constant (non-zero) probability of extinction.

Due to Lemma 3, if $G$ is an undirected regular graph with a vertex cover of size $k$, under strong selection, we achieve the maximum fixation probability by activating a set $S$ of $k$ nodes that forms a vertex cover of $G$. The key insight is that, if the initial mutant lands on some node $u \notin S$, then the mutant fixates if and only if it manages to reproduce once before it is replaced, as all neighbors of $u$ are in $S$ and thus Lemma 3 applies. This is captured in the following lemma.

Lemma 4.

For any undirected regular graph $G = (V, E)$ and $S \subseteq V$, $\mathrm{fp}^\infty(G^S) \geq \frac{n + |S|}{2n}$ iff $S$ is a vertex cover of $G$ (and if so, the equality holds).

Proof.

First, note that due to Lemma 3, the fixation probability equals the probability that eventually a mutant is placed on an active node. Due to the uniform placement of the initial mutant, the probability that it lands on an active node is $|S|/n$. Let $A$ be the set of nodes in $V \setminus S$ that have at least one neighbor not in $S$. Thus we can write:

$$\mathrm{fp}^\infty(G^S) = \frac{|S|}{n} + \frac{n - |S| - |A|}{n} \cdot p + \frac{1}{n} \sum_{u \in A} \mathrm{fp}^\infty(G^S, \{u\}) \tag{9}$$

where $p$ is the probability that an initial mutant occupying a non-active node $u$ whose neighbors are all in $S$ spreads to any of its neighbors, i.e., reproduces before any of its neighbors replace it; $u$ reproduces with probability $1/n$, while its neighbors replace it with probability $d \cdot \frac{1}{n} \cdot \frac{1}{d} = 1/n$, where $d$ is the degree of $G$, hence $p = 1/2$. If $S$ is a vertex cover of $G$, then $A = \emptyset$, hence Eq. 9 yields $\mathrm{fp}^\infty(G^S) = \frac{n + |S|}{2n}$. If $S$ is not a vertex cover of $G$, then $|A| \geq 1$; for any node $u \in A$, since at least one of its neighbors is not in $S$, $\mathrm{fp}^\infty(G^S, \{u\})$ is strictly smaller than the probability that the mutant on $u$ reproduces before it gets replaced, which, as we argued, is $p = 1/2$, hence Eq. 9 yields $\mathrm{fp}^\infty(G^S) < \frac{n + |S|}{2n}$. ∎
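As a small worked instance of this argument (an illustration, not taken from the paper), consider the 4-cycle $C_4$ with nodes $v_1, v_2, v_3, v_4$ and the vertex cover $S = \{v_1, v_3\}$, so that $n = 4$ and every node has degree $d = 2$. The notation $\mathrm{fp}^\infty$ for the strong-selection fixation probability is assumed as above.

```latex
\begin{align*}
\Pr[X_0 \subseteq S] &= \frac{|S|}{n} = \frac{2}{4} = \frac{1}{2}, \\
p &= \frac{\tfrac{1}{n}}{\tfrac{1}{n} + d \cdot \tfrac{1}{n} \cdot \tfrac{1}{d}}
   = \frac{\tfrac{1}{4}}{\tfrac{1}{4} + \tfrac{1}{4}} = \frac{1}{2}, \\
\mathrm{fp}^\infty(G^S) &= \frac{|S|}{n} + \frac{n - |S|}{n} \cdot p
   = \frac{1}{2} + \frac{1}{2} \cdot \frac{1}{2} = \frac{3}{4}
   = \frac{n + |S|}{2n}.
\end{align*}
```

Here the mutant on a non-active node reproduces with probability $1/4$ per step and is replaced by one of its two active neighbors with total probability $2 \cdot \frac{1}{4} \cdot \frac{1}{2} = 1/4$ per step, which gives $p = 1/2$.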

The hardness for $\mathrm{FM}^\infty$ is a direct consequence of Lemma 4 and the fact that vertex cover is NP-hard even on regular graphs (see, e.g., Feige (2003)). Moreover, as $\mathrm{fp}(G^S, \delta)$ is a continuous function of $\delta$, we also obtain hardness for FM (i.e., under finite $\delta$). We thus have the following theorem.

Theorem 1.

FM and $\mathrm{FM}^\infty$ are NP-hard, even on undirected regular graphs.

Finally, we note that the hardness persists even given an oracle that computes the fixation probability $\mathrm{fp}(G^S, \delta)$ (or $\mathrm{fp}^\infty(G^S)$) given a graph $G$ and a set of active nodes $S$.

3.2 Non-Submodularity for FM

One standard way to circumvent NP-hardness is to show that the optimization function is submodular, which implies that a greedy algorithm offers constant-factor approximation guarantees Nemhauser, Wolsey, and Fisher (1978); Krause and Golovin (2014). Although in the next section we prove that $\mathrm{fp}^\infty(G^S)$ is indeed submodular, unfortunately the same does not hold for $\mathrm{fp}(G^S, \delta)$, even for very simple graphs.

Theorem 2.

The function $\mathrm{fp}(G^S, \delta)$ is not submodular.

Proof.

Our proof is by means of a counter-example. Consider $G = K_4$, that is, $G$ is a clique on $4$ nodes. For $i \in \{0, 1, 2\}$ denote by $p_i$ the fixation probability on $K_4$ when $i$ of the 4 nodes are active and mutants have a fixed fitness advantage $\delta$. For fixed $i$, the value $p_i$ can be computed exactly by solving a system of linear equations (one for each configuration of mutants and residents). Solving the three systems using a computer algebra system, we obtain values with $p_2 - p_1 > p_1 - p_0$, that is, the second active node yields a larger marginal gain than the first; thus submodularity is violated. ∎

For , the three systems can even be solved symbolically, in terms of . It turns out that the submodularity property is violated for , where . In contrast, for the submodularity property holds for all (the verification is straightforward, though tedious).
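The linear systems used in the proof of Theorem 2 can be set up mechanically for any small graph. The following is a minimal sketch under stated assumptions: configurations are encoded as bitmasks, the weight matrix `P` holds the edge weights $w(u, v)$ as exact fractions, the graph has at least two nodes, and the helper names `solve` and `exact_fp` are illustrative. It uses plain Gauss-Jordan elimination rather than a computer algebra system.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (exact with Fractions)."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        inv = 1 / M[col][col]
        M[col] = [x * inv for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def exact_fp(P, active, delta):
    """Exact fixation probability for a small graph (n >= 2 nodes).

    P[u][v] -- edge weight w(u, v), rows summing to 1
    active  -- set of active nodes
    delta   -- mutant fitness advantage
    """
    n = len(P)
    P = [[Fraction(x) for x in row] for row in P]
    delta = Fraction(delta)
    full = (1 << n) - 1
    states = list(range(1, full))            # transient configurations
    index = {x: i for i, x in enumerate(states)}
    A = [[Fraction(0)] * len(states) for _ in states]
    b = [Fraction(0)] * len(states)
    for x in states:
        i = index[x]
        A[i][i] += 1
        fit = [1 + delta if (x >> u) & 1 and u in active else Fraction(1)
               for u in range(n)]
        total = sum(fit)
        for u in range(n):
            for v in range(n):
                if P[u][v] == 0:
                    continue
                prob = fit[u] / total * P[u][v]   # u reproduces onto v
                y = x | (1 << v) if (x >> u) & 1 else x & ~(1 << v)
                if y == full:
                    b[i] += prob                  # one step from fixation
                elif y != 0:
                    A[i][index[y]] -= prob
    phi = solve(A, b)
    # Average over the uniformly random position of the initial mutant.
    return sum(phi[index[1 << u]] for u in range(n)) / n
```

With this helper, the three values $p_0, p_1, p_2$ for $K_4$ can be obtained by calling `exact_fp` with 0, 1 and 2 active nodes for a chosen $\delta$, and the marginal gains $p_1 - p_0$ and $p_2 - p_1$ can then be compared directly.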

4 Positive Results

We now turn our attention to positive results for fixation maximization in the limits of strong and weak selection.

4.1 Approximating the Fixation Probability

We first focus on computing the fixation probability. Although the complexity of the problem is open even in the standard Moran process, when the underlying graph is undirected, the expected number of steps until the standard (non-positional) process terminates is polynomial in $n$ Díaz et al. (2014), which yields an FPRAS. A straightforward extension of that approach applies to the expected number of steps in the positional Moran process.

Lemma 5.

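Given an undirected graph $G$ on $n$ nodes, a set $S \subseteq V$ and a real number $\delta \geq 0$, the expected number of steps until the positional Moran process on $G^S$ terminates is bounded by a polynomial in $n$ and $\delta$.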

As a consequence, by simulating the process multiple times and reporting the proportion of runs that terminated with mutant fixation, we obtain a fully polynomial randomized approximation scheme (FPRAS) for the fixation probability.

Corollary 1.

Given a connected undirected graph $G = (V, E)$, a set $S \subseteq V$ and a real number $\delta \geq 0$, the function $\mathrm{fp}(G^S, \delta)$ admits an FPRAS.

4.2 Fixation Maximization under Strong Selection

Here we show that the function $\mathrm{fp}^\infty(G^S)$ is submodular. As a consequence, we obtain a constant-factor approximation for $\mathrm{FM}^\infty$ in polynomial time.

Lemma 6 (Submodularity).

For any undirected graph $G$, the function $\mathrm{fp}^\infty(G^S)$ is submodular.

Proof.

Consider any set $S$ of active nodes. Then the positional Moran process with mutant advantage $\delta \to \infty$ is equivalent to the following process:

  1. While $X_t \cap S = \emptyset$, perform an update step as if $\delta = 0$.

  2. If $X_t \cap S \neq \emptyset$, terminate and report fixation.

Indeed, as long as no active node hosts a mutant, all individuals have the same (unit) fitness and the process is indistinguishable from the Moran process with $\delta = 0$. On the other hand, once any one active node receives a mutant, fixation happens with high probability by Lemma 3. All in all, the fixation probability in the limit $\delta \to \infty$ can be computed by simulating the neutral process ($\delta = 0$) until either $X_t = \emptyset$ (extinction) or $X_t \cap S \neq \emptyset$ (fixation).

To prove the submodularity of $\mathrm{fp}^\infty(G^S)$, it suffices to show that for any two sets $S, T \subseteq V$ we have:

$$\mathrm{fp}^\infty(G^S) + \mathrm{fp}^\infty(G^T) \geq \mathrm{fp}^\infty(G^{S \cup T}) + \mathrm{fp}^\infty(G^{S \cap T}) \tag{10}$$

Consider any fixed trajectory $\mathcal{T} = (X_0, X_1, \dots)$ of the neutral process. We say that a subset $U \subseteq V$ is good with respect to $\mathcal{T}$, or simply good, if there exists a time point $t$ such that $X_t \cap U \neq \emptyset$. It suffices to show that, for any $\mathcal{T}$, there are at least as many good sets among $S$, $T$ as there are among $S \cup T$ and $S \cap T$. To that end, we distinguish three cases based on how many of the sets $S \cup T$, $S \cap T$ are good.

  1. Both of them: Since $S \cap T$ is good, both $S$ and $T$ are good and we get $2 \geq 2$.

  2. One of them: If $S \cap T$ is good we conclude as before (we get $2 \geq 1$). Otherwise $S \cup T$ is good, hence at least one of $S$, $T$ is good and we get $1 \geq 1$.

  3. None of them: We have $0 \geq 0$. ∎
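The equivalent process from the proof suggests a simple way to estimate the strong-selection fixation probability on undirected graphs: run the neutral process and stop as soon as a mutant sits on an active node or the mutants die out. The sketch below assumes a simple undirected graph given as adjacency lists (so the neighbor is chosen uniformly); the names `strong_selection_trial` and `estimate_fp_strong` are illustrative.

```python
import random

def strong_selection_trial(neighbors, active, rng):
    """One run of the neutral process, stopped at the first active mutant.

    neighbors[u] -- list of neighbors of u in a simple undirected graph
    active       -- set of active nodes
    Returns True if some active node hosts a mutant before extinction.
    """
    n = len(neighbors)
    mutants = {rng.randrange(n)}
    while mutants and not (mutants & active):
        if len(mutants) == n:        # only reachable when no node is active
            return True
        u = rng.randrange(n)         # neutral birth: every node has fitness 1
        v = rng.choice(neighbors[u]) # uniform neighbor
        if u in mutants:
            mutants.add(v)
        else:
            mutants.discard(v)
    return bool(mutants)

def estimate_fp_strong(neighbors, active, runs=2000, seed=0):
    """Monte Carlo estimate of the strong-selection fixation probability."""
    rng = random.Random(seed)
    hits = sum(strong_selection_trial(neighbors, active, rng)
               for _ in range(runs))
    return hits / runs
```

On the 4-cycle with two opposite nodes active (a vertex cover), the estimate should be close to $3/4$, in line with the vertex-cover argument for regular graphs.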

The submodularity lemma leads to the following approximation guarantee for the greedy algorithm Nemhauser, Wolsey, and Fisher (1978); Krause and Golovin (2014).

Theorem 3.

Given an undirected graph $G$ and integer $k$, let $S^*$ be the solution to $\mathrm{FM}^\infty(G, k)$, and $S'$ the set returned by a greedy maximization algorithm. Then $\mathrm{fp}^\infty(G^{S'}) \geq \left(1 - \frac{1}{e}\right) \mathrm{fp}^\infty(G^{S^*})$.

Consequently, a greedy algorithm approximates the optimal fixation probability within a factor of $1 - 1/e$, provided it is equipped with an oracle that estimates the fixation probability $\mathrm{fp}^\infty(G^S)$ given any set $S$. For the class of undirected graphs, Lemma 5 and Theorem 3 yield a fully polynomial randomized greedy algorithm for $\mathrm{FM}^\infty$ with approximation guarantee $1 - 1/e$.
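The greedy scheme referred to here is the standard one for monotone submodular maximization. A minimal sketch follows, assuming a hypothetical `oracle` callable that returns (an estimate of) the objective value for a given active set; the function name `greedy_active_set` is illustrative.

```python
def greedy_active_set(nodes, k, oracle):
    """Add, k times, the node whose inclusion maximizes oracle(S)."""
    nodes = list(nodes)
    chosen = set()
    for _ in range(min(k, len(nodes))):
        best = max((u for u in nodes if u not in chosen),
                   key=lambda u: oracle(chosen | {u}))
        chosen.add(best)
    return chosen
```

Each round evaluates the oracle once per remaining node, so the whole run makes on the order of $k \cdot n$ oracle calls. When the oracle is a Monte Carlo estimate, its sampling error carries over to the quality of the selected set.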

4.3 Fixation Maximization under Weak Selection

Here we show that $\mathrm{FM}^0$ can be solved in polynomial time. Recall that given an active set $S$, we can write $\mathrm{fp}(G^S, \delta) = \frac{1}{n} + \delta \cdot \mathrm{fp}'(G^S, 0) + O(\delta^2)$, while $\mathrm{FM}^0$ calls to compute a set $S^0$ which maximizes the coefficient $\mathrm{fp}'(G^S, 0)$ across all sets $S$ that satisfy $|S| = k$. We build on the machinery developed in Allen et al. (2017).

In Lemma 7 below, we show that $\mathrm{fp}'(G^S, 0) = \sum_{v_i \in S} h(v_i)$ for a certain function $h \colon V \to \mathbb{R}$. The bottleneck in computing $h(\cdot)$ is solving a system of $O(n^2)$ linear equations, one for each pair of nodes. This can be done in matrix-multiplication time $O(n^{2\omega})$ for $\omega < 2.373$ Alman and Williams (2021). By virtue of the linearity of $\mathrm{fp}'(G^S, 0)$, the optimal set $S^0$ is then given by choosing the top-$k$ nodes $v_i$ in terms of $h(v_i)$. Thus we establish the following theorem.

Theorem 4.

Given a graph $G = (V, E)$ and some integer $k$, the solution to $\mathrm{FM}^0(G, k)$ can be computed in $O(n^{2\omega})$ time, where $\omega$ is the matrix multiplication exponent.

In the rest of this section, we state Lemma 7 and give intuition about the function $h(\cdot)$. For the formal proof of Lemma 7, see Appendix A.

For brevity, we denote the nodes of $G$ by $v_1, \dots, v_n$ and we denote by $p_{ij} = w(v_i, v_j)$ the probability that, if $v_i$ is selected for reproduction, then the offspring migrates to $v_j$.

Lemma 7.

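Let $G = (V, E)$ be a graph with nodes $v_1, \dots, v_n$. Consider a function $h \colon V \to \mathbb{R}$ defined by $h(v_i) = \frac{1}{n} \sum_{j} p_{ij} \cdot \pi_j \cdot \psi_{ij}$, where $\{\pi_i\}$ is the solution to the linear system given by $\sum_i \pi_i = 1$ together with

$$\pi_i = \Big(1 - \frac{1}{n} \sum_{j} p_{ji}\Big) \cdot \pi_i + \frac{1}{n} \sum_{j} p_{ij} \cdot \pi_j \quad \text{for all } i \tag{11}$$

and $\{\psi_{ij}\}$ is the solution to the linear system

$$\psi_{ij} = \begin{cases} \dfrac{1 + \sum_{l} \left(p_{li} \cdot \psi_{lj} + p_{lj} \cdot \psi_{il}\right)}{\sum_{l} \left(p_{li} + p_{lj}\right)}, & \text{if } i \neq j \\ 0, & \text{otherwise} \end{cases} \tag{12}$$

Then $\mathrm{fp}'(G^S, 0) = \sum_{v_i \in S} h(v_i)$.

The intuition behind the quantities $\pi_i$, $\psi_{ij}$, and $h(v_i)$ from Lemma 7 is as follows: Consider the Moran process on $G$ with $\delta = 0$. Then for $i = 1, \dots, n$, the value $\pi_i$ is in fact the mutant fixation probability starting from the initial configuration $X_0 = \{v_i\}$ Allen et al. (2021). Indeed, at any time step, the agent on $v_i$ will eventually fixate if (i) $v_i$ is not replaced by its neighbors and eventually fixates from the next step onward, or (ii) $v_i$ spreads to a neighbor $v_j$ and fixates from there. The first event happens with rate $\big(1 - \frac{1}{n} \sum_j p_{ji}\big) \cdot \pi_i$, while the second event happens with rate $\frac{1}{n} \sum_j p_{ij} \cdot \pi_j$. (We note that for undirected graphs, the system has an explicit solution $\pi_i = \frac{1/\deg(v_i)}{\sum_j 1/\deg(v_j)}$ Broom et al. (2010).)

The values $\psi_{ij}$ for $i \neq j$ also have intuitive meaning: They are equal to the expected total number of steps during the Moran process (with $\delta = 0$) in which $v_i$ is mutant and $v_j$ is not, when starting from a random initial configuration $\Pr[X_0 = \{v_i\}] = 1/n$. To sketch the idea behind this claim, suppose that in a single step, a node $v_l$ spreads to node $v_i$ (this happens with rate proportional to $p_{li}$). Then the event "$v_i$ is mutant while $v_j$ is not" holds if and only if $v_l$ was mutant but $v_j$ was not, hence the product $p_{li} \cdot \psi_{lj}$. Similar reasoning applies to the product $p_{lj} \cdot \psi_{il}$. The term $1$ in the numerator comes out of the probability $1/n$ that the initial mutant lands on $v_i$ (in which case indeed $v_i$ is mutant and $v_j$ is not).

Finally, the function $h(\cdot)$ also has a natural interpretation: The contribution of an active node $v_i$ to the fixation probability grows vs. $\delta$ at $\delta = 0$ to the extent that a mutant $v_i$ is likely to be chosen for reproduction, spread to a resident neighbor $v_j$, and fixate from there; that is, by the total time that $v_i$ is mutant but a neighbor $v_j$ is not ($\psi_{ij}$), calibrated by the probability that, when chosen for reproduction, $v_i$ propagates to that neighbor ($p_{ij}$), who thereafter achieves mutant fixation ($\pi_j$); this growth rate is summed over all neighbors and weighted by the rate $1/n$ at which $v_i$ reproduces when $\delta = 0$.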
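The weak-selection method can be sketched directly from the two linear systems described above: one for the neutral fixation probabilities and one for the pairwise sojourn times. The code below is a small illustration under stated assumptions, not an optimized implementation: it uses exact Gauss-Jordan elimination instead of fast matrix multiplication, takes the edge weights as a dense matrix `P`, and writes the systems in the form reconstructed here (balance equations for the fixation probabilities, with one equation replaced by the normalization). The names `solve`, `weak_selection_scores` and `weak_selector` are illustrative.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (exact with Fractions)."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        inv = 1 / M[col][col]
        M[col] = [x * inv for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def weak_selection_scores(P):
    """Return (pi, h): neutral fixation probabilities and per-node scores."""
    n = len(P)
    P = [[Fraction(x) for x in row] for row in P]
    # Neutral fixation probabilities: (sum_l p_li) * pi_i = sum_j p_ij * pi_j,
    # with the last (redundant) equation replaced by sum_i pi_i = 1.
    col_in = [sum(P[l][i] for l in range(n)) for i in range(n)]
    A = [[(col_in[i] if i == j else Fraction(0)) - P[i][j]
          for j in range(n)] for i in range(n)]
    A[-1] = [Fraction(1)] * n
    pi = solve(A, [Fraction(0)] * (n - 1) + [Fraction(1)])
    # Pairwise sojourn times psi_ij, one unknown per ordered pair (i, j).
    idx = lambda i, j: i * n + j
    M = [[Fraction(0)] * (n * n) for _ in range(n * n)]
    rhs = [Fraction(0)] * (n * n)
    for i in range(n):
        for j in range(n):
            r = idx(i, j)
            if i == j:
                M[r][r] = Fraction(1)        # psi_ii = 0
                continue
            M[r][r] += col_in[i] + col_in[j]
            rhs[r] = Fraction(1)
            for l in range(n):
                M[r][idx(l, j)] -= P[l][i]   # v_l replaces v_i
                M[r][idx(i, l)] -= P[l][j]   # v_l replaces v_j
    psi = solve(M, rhs)
    h = [sum(P[i][j] * pi[j] * psi[idx(i, j)] for j in range(n)) / n
         for i in range(n)]
    return pi, h

def weak_selector(P, k):
    """Indices of the k nodes with the largest score h."""
    _, h = weak_selection_scores(P)
    return sorted(range(len(P)), key=lambda i: h[i], reverse=True)[:k]
```

As a sanity check, on the complete graph $K_n$ all scores coincide and sum to $(n-1)/(2n)$, which is the derivative at $\delta = 0$ of the classical well-mixed Moran fixation probability.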

5 Experiments

Here we report on an experimental evaluation of the proposed algorithms and some other heuristics. Our data set consists of 110 connected subgraphs of community graphs and social networks of the Stanford Network Analysis Project Leskovec and Krevl (2014). These subgraphs were chosen randomly, and vary in size between 20 and 170 nodes. Our evaluation is not aimed to be exhaustive, but rather to outline the practical performance of various heuristics, sometimes in relation to their theoretical guarantees.

In particular, we use 7 heuristics, each taking as input a graph, a budget $k$, and (optionally) the mutant fitness advantage $\delta$. The heuristics are motivated by our theoretical results and by the related fields of influence maximization and evolutionary graph theory.

Figure 4: Heuristic performance under strong selection ($\delta \to \infty$).

Figure 5: Heuristic performance under weak selection ($\delta \to 0$).

  1. Random: Simply choose $k$ nodes uniformly at random. This heuristic serves as a baseline.

  2. High Degree: Choose the $k$ nodes with the largest degree.

  3. Centrality: Choose the $k$ nodes with the largest betweenness centrality.

  4. Temperature: The temperature of a node $u$ is defined as $T(u) = \sum_{v} w(v, u)$; this heuristic chooses the $k$ nodes with the largest temperature.

  5. Vertex Cover: Motivated by Lemma 4, this heuristic attempts to maximize the number of edges with at least one endpoint in $S$. In particular, given a set $A \subseteq V$, let

    $$c(A) = |\{(u, v) \in E : u \in A \text{ or } v \in A\}| \tag{13}$$

    The heuristic greedily maximizes $c(\cdot)$, i.e., we start with $S = \emptyset$ and perform $k$ update steps

    $$S \leftarrow S \cup \{\arg\max_{u \notin S} c(S \cup \{u\})\} \tag{14}$$

  6. Weak Selector: Choose the $k$ nodes that maximize $\mathrm{fp}'(G^S, 0)$ using the (optimal) weak-selection method.

  7. Lazy Greedy: A simple greedy algorithm starts with $S = \emptyset$, and in each step chooses the node $u$ to add to $S$ that maximizes the objective function. For FM and $\mathrm{FM}^\infty$ this process requires repeatedly evaluating $\mathrm{fp}(G^{S \cup \{u\}}, \delta)$ (or $\mathrm{fp}^\infty(G^{S \cup \{u\}})$) for every node $u$. This is done by simulating the process a large number of times (recall Lemma 5), and becomes a computational bottleneck when we require high precision. As a workaround we suggest a lazy variant of the greedy algorithm Minoux (1978), which is faster but requires submodularity of the objective function. In effect, this algorithm is a correct implementation of the greedy heuristic in the limit of strong selection (recall Lemma 6), while it may still perform well (but without guarantees) for finite $\delta$.
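The lazy variant of the greedy algorithm can be sketched with a priority queue of possibly stale marginal gains. This is a minimal illustration assuming a hypothetical `oracle` callable for the objective and integer (or otherwise comparable) node labels; the name `lazy_greedy` is illustrative, and the shortcut is only guaranteed to match plain greedy when the objective is submodular.

```python
import heapq

def lazy_greedy(nodes, k, oracle):
    """Greedy selection that re-evaluates marginal gains only when needed.

    Under submodularity, a gain computed for a smaller set is an upper
    bound on the current gain, so a freshly evaluated entry that still
    tops the heap is a true maximizer.
    """
    chosen = set()
    base = oracle(chosen)
    # Entries are (-gain, node, size of chosen set when gain was computed).
    heap = [(-(oracle({u}) - base), u, 0) for u in nodes]
    heapq.heapify(heap)
    while heap and len(chosen) < k:
        neg_gain, u, stamp = heapq.heappop(heap)
        if stamp == len(chosen):
            chosen.add(u)                 # gain is up to date: accept u
            base -= neg_gain
        else:
            gain = oracle(chosen | {u}) - base
            heapq.heappush(heap, (-gain, u, len(chosen)))
    return chosen
```

Compared to plain greedy, many oracle calls are skipped because nodes whose stale gain is already below the current best are never re-evaluated in that round.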

In all cases ties are broken arbitrarily. Note that the heuristics vary in the amount of information they have about the graph and the invasion process. In particular, Random has no information whatsoever, High Degree only considers the direct neighbors of a node, Temperature considers direct and distance-2 neighbors of a node, Centrality and Vertex Cover consider the whole graph, while Weak Selector and Lazy Greedy are the only ones informed about the Moran process.
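The purely structural heuristics are simple to state in code. The sketch below assumes a simple undirected graph stored as a dictionary from each node to its set of neighbors, with uniform edge weights, so that the temperature of a node is the sum of the inverse degrees of its neighbors; the function names are illustrative, and ties are broken by iteration order.

```python
def temperature(adj, u):
    """Sum of incoming edge weights of u when weights are uniform."""
    return sum(1.0 / len(adj[v]) for v in adj[u])

def top_k(adj, k, score):
    """The k nodes with the largest score (used for degree or temperature)."""
    return set(sorted(adj, key=score, reverse=True)[:k])

def vertex_cover_heuristic(adj, k):
    """Greedily pick k nodes, each covering the most still-uncovered edges."""
    chosen, covered = set(), set()
    for _ in range(min(k, len(adj))):
        def gain(u):
            return sum(1 for v in adj[u] if frozenset((u, v)) not in covered)
        best = max((u for u in adj if u not in chosen), key=gain)
        chosen.add(best)
        covered |= {frozenset((best, v)) for v in adj[best]}
    return chosen
```

For example, `top_k(adj, k, lambda u: len(adj[u]))` gives the High Degree choice and `top_k(adj, k, lambda u: temperature(adj, u))` the Temperature choice.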

For each graph $G$, we have chosen three values of $k$, each corresponding to a fixed fraction of its nodes, and have evaluated the above heuristics in their ability to solve $\mathrm{FM}^\infty$ (strong selection) and $\mathrm{FM}^0$ (weak selection). We have not considered other values of $\delta$, as evaluating $\mathrm{fp}(G^S, \delta)$ precisely via simulations requires many repetitions and becomes slow.

Strong selection. We start with the case of $\delta \to \infty$. Since different graphs and budgets generally result in highly variable fixation probabilities, in order to get an informative aggregate measure of each heuristic, we divide the fixation probability it obtains by the maximum fixation probability obtained across all heuristics for the same graph and budget. This normalization yields values in the interval $[0, 1]$, and makes comparison straightforward. Fig. 4 shows our results. We see that the Lazy Greedy algorithm performs best for the two smaller budgets, while its performance is matched by Vertex Cover for the largest budget. The high performance of Lazy Greedy is expected given its theoretical guarantee (Theorem 3). We also observe that, apart from Weak Selector, the other heuristics perform quite well on many graphs, though there are several cases on which they fail, appearing as outlier points in the box plots. As expected, our baseline Random heuristic performs quite poorly; this result indicates that the node activation set $S$ has a significant impact on the fixation probability. Finally, recall that Weak Selector is optimal for the regime of weak selection ($\delta \to 0$). The fact that Weak Selector underperforms for strong selection ($\delta \to \infty$) indicates an intricate relationship between fitness advantage and fixation probability.

Weak selection. We collect the results on weak selection in Fig. 5, using the normalization process above. Since the derivative is additive over the active nodes, i.e., $\mathrm{fp}'(G^S, 0) = \sum_{v_i \in S} h(v_i)$ (Section 4.3), Lazy Greedy is optimal and coincides with Weak Selector, and is thus omitted from the figure. Naturally, Weak Selector always outperforms the others, while the Random heuristic is weak. The other heuristics have mediocre performance, with a clear advantage of Centrality over the rest, which becomes clearer for larger budget values.

6 Conclusion

We introduced the positional Moran process and studied the associated fixation maximization problem. We have shown that the problem is NP-hard in general, but becomes tractable in the limits of strong and weak selection. Our results only scratch the surface of this new process, as several interesting questions are open, such as: Does the strong-selection setting admit a better approximation than the one based on submodularity? Can the problem for finite $\delta$ be approximated within some constant factor? Are there classes of graphs for which it becomes tractable?

References

  • Allen et al. (2017) Allen, B.; Lippner, G.; Chen, Y.-T.; Fotouhi, B.; Momeni, N.; Yau, S.-T.; and Nowak, M. A. 2017. Evolutionary dynamics on any population structure. Nature, 544(7649): 227–230.
  • Allen et al. (2021) Allen, B.; Sample, C.; Steinhagen, P.; Shapiro, J.; King, M.; Hedspeth, T.; and Goncalves, M. 2021. Fixation probabilities in graph-structured populations under weak selection. PLOS Computational Biology, 17(2): 1–25.
  • Alman and Williams (2021) Alman, J.; and Williams, V. V. 2021. A Refined Laser Method and Faster Matrix Multiplication. In Marx, D., ed., Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms, SODA 2021, Virtual Conference, January 10 - 13, 2021, 522–539. SIAM.
  • Ann Goldberg, Lapinskas, and Richerby (2020) Ann Goldberg, L.; Lapinskas, J.; and Richerby, D. 2020. Phase transitions of the Moran process and algorithmic consequences. Random Structures & Algorithms, 56(3): 597–647.
  • Antal, Redner, and Sood (2006) Antal, T.; Redner, S.; and Sood, V. 2006. Evolutionary Dynamics on Degree-Heterogeneous Graphs. Phys. Rev. Lett., 96: 188104.
  • Borgs et al. (2014) Borgs, C.; Brautbar, M.; Chayes, J.; and Lucier, B. 2014. Maximizing Social Influence in Nearly Optimal Time. In SODA, 946–957.
  • Broom et al. (2010) Broom, M.; Hadjichrysanthou, C.; Rychtář, J.; and Stadler, B. 2010. Two results on evolutionary processes on general non-directed graphs. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 466(2121): 2795–2798.
  • Chatterjee, Ibsen-Jensen, and Nowak (2017) Chatterjee, K.; Ibsen-Jensen, R.; and Nowak, M. A. 2017. Faster Monte-Carlo Algorithms for Fixation Probability of the Moran Process on Undirected Graphs. In Larsen, K. G.; Bodlaender, H. L.; and Raskin, J.-F., eds., 42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017), volume 83 of Leibniz International Proceedings in Informatics (LIPIcs), 61:1–61:13. Dagstuhl, Germany: Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik. ISBN 978-3-95977-046-0.
  • Clifford and Sudbury (1973) Clifford, P.; and Sudbury, A. 1973. A Model for Spatial Conflict. Biometrika, 60(3): 581–588.
  • Demers et al. (1987) Demers, A.; Greene, D.; Hauser, C.; Irish, W.; Larson, J.; Shenker, S.; Sturgis, H.; Swinehart, D.; and Terry, D. 1987. Epidemic Algorithms for Replicated Database Maintenance. In Proceedings of the Sixth Annual ACM Symposium on Principles of Distributed Computing, PODC ’87, 1–12. New York, NY, USA: Association for Computing Machinery. ISBN 089791239X.
  • Díaz et al. (2014) Díaz, J.; Goldberg, L. A.; Mertzios, G. B.; Richerby, D.; Serna, M.; and Spirakis, P. G. 2014. Approximating fixation probabilities in the generalized Moran process. Algorithmica, 69(1): 78–91.
  • Díaz et al. (2016) Díaz, J.; Goldberg, L. A.; Richerby, D.; and Serna, M. 2016. Absorption time of the Moran process. Random Structures & Algorithms, 49(1): 137–159.
  • Even-Dar and Shapira (2007) Even-Dar, E.; and Shapira, A. 2007. A Note on Maximizing the Spread of Influence in Social Networks. In Deng, X.; and Graham, F. C., eds., Internet and Network Economics, 281–286. Berlin, Heidelberg: Springer Berlin Heidelberg. ISBN 978-3-540-77105-0.
  • Feige (2003) Feige, U. 2003. Vertex cover is hardest to approximate on regular graphs. Technical Report MCS03-15, The Weizmann Institute of Science.
  • Fountoulakis, Panagiotou, and Sauerwald (2012) Fountoulakis, N.; Panagiotou, K.; and Sauerwald, T. 2012. Ultra-Fast Rumor Spreading in Social Networks. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’12, 1642–1660. USA: Society for Industrial and Applied Mathematics.
  • Galanis et al. (2017) Galanis, A.; Göbel, A.; Goldberg, L. A.; Lapinskas, J.; and Richerby, D. 2017. Amplifiers for the Moran process. Journal of the ACM (JACM), 64(1): 5.
  • Giakkoupis (2016) Giakkoupis, G. 2016. Amplifiers and Suppressors of Selection for the Moran Process on Undirected Graphs. arXiv preprint arXiv:1611.01585.
  • Huang and Traulsen (2010) Huang, W.; and Traulsen, A. 2010. Fixation probabilities of random mutants under frequency dependent selection. Journal of Theoretical Biology, 263(2): 262–268.
  • Kaveh, McAvoy, and Nowak (2019) Kaveh, K.; McAvoy, A.; and Nowak, M. A. 2019. Environmental fitness heterogeneity in the Moran process. Royal Society open science, 6(1): 181661.
  • Kempe, Kleinberg, and Tardos (2003) Kempe, D.; Kleinberg, J.; and Tardos, E. 2003. Maximizing the Spread of Influence through a Social Network. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 137–146.
  • Kötzing and Krejca (2019) Kötzing, T.; and Krejca, M. S. 2019. First-hitting times under drift. Theoretical Computer Science, 796: 51–69.
  • Krause and Golovin (2014) Krause, A.; and Golovin, D. 2014. Submodular function maximization. Tractability, 3: 71–104.
  • Leskovec and Krevl (2014) Leskovec, J.; and Krevl, A. 2014. SNAP Datasets: Stanford Large Network Dataset Collection. http://snap.stanford.edu/data.
  • Li et al. (2018) Li, Y.; Fan, J.; Wang, Y.; and Tan, K. 2018. Influence Maximization on Social Graphs: A Survey. IEEE TKDE, 30(10): 1852–1872.
  • Lieberman, Hauert, and Nowak (2005) Lieberman, E.; Hauert, C.; and Nowak, M. A. 2005. Evolutionary dynamics on graphs. Nature, 433(7023): 312–316.
  • Liggett (1985) Liggett, T. 1985. Interacting Particle Systems. Classics in mathematics. Springer New York. ISBN 9783540960690.
  • McAvoy and Allen (2021) McAvoy, A.; and Allen, B. 2021. Fixation probabilities in evolutionary dynamics under weak selection. Journal of Mathematical Biology, 82(3): 1–41.
  • Minoux (1978) Minoux, M. 1978. Accelerated greedy algorithms for maximizing submodular set functions. In Stoer, J., ed., Optimization Techniques, 234–243. Berlin, Heidelberg: Springer Berlin Heidelberg. ISBN 978-3-540-35890-9.
  • Monk, Green, and Paulin (2014) Monk, T.; Green, P.; and Paulin, M. 2014. Martingales and fixation probabilities of evolutionary graphs. Proc. R. Soc. A Math. Phys. Eng. Sci., 470(2165): 20130730.
  • Moran (1958) Moran, P. A. P. 1958. Random processes in genetics. Mathematical Proceedings of the Cambridge Philosophical Society, 54(1): 60–71.
  • Nemhauser, Wolsey, and Fisher (1978) Nemhauser, G. L.; Wolsey, L. A.; and Fisher, M. L. 1978. An Analysis of Approximations for Maximizing Submodular Set Functions–I. Math. Program., 14(1): 265–294.
  • Nowak (2006) Nowak, M. A. 2006. Evolutionary dynamics: exploring the equations of life. Cambridge, Massachusetts: Belknap Press of Harvard University Press. ISBN 0674023382 (alk. paper).
  • Pavlogiannis et al. (2018) Pavlogiannis, A.; Tkadlec, J.; Chatterjee, K.; and Nowak, M. A. 2018. Construction of arbitrarily strong amplifiers of natural selection using evolutionary graph theory. Communications Biology, 1(1): 71.
  • Talamali et al. (2021) Talamali, M. S.; Saha, A.; Marshall, J. A. R.; and Reina, A. 2021. When less is more: Robot swarms adapt better to changes with constrained communication. Science Robotics, 6(56): eabf1416.
  • Tang, Shi, and Xiao (2015) Tang, Y.; Shi, Y.; and Xiao, X. 2015. Influence Maximization in Near-Linear Time: A Martingale Approach. In SIGMOD, 1539–1554.
  • Tang, Xiao, and Shi (2014) Tang, Y.; Xiao, X.; and Shi, Y. 2014. Influence Maximization: Near-optimal Time Complexity Meets Practical Efficiency. In SIGMOD.
  • Tkadlec et al. (2021) Tkadlec, J.; Pavlogiannis, A.; Chatterjee, K.; and Nowak, M. A. 2021. Fast and strong amplifiers of natural selection. Nature Communications, 12(1): 4009.
  • Zhang et al. (2020) Zhang, K.; Zhou, J.; Tao, D.; Karras, P.; Li, Q.; and Xiong, H. 2020. Geodemographic Influence Maximization, 2764–2774. New York, NY, USA: Association for Computing Machinery. ISBN 9781450379984.

Appendix A Appendix

See 2

Proof.

Fix pairs and . The key property is that, for any two configurations and any node , we have : Indeed, the inequality is strict if and only if . As a consequence, Lemma 5 from Díaz et al. (2016) applies and yields a desired coupling between the processes with parameters and . ∎

See 3

Proof.

Due to Lemma 2, it suffices to prove the statement for when is a singleton set. Our proof is by a stochastic domination argument. In particular, we construct a simple Markov chain and a coupling between and the Moran process on from such that the probability of a random walk starting from a particular start state of has at least probability of getting absorbed in a final state .

Let , i.e., equals 2 plus the number of neighbors of . consists of the following set of states

The transition probability function is as follows, for constants that depend on but not on .

while for every state , the remaining probability mass is added as a self-loop probability .

We now sketch the coupling between and the positional Moran process on . The coupling guarantees the following correspondence between the states of and the configuration of the Moran process.

  1. For the state , we have .

  2. For the state , we have (i) , and (ii) for every , we have .

  3. For each state , with , we have (i) , (ii) for every , we have , and (iii) .

We argue that the transition probabilities preserve the above correspondence.

  1. While , the probability for a resident neighbor of to reproduce and place an offspring on is bounded by . On the other hand, the probability that reproduces in successive rounds until it places a mutant offspring to each of its neighbors can be exponentially small in but independent of , as, since , the fitness of is at least a fraction of the total population fitness regardless of the remaining nodes. It follows that, from a configuration with , the probability that turns resident before every neighbor of turns mutant is upper-bounded by , for some constant that depends on but not on .

  2. Given any configuration with , the probability that a resident reproduces is at most . Similarly, the probability that a mutant reproduces and replaces a resident is at least . Thus, the probability that the latter event happens before the former event is lower-bounded by some constant that depends on but not on .

Thus, we have a coupling in which the probability that a random walk in starting at gets absorbed in is at least as large as . It is straightforward to verify that the probability of the former event tends to as . Thus , as desired. ∎

See 1

Proof.

Following Lemma 4, the hardness for follows directly by reducing the question of whether a graph has a vertex cover of of size to the problem of whether can achieve fixation probability of at least . For the hardness of FM, recall that for any , the function is continuous on . A crude but slightly more detailed analysis in the proof of Lemma 4 shows that if we have an edge with , then the fixation probability from is

In turn, if is not a vertex cover of then

Due to the continuity of , there exists a large enough such that if has a vertex cover of size then . On the other hand, due to the monotonicity (Lemma 2), if has no vertex cover of size , then we have

Thus for such a fitness advantage , we can achieve fixation probability iff has a vertex cover of size . ∎
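The reduction above relates vertex covers to the fixation probability under a large fitness advantage. As an informal numerical companion, the following sketch simulates the positional Moran process on a small graph and provides a vertex-cover check. It assumes the standard birth-death update (a node reproduces with probability proportional to its fitness and its offspring replaces a uniformly random neighbor), mutant fitness 1+δ on active nodes and 1 elsewhere, and a uniformly random initial mutant. All function names are hypothetical, and the estimate is a Monte Carlo approximation, not an exact value.

```python
import random

def is_vertex_cover(adj, nodes):
    """Return True if every edge has at least one endpoint in `nodes`."""
    return all(u in nodes or v in nodes for u in range(len(adj)) for v in adj[u])

def simulate_fixation(adj, active, delta, start, rng, max_steps=10**6):
    """Run one trajectory; True if mutants fixate, False if they go extinct."""
    n = len(adj)
    mutants = {start}
    for _ in range(max_steps):
        if len(mutants) == n:
            return True
        if not mutants:
            return False
        # Assumed fitness: 1 + delta for a mutant on an active node, else 1.
        weights = [1.0 + delta if (v in mutants and v in active) else 1.0
                   for v in range(n)]
        parent = rng.choices(range(n), weights=weights)[0]
        child = rng.choice(adj[parent])
        if parent in mutants:
            mutants.add(child)      # mutant offspring replaces the neighbor
        else:
            mutants.discard(child)  # resident offspring replaces the neighbor
    raise RuntimeError("no absorption within max_steps")

def estimate_fixation(adj, active, delta, trials, rng):
    """Monte Carlo estimate of the fixation probability from a uniform start."""
    n = len(adj)
    hits = sum(simulate_fixation(adj, active, delta, rng.randrange(n), rng)
               for _ in range(trials))
    return hits / trials

# Toy path graph 0 - 1 - 2; the middle node alone covers both edges.
path = [[1], [0, 2], [1]]
rng = random.Random(0)
estimate = estimate_fixation(path, {1}, 5.0, 200, rng)
```

On toy instances, one can compare the estimates for active sets that are vertex covers against those that are not as δ grows; such experiments only illustrate the reduction and carry no formal weight.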

See 5

Proof.

We follow the proof strategy of Theorem 11 from Díaz et al. (2014). Given a set of nodes occupied by mutants, consider a potential function defined by , where is the number of edges incident to . Note that , since and for each we have . Moreover, given a configuration at time-point , let be a random variable that measures the increase of the potential function in a single step. We make two claims:

  1. : Let be a set of edges such that is occupied by a mutant and by a resident, and let be the total fitness of the population. In a single step, the set of nodes occupied by mutants changes either by a mutant replacing a resident neighbor, or by a resident replacing a mutant neighbor. Thus

    where the last inequality holds since the fitness of any mutant is at least as large as that of any neighboring resident.

  2. If then : If the population is not homogeneous then there exists at least one edge with and . With probability the agent at is selected for reproduction and with probability its offspring migrates to , which changes the potential function by .
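The first claim can be checked exactly on small graphs. The sketch below assumes the potential used in Theorem 11 of Díaz et al. (2014), namely the sum of 1/deg(v) over the nodes v occupied by mutants; since the definition is not legible in the text above, this choice is an assumption. It also assumes mutant fitness 1+δ on active nodes and 1 elsewhere. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def expected_potential_change(adj, mutants, active, delta):
    """Exact expected one-step change of the assumed potential sum of 1/deg(v)."""
    n = len(adj)
    # Assumed fitness: 1 + delta for a mutant on an active node, else 1.
    fit = [Fraction(1) + delta if (v in mutants and v in active) else Fraction(1)
           for v in range(n)]
    total = sum(fit)
    drift = Fraction(0)
    for u in range(n):
        for v in adj[u]:
            if (u in mutants) == (v in mutants):
                continue  # same type on both endpoints: no change
            # Probability that u reproduces and its offspring lands on v.
            p = fit[u] / total / len(adj[u])
            change = Fraction(1, len(adj[v]))
            drift += p * change if u in mutants else -p * change
    return drift

# Star graph with center 0 and leaves 1, 2, 3; only the center is active.
star = [[1, 2, 3], [0], [0], [0]]
example = expected_potential_change(star, {0}, {0}, Fraction(2))

# Exhaustive check over all non-homogeneous configurations of the star.
all_nonnegative = all(
    expected_potential_change(star, set(c), {0}, Fraction(2)) >= 0
    for r in range(1, 4) for c in combinations(range(4), r)
)
```

Under these assumptions, each mutant-resident edge (u, v) contributes (f(u) − 1)/(F · deg(u) · deg(v)) to the expected change, where f(u) is the fitness of the mutant at u and F is the total fitness. This is non-negative and vanishes in the neutral case δ = 0.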

By Item 1, the potential function gives rise to a sub-martingale. Moreover, the function is bounded by and by Item 2, we have . Thus, the standard martingale machinery of drift analysis applies (Kötzing and Krejca, 2019). Namely, the rescaled function satisfies the conditions of the upper additive drift theorem (with initial value at most and step-wise drift at least ). The expected time until termination is thus at most

See 7

Proof.

Given a set , we write to indicate that (and otherwise). Given any configuration , we write to indicate that (and otherwise). We also define a function and we let

(15)

be its expected change in a single step of the Moran process with fitness advantage at nodes . Let . Then we can write

(16)

Finally, the derivative of at is

(17)

where in the last equality we used that and that the second sum vanishes, since all the terms that involve any fixed sum up to (due to Eq. 11).

Now consider the neutral positional Moran process () starting with for . For any and , consider an event defined as “at time we have and ” and let be the total expected time spent with being a mutant and not. Then by (McAvoy and Allen, 2021, Theorem 1) we have