Designing communication and computer networks is a complex process in which careful trade-offs have to be made with respect to performance, resiliency/security and cost investments. For instance, if a host in a computer network wants to route traffic to multiple other hosts, it could connect directly to those hosts. Doing so increases its expenses for installing and maintaining these connections and, at the same time, makes it more susceptible to viruses from those hosts. In return, it would obtain better and faster performance with minimal delays, compared to using intermediate hosts as relays. Although in this example both the installation costs and the risk of viruses increase, they are linearly independent and they do not necessarily optimize together. Indeed, reducing the number of direct connections would reduce the cost and the host would be less vulnerable to viruses. However, even when connected directly to only a few high-degree nodes, the host would still be seriously exposed to a virus.
In practice, hosts are often autonomous, act independently and do not coordinate, as in P2P networks, peering relations between Autonomous Systems, overlay networks, wireless [5, 6, 7] and mobile networks, resource sharing in VoIP networks, social networks [10, 11] or the Internet. Their aim is to optimize their own utility functions, which are not necessarily in accordance with the global optimum. To study global network formation under autonomous actors, the network formation game (NFG) framework has been proposed. However, resilience and notably virus protection have not been taken into account in that NFG context, even though their importance is undisputed. In this paper, we therefore take the NFG framework one step further by including performance and virus protection as key ingredients. Virus propagation will be modeled by the Susceptible-Infected-Susceptible (SIS) model and we will evaluate the effect of uncoordinated autonomous hosts versus the optimal network topology via standard game-theoretic concepts, such as Nash Equilibria and the Prices of Anarchy and Stability.
Our network formation game is called the Virus Spread-Performance-Cost (VSPC) game. Each node (i.e., autonomous player) attempts to minimize both the cost and the infection probability, while still being able to route traffic to all the other nodes in a small number of hops. When the hopcount performance metric is irrelevant, the game is driven by the cost and virus objectives, a scenario we studied in earlier work. That particular scenario resulted in sparse graphs, which may not always represent real-world networks, but it helped to better understand the process of virus spread. In this paper, we generalize those results by also including the hopcount performance metric. The probability of a node being infected and the hopcounts to the other nodes change in opposite directions: adding a link increases the former, but reduces the latter. Therefore, there is a trade-off in the number of added links and in how these new links are best added. Moreover, the two metrics are linearly independent and closed-form expressions do not exist, which makes the problem complex. Finally, the inclusion of the hopcount allows us to better capture realistic networks. In particular, our main contributions are the following:
We provide a complete characterization of the various relevant parameter settings and their impact on the formation of the topologies.
We show that depending on the input, the Nash Equilibria may vary from tree graphs, via graphs of different diameters, to complete graphs.
We demonstrate, both via theory and simulations, that the Price of Anarchy (PoA) is small in most of the cases, which implies that (near-)optimal topologies can be formed in a decentralized non-cooperative manner. We will also identify for which scenarios the PoA may be high. In those cases a central point of control would be desirable to limit/steer the players’ decisions.
This paper is organized as follows: The SIS-virus spread model and the network-formation game model are introduced in Section II. The Virus Spread-Performance-Cost (VSPC) game formation is analyzed in Section III. Related work on game formation and protection against viruses is discussed in Section IV. The conclusion and directions for future work are provided in Section V.
II Models and problem statements
II-A Virus-spread model
The spread of viruses in communication and computer networks can be described using virus-spread epidemic models [14, 15, 16]. We consider the Susceptible-Infected-Susceptible (SIS) NIMFA model [14, 17],
$$\frac{dv_i(t)}{dt} = \beta \left(1 - v_i(t)\right) \sum_{j=1}^{N} a_{ij} v_j(t) - \delta v_i(t), \qquad (1)$$
where $N$ is the number of network nodes and $v_i(t)$ is the probability of node $i$ being infected at time $t$, for all $i \in \{1, \ldots, N\}$. If a link is present between nodes $i$ and $j$, then $a_{ij} = 1$, otherwise $a_{ij} = 0$. In (1), a host with a virus can infect its direct healthy neighbors with rate $\beta$, while an infected host can be cured at rate $\delta$, after which the node becomes healthy, but susceptible again to the virus. The probability $v_i(t)$ depends on the probabilities $v_j(t)$ of the neighbors $j$ of node $i$ and there is no trivial closed-form expression for $v_i(t)$. The model incorporates the network topology and is thus more realistic than the related population dynamic models. The accuracy of the model has been evaluated in prior work. The probability of a node being infected in the metastable regime, denoted by $v_{i\infty}$, where $\frac{dv_i(t)}{dt} = 0$ and $v_{i\infty} = \lim_{t \to \infty} v_i(t)$, follows from (1) as
$$v_{i\infty} = 1 - \frac{1}{1 + \tau \sum_{j=1}^{N} a_{ij} v_{j\infty}}, \qquad (2)$$
where $\tau = \beta / \delta$ is called the effective infection rate. The epidemic threshold $\tau_c$ is defined as a value of $\tau$ such that $v_{i\infty} > 0$ if $\tau > \tau_c$, and $v_{i\infty} = 0$ otherwise, for all $i$. The value of $v_{i\infty}$ depends on the values of $v_{j\infty}$ for all the neighbors $j$ of $i$, so the network topology and the interconnectivity affect all the $v_{i\infty}$.
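The metastable infection probabilities have no closed form in general, but they can be approximated numerically. The following is a minimal sketch, assuming the standard NIMFA steady-state relation v_i = 1 - 1/(1 + tau * sum_j a_ij v_j) and a plain fixed-point iteration started from the all-infected state; the function names, the tolerance and the iteration cap are illustrative choices, not part of the paper.

```python
def nimfa_metastable(adj, tau, tol=1e-12, max_iter=100_000):
    """Approximate NIMFA metastable infection probabilities.

    adj is a symmetric 0/1 adjacency matrix (list of lists) and tau is the
    effective infection rate. Assumption: the steady state satisfies
    v_i = 1 - 1 / (1 + tau * sum_j a_ij * v_j).
    """
    n = len(adj)
    # Start from "all infected": the map is monotone, so the iterates
    # decrease towards the largest fixed point (zero below the threshold).
    v = [1.0] * n
    for _ in range(max_iter):
        new_v = []
        for i in range(n):
            s = sum(adj[i][j] * v[j] for j in range(n))
            new_v.append(1.0 - 1.0 / (1.0 + tau * s))
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


def complete_graph(n):
    """Adjacency matrix of the complete graph on n nodes."""
    return [[0 if i == j else 1 for j in range(n)] for i in range(n)]
```

For a regular graph with degree d, this relation gives the same value 1 - 1/(tau * d) for every node when tau > 1/d, which offers a quick sanity check of the iteration on the complete graph.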
II-B Game-formation model
In our network formation game, each player $i$ (a node in the network) aims to minimize its own cost function $J_i$, and the social cost is defined as $J = \sum_{i=1}^{N} J_i$. Specifically, the optimal social cost is the smallest social cost over all possible connected topologies. We look for the existence, uniqueness, and characterization of (pure) Nash Equilibria. (A Nash Equilibrium is a state of the players' network strategies in which no player can reduce its cost by unilaterally changing its strategy.) The Price of Anarchy (PoA) and the Price of Stability (PoS) are defined as the ratio of the social cost in the worst-case Nash Equilibrium (the one with the highest social cost) and the optimal social cost, and the ratio of the social cost in the best-case Nash Equilibrium (the one with the lowest social cost) and the optimal social cost, respectively:
$$\mathrm{PoA} = \frac{\max_{G \text{ is a Nash Equilibrium}} J(G)}{\min_{G} J(G)}, \qquad \mathrm{PoS} = \frac{\min_{G \text{ is a Nash Equilibrium}} J(G)}{\min_{G} J(G)}. \qquad (3)$$
PoA is an efficiency measure, illustrating how bad selfish playing is in comparison to the global optimum. PoS, on the other hand, reflects the best possible performance without coordination in comparison to the global optimum. The network to be designed is initially empty and every node in the network is a player. We assume the cost of building one (communication) link between two nodes is fixed. Every player $i$ can install a link from itself to another node $j$. Installing a link between $i$ and $j$ means that both $i$ and $j$ can utilize it, but only one pays for the cost, as is often assumed in NFG models [12, 19, 4]. Several examples fit this scenario: (i) a friend request is initiated by one node in a social network, but both read the posts from one another; (ii) a new road connecting two cities is built by one city in a road network, but both utilize it; and (iii) in a hand-shake protocol in a computer network one node initiates a connection used by two nodes.
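As a small illustration of the two ratios defined above, the sketch below computes PoA and PoS from a list of social costs of known Nash Equilibria and a known optimal social cost. The function names and the sample numbers are hypothetical and only show the arithmetic.

```python
def price_of_anarchy(equilibrium_costs, optimal_cost):
    """Worst-case equilibrium social cost divided by the optimal social cost."""
    return max(equilibrium_costs) / optimal_cost


def price_of_stability(equilibrium_costs, optimal_cost):
    """Best-case equilibrium social cost divided by the optimal social cost."""
    return min(equilibrium_costs) / optimal_cost


# Hypothetical example: three equilibria, one of which is socially optimal.
costs = [10.0, 12.0, 15.0]
poa = price_of_anarchy(costs, optimal_cost=10.0)
pos = price_of_stability(costs, optimal_cost=10.0)
```

Both ratios are at least one by construction, and PoS equals one exactly when some Nash Equilibrium attains the optimal social cost.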
We consider a Virus Spread-Performance-Cost (VSPC) network formation game, where player aims to reduce its cost and the probability of being infected, but concurrently also wants to improve its performance by shortening the hopcounts of the shortest paths to all the other nodes . The cost function of player that unites these objectives is given by:
Function involves the cost of installing all the links from node , weighted by a coefficient . The hopcounts are weighted by . Opposing goals meet in this game: the more links are installed, the shorter the paths, but the higher the probability of being infected and the higher the cost.
The social cost for the whole network is a weighted sum over all nodes
where denotes the number of links.
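To make the three ingredients of the cost concrete, the following sketch evaluates a player's cost on a given topology. It assumes an additive form, namely a link weight times the number of links the player installs, plus a hopcount weight times the sum of its hopcounts, plus its metastable infection probability. The parameter names `alpha` and `beta`, the `installed` data structure and the convention of passing infection probabilities in as a list are all hypothetical choices made for this illustration.

```python
from collections import deque


def hopcounts_from(neighbors, source):
    """Breadth-first search returning hopcounts from source to reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in neighbors[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def player_cost(i, installed, infection_prob, alpha, beta):
    """Assumed cost: alpha * (links paid by i) + beta * (sum of hopcounts) + v_i.

    installed[u] is the set of nodes to which player u pays for a link;
    every installed link can be used in both directions.
    """
    n = len(installed)
    neighbors = {u: set() for u in range(n)}
    for u in range(n):
        for w in installed[u]:
            neighbors[u].add(w)
            neighbors[w].add(u)
    dist = hopcounts_from(neighbors, i)
    if len(dist) < n:
        return float("inf")  # unreachable nodes count as infinite hopcount
    return alpha * len(installed[i]) + beta * sum(dist.values()) + infection_prob[i]


def social_cost(installed, infection_prob, alpha, beta):
    """Sum of the players' costs under the same assumed form."""
    return sum(player_cost(i, installed, infection_prob, alpha, beta)
               for i in range(len(installed)))
```

For example, for a star on four nodes in which the root installs all three links, `installed = [{1, 2, 3}, set(), set(), set()]`; the root then pays for three links and has a hopcount sum of 3, while each leaf pays for none and has a hopcount sum of 5.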
III Virus spread-performance-cost (VSPC) game
III-A Optimal social cost, Nash Equilibria and the PoA for a hopcount weight approaching zero
In order to understand the effect of the virus protection, we start by setting the hopcount weight to an infinitely small number (approaching zero). (The case in which this weight is exactly zero is either trivial or debatable: by neglecting the hopcounts, the optimal topology would be the (non-realistic) empty graph with no links (cost) and no epidemic to be propagated. Moreover, infinite hopcounts would be multiplied by zero, which is undefined.) As a result, the hopcounts no longer have any influence, while network connectivity is still guaranteed (the hopcount between two disconnected nodes is assumed to be infinite). Lemma 1 limits the possible Nash Equilibria.
Lemma 1. The probability of a node being infected in the metastable state in a network $G$ does not exceed the probability of the same node being infected in the metastable state in the network obtained by adding a link to $G$.
The newly added link is between nodes and . We make use of the canonical infinite form ,
After the addition of link , the expression (6) for has all the terms the same as in , except the following differences: ; and the presence of the adjacency entry in the canonical representation. The last statement implies that its contribution is a part that is the same as in until it “reaches” nodes or , where the expression (at a certain depth of the canonical form) is:
where and are the degrees of and in , while is the remaining part in the canonical form. In (7), the term is positive and increases with and . increases with and as it is also an infinite canonical form with any of these two variables being in the numerator or in the denominator with a negative sign in front, in the same way as explained in the lines above - repeating infinitely many times. Therefore, the whole term in (7) increases, which implies that also increases for each node . ∎
We start by looking into the possible Nash Equilibria.
Theorem 1. If a Nash Equilibrium is reached, then the constructed graph is a tree.
If is connected and each node can reach every other node, then changing the strategy of node from the current one to investing in an extra link, will increase both its cost (by , scaled by ) and (by Lemma 1). Hence, unilaterally investing in an extra link is not beneficial for a node.
We now assume that is not a tree. Then, there is at least one cycle in this graph. If a node in that cycle changes its strategy from investing in a link in that cycle to not investing, the cost is decreased by (weighted by ) and all the other nodes in the graph are still reachable from . Moreover, by not investing in that link, node decreases its probability of being infected in the metastable state, according to Lemma 1. Hence, by unilaterally changing its strategy, node decreases its cost utility , which is in contradiction with a Nash Equilibrium. ∎
A Nash Equilibrium is achieved for both the star graph and the path graph, but not all trees are Nash Equilibria.
Let us consider a star graph, where all the links are installed by the root node as shown in Fig. (a). (A link is installed and paid for by the node marked with .) The root node cannot unilaterally decrease its cost, because cutting at least one of its installed links would disconnect it, while installing a link from a leaf node would increase both and (Lemma 1). Hence, the star graph is a Nash Equilibrium.
Let us now assume that a path graph (Fig. (b)) is constructed, such that nodes invest in exactly one link and one of the leaves does not invest in installing a link. Similarly as for a star graph, none of the nodes can unilaterally decrease their cost by just installing extra links or cutting some of them. A “re-wiring” from one of the nodes, by re-directing its installed links to another node, remains to be considered. (“Re-wiring” is a process of removing a link to node initiated by node and establishing a new link to another node . The degree of node does not change, while the degrees of and are decreased and increased, respectively.) In such a case, if node “re-wires” its installed link to another node, then would not decrease. 1) If it is installed to one of the leaves, such that the graph is connected, we end up with an isomorphic graph, where the position of is the same as in the initial graph, so stays the same. 2) If “re-wires” to one of the other nodes (w.l.o.g., ) as visualized in Fig. (e), would have the same degree, but its “new neighbor” would have a degree instead of . The degree of increases by to and the degree of decreases by to (node will become terminal and “far” from ), while all the other degrees remain the same. Moreover, would be equally close to any of the nodes “behind” , closer to the nodes “at the end” and equally close to the nodes in the set , but just in a reverse order. Based on the canonical infinite form (6), would increase. (The expression in (6) takes larger values when nodes with larger degrees are as close as possible, i.e., in fewer hops, to the node.) Therefore, the path graph is also a Nash Equilibrium.
There are also other trees that are Nash Equilibria (e.g., the tree given in Fig. (c)). Moreover, there are values of such that worst- and best-case Nash Equilibria are achieved for trees different from star and path graphs. For , tree is the best-case Nash Equilibrium and has optimal social cost.
However, not all trees are Nash Equilibria; for example, the tree given in Fig. (d) is not. Here, whoever pays for the “central” link between and can reduce its cost utility by “re-wiring” to or . ∎
We proceed by characterizing the worst- and best-case Nash Equilibria.
Theorem 2. For sufficiently high effective infection rate, the optimal social cost and the best-case Nash Equilibrium are achieved by the star graph, while the worst-case Nash Equilibrium is achieved for the path graph,
According to Theorem 1, in a Nash Equilibrium the graph is a tree, hence it has $N-1$ links. In a general case, from a tree in which there are two nodes and , connected to one another, for which and (i.e. is a leaf), by breaking the connection between and and connecting to another leaf instead, we have: the degree of is (remains the same); the degree of node becomes (decreased by one); and the degree of is (increased by one). The process can be repeated as long as there exists a node of degree at least 3 in the tree. At the end, we end up with a tree with no degree bigger than 2, and this is a path . The social cost is increased in each step [1, Lemma 2]. In this way, the process converges to a path .
In a very similar (but reverse) process, starting from any tree , we can decrease at each step, ending up with a star with a maximum in the final step. ∎
However, the optimal social cost and the worst- and best-case Nash Equilibria depend strongly on the effective infection rate.
Theorem 3. For low values of the effective infection rate, above but sufficiently close to the epidemic threshold, the optimal social cost and the best-case Nash Equilibrium are achieved by the path graph, while the worst-case Nash Equilibrium is achieved by the star graph,
We consider a spectral approach  and denote the infection probability of all nodes in the metastable state. The probabilities of a node in the graph being infected are non-zero and if , where
is the largest eigenvalue of the adjacency matrix in the graph. For , .
Lovász and Pelikán ordered all the trees with nodes by the largest eigenvalues of their adjacency matrices. It turns out that the path and the star are the trees with the minimum and maximum largest eigenvalues, respectively.
For values , it holds that , where is any tree different from , therefore is the largest.
For values , we have , where is any tree different from , hence is the smallest. ∎
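The spectral ordering used in this proof can be checked numerically for small trees. The sketch below estimates the largest adjacency eigenvalue of a path and a star by power iteration on the shifted matrix A + I (the shift is needed because trees are bipartite, so plain power iteration on A would oscillate). The helper names are hypothetical. Under the assumption that the NIMFA epidemic threshold is the reciprocal of the largest adjacency eigenvalue, a smaller eigenvalue corresponds to a higher threshold.

```python
import math


def path_graph(n):
    """Adjacency matrix of the path on n nodes."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


def star_graph(n):
    """Adjacency matrix of the star on n nodes with node 0 as the root."""
    return [[1 if (i == 0) != (j == 0) else 0 for j in range(n)] for i in range(n)]


def largest_eigenvalue(adj, iters=2000):
    """Largest adjacency eigenvalue via power iteration on A + I."""
    n = len(adj)
    x = [1.0] * n  # positive start vector, not orthogonal to the Perron vector
    for _ in range(iters):
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in y))
        x = [c / norm for c in y]
    # Rayleigh quotient of A itself (x has unit length).
    ax = [sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(x[i] * ax[i] for i in range(n))


n = 6
lambda_path = largest_eigenvalue(path_graph(n))
lambda_star = largest_eigenvalue(star_graph(n))
threshold_path = 1.0 / lambda_path  # assumed NIMFA threshold for the path
threshold_star = 1.0 / lambda_star  # assumed NIMFA threshold for the star
```

The computed values can be compared with the known closed forms, 2cos(π/(N+1)) for the path and √(N−1) for the star, so that the path has the smaller eigenvalue and hence the larger threshold.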
Theorems 2 and 3 show opposite behavior depending on whether the value is in the high or low regime, although both revolve around the path and star graphs. For in the intermediate regime, different trees may give the best-/worst-case Nash Equilibrium.
For both high and low effective infection rate , .
For sufficiently high effective infection rate , in the virus spread-cost game formation,
The proof is provided in . ∎
The exact value of the PoA is given in Fig. 2 by making use of Corollary 1. It is highest () for small , above the epidemic threshold and it further sharply decreases reaching for a unique Nash Equilibrium. For higher , the PoA increases towards its maximum around and then it slowly decreases approaching .
We have observed that the equilibrium tree topology in which a virus thrives is not always a star (i.e., the tree with the smallest diameter), but that it may differ with the virus infection rate. For most of the values (except maybe small ), a small value for the Price of Anarchy (PoA) means that a topology close to optimal can be obtained in a decentralized manner, even when the individual players play selfishly.
III-B Optimal social cost, Nash Equilibria and the PoA for a positive hopcount weight
We start by analyzing the social cost (5). Node is one hop away from its neighboring nodes, while it is at least two hops away from the other nodes, hence . Using this, for large enough when can be approximated by using a truncated Maclaurin series [17, Lemma 1] (in fact, the sum can be lower bounded [14, p. 10] by , which is meaningful for ), the social cost in (5) is lower bounded as
The following bound is due to Cioabă [22, Theorem 9],
where the equality holds for regular graphs and the star graph. Based on this, , and , we obtain
Equality in (9) is achieved only for the star , where and , or for the complete graph (where ). (The equality for other regular graphs is ruled out because of the inequality in (9).) Using (9) into (8) yields
Let us consider two regimes:
1) If , then the bound in (10) is an increasing function in , hence the optimal social cost is achieved for . The bound in (10) is tight for such , because the bounds in (9) and (8) become equalities for and any graph with a diameter at most two, respectively. Hence, and equality is achieved only for the star graph .
2) If , then the bound in (10) increases for and decreases for . Hence, the optimal social cost is achieved in one of two boundary cases: and . For , similarly as in 1), we obtain that the only possibility is the star graph , while for it is the complete graph . Finally, .
It remains to compare and : and . Hence,
If , then and the optimal social cost is achieved for the complete graph . If , then and the optimal social cost is achieved for the star graph . The last also covers case 1), because .
Now, for the optimal social cost, Theorem 4 follows.
Theorem 4. For sufficiently high , the optimal social cost is achieved for the star if , and for the complete graph , otherwise.
We proceed with characterization of the Nash Equilibria and the Price of Anarchy for sufficiently high . In the VSPC game, Nash Equilibria topologies can be complex, while the star and the complete graph can appear as extreme cases:
The complete graph is a Nash Equilibrium, if and only if . Since new links cannot be added, changing the strategy for a node means deleting of its links (). The corresponding change would increase the cost of by . Hence, node has no interest to deviate from its current strategy. On the other hand, if and node changes its strategy by cutting links (all except one - to keep its connectivity), the change in is equal to , which will reduce its cost.
The star graph is a Nash Equilibrium, if and only if . The root node cannot delete a link, because this would make its cost infinite. If is a leaf, for some , changing its strategy means: (i) adding links, then the hopcounts to these nodes are reduced from to , hence the contribution from the hopcounts is changed by ; or (ii) deleting the link installed by it (if any) and installing links, where . In (ii), the hopcount to the root node is increased from to , the hopcount to links is decreased from 2 to 1, and the hopcounts to the other nodes are increased from to . The change in the sum of hopcounts is: , hence the change of the hopcount is again at least . Thus, the change in is at least . On the other hand, if , the change in by adding links from one leaf to all the other leaves in , is , i.e. it is not a Nash Equilibrium.
The above two points resolve the conditions for two specific graphs, but they do not cover all the possibilities for the Nash Equilibria and the Price of Anarchy, which may vary over different intervals; a case analysis, provided in the following, is therefore required. We will consider the case and the case .
Now, . A Nash Equilibrium is achieved only for graphs with a diameter at most - an argument used in the later points (b) and (c). The proof is by contradiction. Let us assume node is at least hops away from another node. Clearly, and if installs a link from to , the difference in is at least . Hence, reduces its cost and the graph is not a Nash Equilibrium. We consider three sub-intervals (a), (b) and (c):
(a) If , adding a link from will change by at least . Therefore, the complete graph is the only Nash Equilibrium. Because, for and, according to Theorem 4, it also has optimal social cost. Finally, .
(b) If and we assume, by contradiction, that there is a Nash Equilibrium different from , we have the following:
If there is a link in the graph, installed by node such that its deletion increases the sum of hopcounts from by only , then is increased by: . On the other hand, adding a link would change to: . The last two inequalities imply, , which is a contradiction. Hence, there is no other Nash Equilibrium different from and .
If deleting any of the links installed by would increase the sum of hopcounts by at least ; by link deletion, the difference in is at least and we have . We proceed by considering the properties of the possible Nash Equilibria in particular sub-intervals: for . By link addition, the difference in is and a necessary condition for a Nash Equilibrium is . On the other hand, , hence . Therefore, we have fewer than nodes that are at a distance from a node . Each of these nodes is directly connected to fewer than nodes different from . Hence, there are fewer than nodes that are at most hops from , and therefore at least one node that is more than hops away from , which contradicts the general claim (stated before (a)). Hence, is the only Nash Equilibrium and .
(c) If , then is a Nash Equilibrium. Graphs that are of diameter at most are also candidates for a Nash Equilibrium.
Because the diameter of the graph is not bigger than , (8) becomes an equality for sufficiently large . Applying the condition of (c) leads to
due to the fact that (equivalent to ). Equality holds (only) in the last line of (11) if or for all (e.g., the ring or the path graphs), otherwise a strict inequality in the second part also holds. Finally, knowing that the optimal social cost is attained by the complete graph and the inequality condition in (c) for : for each ; because . This bound is approached, for instance, when and are large and bigger than : is the social optimum and is the worst-case Nash Equilibrium and the bounds in (11) and the inequality for PoA are closely approached. If , is a Nash Equilibrium and , otherwise .
We first consider the links, whose deletion leaves the graph connected. For any node , we focus on the links installed by . Let be one such link and the number of all nodes that use as a link for the shortest paths from to is . According to Schoone et al. [23, Theorem 2.1., case ], all the distances from to the other nodes are increased by at most , where is the diameter in the original graph. In a Nash Equilibrium, for any possible value of , i.e. we obtain . Hence, and then the number of such links to node is not bigger than . Taking into account all possible nodes, the number of links whose deletion does not disconnect the graph is not bigger than . On the other hand there are at most links such that a removal of any of those links disconnects the graph. Indeed, a connected graph has a spanning tree and a link removed from disconnects the graph, while a removal of a link that is not in leaves the graph connected. Therefore,
If two nodes and are hops apart from each other, adding a link from to would reduce the hopcounts from to all the nodes in the “second half” along the previous path to by at least half of their lengths, by for even or by for odd. Hence, the total reduction in the sum of shortest paths from is or for even or odd, respectively. Assuming a Nash Equilibrium and is a starting node on the diameter, considering the change in cost , the following inequality would hold for any : . Using , and the absolute maximum for being , we arrive at
Each node has at least one neighbor and all the others are no more than hops apart, hence:
. Applying the arithmetic-harmonic mean inequality leads to. We proceed by upper bounding in (8),
Simulation results of the heuristic algorithm for the obtained networks in a Nash Equilibrium. The three regimes big, moderate and small are represented with values , and , respectively. The number of nodes is .
We distinguish two sub-cases, (a) and (b):
(a) If , then the optimal social cost (and a Nash Equilibrium) is achieved for the star graph , hence PoS=1. Now, using (15) for PoA,
and applying (12),
(b) If , then the optimal social cost is achieved for the complete graph , and using (15), . Now,
is infinitesimally small. Then is small and PoA has a value close to .
Based on these results, we present Theorem 5.
Theorem 5. For sufficiently high in the VSPC game, the PoA depends on the parameters , and ,
if , then PoS=1 and if
is not small, then