## 1 Introduction

We study a network formation game where strategic
agents (vertices on a graph)
receive both benefits and costs from forming connections
to other agents. While
various benefit functions exist in the
literature [2, 6],
we focus on the *reachability network benefit*.
Here, the benefit of an agent is
the size of her connected component in the collectively
formed graph. This models
settings where reachability (rather than centrality) motivates
joining the network, e.g.
when transmitting packets over technological networks
such as the Internet.

Most previous works feature a direct edge cost
for forming a
link. Goyal et al. [8] depart from this notion by
studying a game
where forming links introduces an additional
*indirect* cost by exposing agents
to contagious network shocks. These indirect
costs can model scenarios such
as virus spread through technological or biological
networks.

Our work continues this investigation of direct and indirect connection costs. To model the indirect cost we assume that, after network formation, an adversary attacks a single vertex uniformly at random. The attack then kills the vertex and spreads through the network via the independent cascade model according to parameter [9]. This random attack and probabilistic spread captures the epidemiological quality of virus spread in both biological and technological networks.

At a high level, our work is most closely related to two previous works. Bala and Goyal [2] study a reachability network game without attacks and show a sharp characterization of equilibrium networks: every tree and the empty network can form in equilibria. Goyal et al. [8] study a reachability network formation game where an adversary inspects the formed network and then deliberately attacks a single vertex in the network. The attack then spreads deterministically to neighboring vertices according to a known rule, while agents may immunize against the attack for a fixed cost. Our game is most similar to the latter setting under a random adversary and high immunization cost. However, in our setting attacks spread probabilistically (through independent cascades) rather than deterministically. This yields an arguably more realistic model of infection spread but incurs additional complexity: computing the expected connectivity benefit of an agent in a given network is now #P-complete [12].

Goyal et al. [8] show that while more diverse equilibrium networks, including ones with multiple cycles, can emerge in addition to trees and the empty graph, the equilibrium networks with agents will have at most edges, which is less than twice the number of edges that can form in the equilibria of the attack-free game. Furthermore, they show that the social welfare is at least in non-trivial equilibrium networks. Asymptotically, this is the maximum welfare possible, which is achieved in any nonempty equilibrium of the attack-free game. In the regime where the cost of immunization is high, the game of Goyal et al. [8] only admits disconnected and fragmented equilibrium networks due to the deterministic spread of the attack, and the social welfare of the resulting networks may be as low as .

*Our Results and Techniques.* In our game, computing utilities or even verifying network equilibrium is computationally hard. We circumvent this difficulty by proving structural properties for equilibrium networks. First, we provide an upper bound on the edge density in equilibria.

###### Theorem 1 (Statement of Theorem 3).

Any equilibrium network on vertices has edges.

For constant this upper bound is tight up to a logarithmic factor. The possibility of over-building therefore differentiates our game from those of Bala and Goyal [2] and Goyal et al. [8], but the extent of over-building is limited.

To prove Theorem 1, we first show
that any equilibrium network
with more than edges
contains an induced subgraph
with large minimum cut size. We then show that
if a network has large minimum cut
size, in *every* attack (with high probability),
either almost all vertices in the network
will die or almost all vertices in the network will survive.
As a result, any vertex in the induced
subgraph can beneficially deviate by dropping an edge.
Together, these observations allow us to prove the claimed
edge density bound.

Next, we show that any equilibrium network that is nontrivial (i.e. contains at least one edge) also contains a large connected component. Moreover, as long as the network is not too dense, it achieves a constant approximation to the best welfare possible of the attack-free game.

###### Theorem 2 (Informal Statement of Theorems 5 and 6).

Any non-trivial equilibrium network over vertices contains a connected component of size at least . Furthermore, if the number of edges in the network is , then the social welfare is .

To prove Theorem 2, we first show that any agent in a small connected component can increase her connectivity benefits by purchasing an edge to a larger component without significantly increasing her attack risk. This implies the existence of a large connected component. We then use the large component to argue that when the equilibrium network is sparse, the surviving network post-attack still contains a large connected component. This guarantees large social welfare.

Goyal et al. [8] show that the structural properties of the original reachability game of Bala and Goyal [2] are robust to a variation with an attack, deterministic spread, and the option of immunization for players. We show robustness in another variant, one that involves a cascading attack but disallows immunization. On the technical front, however, the tools that we use to prove these robustness results are very different from the analyses of both of these previous games.

## 2 Model

We start by formalizing our model and borrow most of our
notation and terminology from Goyal et al. [8]. We assume the vertices of a graph
(network) correspond to individual players. Each player has the choice
to purchase edges to other players at a *fixed* cost of per edge.
Throughout we assume that is a constant independent of .
Furthermore, we use the term *high probability* to refer to probability
at least henceforth.

A (pure) *strategy* for player
consists of a subset of players to whom player purchased an edge.
We assume that edge purchases are unilateral, i.e., players do not need
approval to purchase an edge to another player,
but that the connectivity benefits and risks are bilateral. As
an example of a scenario where the consequences are bilateral
even though the link formation is unilateral,
consider the spread of a disease in a social network where the links
are formed as a result of physical proximity of individuals.
The social benefits and potential risks of a contagious disease are
bilateral in this case although the link formation as a result of proximity
is unilateral. We leave the study of the bilateral edge formation
for future work.

Let denote the strategy profile for all the
players. Fixing , the set of edges purchased by all the
players induces an undirected graph. We denote a *game*
*graph* as a graph , where is the undirected
graph induced by the edge purchases of all players.

Fixing a game graph , the adversary selects a *single*
vertex uniformly at random to start the attack.
The attack kills and then spreads according to the
independent cascade model
with probability [9]. (Throughout we assume
that is a constant independent of the number of players . We discuss
the regime in which decreases as the number of players increases
in Section 4.1.)
In the independent cascade model, in the first round, the attack spreads independently
killing each of the neighbors of the initially attacked vertex with probability .
In the next round, the spread continues from all the neighbors of that were killed in the
previous round.
The spread stops when no new vertex was killed in the last round or when all the vertices are killed.

The adversary’s attack can be alternatively described as follows. Fixing a game graph ,
let denote the *random* graph
obtained by retaining each edge of independently with probability .
The adversary picks a vertex uniformly at random to start the attack.
The attack kills and all the vertices in the connected component of
that contains .
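The spread process described above can be illustrated with a short simulation. The following is a minimal Python sketch, not the authors' code: the function name `simulate_attack`, the adjacency-dictionary representation, and the parameter name `p` for the spread probability are all assumptions made for illustration. It processes killed vertices one at a time rather than in synchronized rounds; since every edge gets at most one independent transmission attempt, the set of killed vertices has the same distribution as the component of the attacked vertex in the randomly retained graph.

```python
import random
from collections import deque


def simulate_attack(adj, start, p, rng):
    """Return the set of vertices killed by an attack that starts at `start`.

    adj   : dict mapping each vertex to the set of its neighbours (undirected).
    p     : probability that the attack crosses any given edge (assumed name).
    rng   : a random.Random instance, passed in for reproducibility.
    """
    killed = {start}
    frontier = deque([start])
    while frontier:
        u = frontier.popleft()
        for v in adj[u]:
            # Each edge from a newly killed vertex to a live neighbour
            # transmits the attack independently with probability p.
            if v not in killed and rng.random() < p:
                killed.add(v)
                frontier.append(v)
    return killed


if __name__ == "__main__":
    # A path 0-1-2 plus an isolated vertex 3 (hypothetical toy graph).
    toy = {0: {1}, 1: {0, 2}, 2: {1}, 3: set()}
    print(simulate_attack(toy, 0, 0.5, random.Random(1)))
```

With `p = 1` the attack kills the whole connected component of the starting vertex, and with `p = 0` it kills only the starting vertex, which matches the two extremes of the retained-edge description.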

Let denote the *expected* size of the connected component of player
post-attack to a vertex and we define to be . Then the expected utility (utility for short) of
player in strategy profile denoted by is precisely

We refer to the sum of utilities of all the
players playing a strategy profile as the *social welfare* of .

Wang et al. [12] show that computing the exact spread of the attack in the independent cascade model is #P-complete in general. This implies that, given a strategy profile , computing the expected size of the connected component of all vertices (and hence the expected utility of all vertices) is #P-complete. However, an approximation of these quantities can be obtained by Monte Carlo simulation.
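A Monte Carlo approximation of this kind can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the names `component` and `estimate_benefit`, the integer vertex labels, and the adjacency-dictionary input are hypothetical, and the sketch estimates only the expected post-attack component size of one player, averaged over a uniformly random attack vertex. It uses the retained-edge description of the attack from the model section.

```python
import random


def component(adj, source, removed):
    """Vertices reachable from `source` in `adj` without entering `removed`."""
    if source in removed:
        return set()
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen and v not in removed:
                seen.add(v)
                stack.append(v)
    return seen


def estimate_benefit(adj, player, p, trials, rng):
    """Monte Carlo estimate of the player's expected component size post-attack."""
    vertices = sorted(adj)
    total = 0
    for _ in range(trials):
        # Sample the random graph: keep each undirected edge with probability p.
        kept = {v: set() for v in vertices}
        for u in vertices:
            for v in adj[u]:
                if u < v and rng.random() < p:
                    kept[u].add(v)
                    kept[v].add(u)
        # The attack kills the sampled component of a uniformly random vertex.
        start = rng.choice(vertices)
        killed = component(kept, start, set())
        # The player's benefit is her component size among the survivors
        # (zero if she was killed).
        total += len(component(adj, player, killed))
    return total / trials
```

Subtracting the player's edge expenditure from such an estimate would give an approximate utility. The number of trials needed for a given accuracy is not analyzed here.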

We model each of the players as
strategic agents who deterministically choose which edges to purchase.
A strategy profile is a
*pure strategy Nash equilibrium* if,
for any player , fixing the behavior of the other players to be
, the expected utility for , , cannot strictly increase when playing any strategy
over . We restrict our attention to pure strategy Nash equilibria (or simply equilibria) in this work.
Since computing the expected utilities in our game is #P-complete, even verifying
that a strategy profile is an equilibrium is #P-complete. Hence as our main contribution,
we prove structural properties for the equilibrium networks regardless of this computational barrier.

### 2.1 Related Work

There are two lines of work closely related to ours. First, Bala and Goyal [2] study the attack-free version of our game. They show that equilibrium networks are either trees or the empty network. Also, since there is no attack, the social welfare in nonempty equilibrium networks is asymptotically .

Second, Goyal et al. [8] study a network formation game where players in addition to having the option of
purchasing edges can also purchase immunization from the attack. Since we do not study the effect of immunization
purchases in our game, our game corresponds to the regime of parameters in their game
where the cost of immunization is so high that no vertex would purchase immunization in equilibria.
Moreover, they study several different adversarial attack models and our attack model coincides
with their *random attack adversary*.
The main difference between our work and theirs is that they assume the attack spreads
deterministically while we assume the attack spreads according to the independent cascade
model [9]. In many real-world scenarios, e.g., the spread of
a contagious disease over a network of people, the spread is *not deterministic*. Hence our
work can be seen as a first attempt to make the model of Goyal et al. [8] closer to real world
applications. However, the change in the spread of attack comes with a significant increase in the
complexity of the game as even computing the utilities of the players in our game is #P-complete.
While Friedrich et al. [7] have shown that best responses for players can be computed in polynomial
time under various attack models, the question of whether best response dynamics
converges to an equilibrium network is open in the model of
Goyal et al. [8].

Similar to Goyal et al. [8] we show that diverse equilibrium networks can form in our game. While they show that all equilibrium networks over players have at most edges, we show that the number of edges in any equilibrium network is at most and this bound is tight up to a logarithmic factor. Furthermore, Goyal et al. [8] show that the social welfare is asymptotically in non-trivial equilibrium networks. Their definition of non-trivial networks requires the network to have at least one immunized vertex and one edge. In the regime where the cost of immunization is high, the game of Goyal et al. [8] only admits disconnected and fragmented equilibrium networks due to the deterministic spread of the attack. Such networks (even excluding the empty graph) can have social welfare as low as . We show that any low density equilibrium network of our game enjoys a social welfare of as long as the network contains at least one edge.

Kliemann [11] introduced a network formation game with reachability benefits and an attack on the formed network that destroys exactly one link with no further spread. Their equilibrium networks are sparse and also admit high social welfare as removing an edge can create at most two connected components. Kliemann et al. [10] extend this to allow attacks on vertices while focusing on swapstable equilibria.

Blume et al. [3] introduce a network formation game with bilateral edge formation. They assume both edge and link failures can happen simultaneously but independent of the failures so far in the network. These differences make it hard to directly compare the two models. They show a tension between optimal and stable networks and exploring such properties in depth in our model is an interesting direction.

Finally, network formation games, with a variety of different connectivity benefit models, have been studied extensively in computer science (see, e.g., [2, 3, 11]). We refer the reader to the related work section of Goyal et al. [8] for a comprehensive summary of other related work, especially on the topic of optimal security choices for networks.

## 3 Examples of Equilibrium Networks

In this section we show that a diverse set of topologies can emerge in the equilibrium of our game. Similar to the models of Bala and Goyal [2] and Goyal et al. [8], the empty graph can form in the equilibrium of our game when . Moreover, similar to both models, trees can form in equilibria (see the left panel of Figure 1). Finally, while Goyal et al. [8] show that in the regime of their game where the cost of immunization is high (so no vertex would immunize) no connected network can form in equilibria due to the deterministic spread of the attack, we show that connected networks can indeed form in the equilibria of our game (see Figure 1).

We remark that pure strategy equilibria exist in all parameter regimes of our game. When , the empty network can form in equilibria for all . When a cycle or two disconnected hub-spoke structures of size can form in equilibria depending on whether is far from or close to 1 ( and , respectively).

Examples in Figure 1 show that denser networks can form in equilibria compared to the model of Bala and Goyal [2] and the high immunization cost regime of the model of Goyal et al. [8]. So it is natural to ask how dense equilibrium networks can be. We study this question in Section 4 and show an upper bound of on the density of the equilibrium networks. Since the examples in Figure 1 have edges, our upper bound is tight up to a logarithmic factor.

Moreover, while all the equilibrium networks in Figure 1 are connected, there might still exist equilibrium networks in our game that are highly disconnected. In Section 5 we show that any equilibrium network with at least one edge contains a large connected component. However, even with the guarantee of a large connected component, there might still be concerns that the equilibrium networks can become highly fragmented after the attack. In Section 5 we show that as long as the equilibrium network is not too dense, the social welfare is lower bounded by i.e. a constant fraction of the social welfare achieved in the attack-free game.

We obtain these structural results even though we can neither compute utilities nor even verify that an equilibrium has been reached, due to computational barriers. We view these results as our most significant technical contributions.

## 4 Edge Density

We now analyze the edge density of equilibrium networks.

###### Theorem 3.

Any equilibrium network on vertices has edges.

The proof of Theorem 3 follows from the observations below, which we formally state and prove next. At a high level, we first show that if has large enough edge density, then contains an induced subgraph whose minimum cut size is large. We then show a large minimum cut size implies that is connected with high probability. This means that in almost all attacks that infect a vertex in , all vertices in will get infected. So a vertex in would have a beneficial deviation in the form of dropping an edge, which contradicts the assumption that was an equilibrium network. This proves that equilibrium networks cannot be too dense.

More formally,
we first show in Lemma 1 that if is *dense enough*
it contains a subgraph with a
minimum cut size, denoted by , of at least
.

###### Lemma 1.

Let be a graph on vertices. There exists a constant such that if then contains an induced subgraph with .

###### Proof.

If then is the desired graph. Otherwise, there is a cut of size less than that partitions into two graphs and . Repeat this process at and , and build a decomposition tree in this manner. Any leaf of this tree is either a singleton vertex or a graph where the minimum cut size is at least . If at least one leaf in satisfies the latter property, then we are done and this is our desired graph . We now argue that it can not be the case that all leaf vertices in are singletons.

To see this, note that there can be at most internal vertices in and each internal vertex in corresponds to removing up to edges from . Thus the decomposition process removes at most edges. On the other hand, has at least edges. It follows that not all leaves of can be singleton vertices. ∎
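The counting step in this proof can be written out generically. In the sketch below, $k$ is a placeholder for the minimum cut threshold of the lemma (an assumed symbol, since the exact threshold is not reproduced here), $n$ is the number of vertices, and $T$ is the decomposition tree. The leaves of $T$ partition the vertex set, so $T$ has at most $n$ leaves and hence at most $n-1$ internal vertices, each of which removes fewer than $k$ edges.

```latex
\begin{align*}
\#\{\text{internal vertices of } T\} &\le n - 1, \\
\#\{\text{edges removed by the decomposition}\}
  &\le \sum_{\text{internal vertices of } T} k \;\le\; (n-1)\,k .
\end{align*}
% If every leaf of T were a singleton, every edge of the graph would have been
% removed, so the graph would have at most (n-1)k edges. Hence
\[
  |E| > (n-1)\,k
  \quad\Longrightarrow\quad
  \text{some leaf of } T \text{ has at least two vertices and minimum cut at least } k .
\]
```

Under this reading, the density condition of the lemma only needs to exceed $(n-1)k$ for the chosen threshold $k$.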

We then show that if is then with high probability is connected.

###### Lemma 2 (Alon [1]).

We now define a property which we call
*almost certain infection* and show
that no equilibrium network can contain an
induced subgraph satisfying this property.

###### Definition 1.

Let be a graph on vertices and
let be a subgraph of on more than one vertex.
has the *almost certain infection* property if whenever
any vertex in is attacked, then with probability at least
the attack spreads to every vertex in .

###### Lemma 3.

Let be an equilibrium network on vertices. cannot contain an induced subgraph such that satisfies the almost certain infection property.

###### Proof.

Consider any equilibrium graph that violates the assertion of the lemma. Let be an induced subgraph of with the almost certain infection property. We first prove that contains a cycle. Assume by way of contradiction that does not have a cycle, so is a collection of trees. Let be any leaf in . Then is incident to at most one edge in . Therefore, with probability , this edge is not in and is not connected to any other vertices in . So cannot have the almost certain infection property (recall that we assumed is a constant independent of the number of players ). This means that contains a cycle . Let be an edge on the cycle . Assume without loss of generality that purchased the edge .

Now let be the event that an attack propagates to some vertex in after the attack. Then conditioned on , with probability at least , all vertices in die. Hence vertex in has negative utility. On the other hand, if does not occur, then the utility of remains unchanged even if we remove the edge . Thus vertex can strictly improve her utility in this case by dropping the edge . This contradicts the fact that is an equilibrium network. ∎

We are now ready to prove Theorem 3.

Proof of Theorem 3. Assume by way of contradiction that has more than edges where is the constant in Lemma 1. Then by Lemma 1, contains a subgraph such that . Since , by Lemma 2, has the almost certain infection property. However, cannot be an equilibrium network by Lemma 3. ∎

The most interesting regime for the probability of spread is when is a constant independent of . While the upper bound in Theorem 3 holds for all , it becomes vacuous as gets small, i.e., it becomes bigger than the trivial bound of when for constant . In Section 4.1 we analyze the edge density of equilibrium networks in the regime where . We show that the number of edges in any equilibrium network is bounded by in this regime. To prove the density result we utilize properties of the Galton-Watson branching process and the Erdős–Rényi random graph model, as well as tools from extremal graph theory.

### 4.1 Small Regime

In this section we focus on the regime where and prove the following upper bound on the edge density.

###### Theorem 4.

Let for some constant . Let be an equilibrium network over vertices. Then for sufficiently large , .

We prove Theorem 4 by contradiction and show that if the equilibrium graph has more than edges, there exists a beneficial deviation in the form of dropping an edge for one of the players.

In order to prove Theorem 4, we need structural results stated in Lemmas 4 and 5. First, consider an edge purchased by vertex . Purchasing this edge would not have increased the connectivity benefit of unless, after some attack, the edge is the only path connecting to (and possibly other vertices that are only reachable through ). In Lemma 4 (which we will prove later) we show that if a graph is dense enough, then there exists an edge such that many vertices should be deleted in order to make the only remaining path connecting and .

###### Lemma 4.

Let be a graph on vertices with for some . Then there exist two vertices and such that , and at least vertices need to be deleted so that the only path from to in is through the direct edge .

Second, as described in Section 2, the number of vertices that are killed in any attack is the size of the connected component in that contains the initially attacked vertex. Lemma 5 (which we will prove later) bounds the size of a randomly chosen connected component in .

###### Lemma 5.

Let be an equilibrium network over vertices with and . When and is sufficiently large, the size of the connected component of a randomly chosen vertex in is at most with probability at least .

We now give the formal proof of Theorem 4.

###### Proof of Theorem 4.

Assume by way of contradiction that is an equilibrium network with edges where . Let , we have . By Lemma 4, there exists an edge such that in order to make the only remaining path connecting and , we need to delete at least vertices. Without loss of generality assume that has purchased the edge . If in any attack at most vertices are killed, will not lose any connectivity benefit after dropping this edge but decrease her expenditure by .

Consider the size of the largest connected component in . Since , the size of the largest connected component in is at most with probability for sufficiently large constant which only depends on . When is the complete graph, corresponds to the random graph generated by the Erdős–Rényi model. In this case, the size of the largest component of is , with high probability, when [5].

If , then with probability at most , the attack kills more than vertices, in which case the connectivity benefit of can decrease by at most after dropping the edge . So the expected connectivity benefit of decreases by at most after the deviation but her expenditure also decreases by . Hence, after the deviation, the expected change in the utility of is at least which contradicts the assumption that is an equilibrium network (recall that we have assumed is a constant independent of ).

If , then by definition, . Since is also at least , by Lemma 5, with probability at least , the attack kills at most vertices, in which case the connectivity benefit of remains unchanged after dropping the edge . With probability at most , more than vertices are killed in which case the connectivity benefit of can decrease by at most . So the expected connectivity benefit of decreases by at most after the deviation but her expenditure also decreases by . Hence, after the deviation, the expected change in the utility of is at least , which contradicts the assumption that is an equilibrium network. ∎

###### Proof of Lemma 4.

Assume by way of contradiction that is the graph with smallest number of vertices such that has vertices and at least edges. Therefore, has two vertices and such that and there exists a vertex set with at most vertices, such that after deleting the only path from to in is through the edge . If has less than vertices, then we add arbitrary vertices from (but not or ) to so that has exactly vertices (we can always do so because has more than vertices).

Consider the graph where the edge and the vertices in are removed, and are not connected in this graph. Let be the connected component that contains and . By definition, is the only edge between and in .

Define two graphs and as subgraphs of induced by and . Suppose has vertices and has vertices where and . Also without loss of generality assume . We have that . On the other hand, and have at least edges in total (any edge which is not (,) is either in or ). So either has at least edges or has at least edges. Also by the property of , for any pair of vertices (or ) such that (or ) we only need to delete at most vertices so that the only path from to is through the direct edge .

We claim that there exists a graph (which is either or ) with vertices that has at least edges. Note that if has at least edges then we are done since and imply that . So suppose has less than edges. Therefore, has at least edges. Again if we are done so suppose . Consider the edges which are in but not in . These edges have at least one endpoint in , so there are at most

such edges. But this would imply has at least .

If has strictly more than vertices, it contradicts our assumption that is the smallest graph with the property stated in the lemma. If has at most vertices, then has at most

edges, which is a contradiction. ∎

Before proving Lemma 5,
let us introduce some notation. Let
be the set of vertices in with degree at least . Also let be the
set of vertices in with degree strictly less than . Hence, and correspond
to vertices with *high* and *low* degrees in , respectively.
Recall that is a random graph where each edge of is sampled independently
to be retained in with probability .
Using and , the creation of can be described as a three-step sampling process.
In the first step, edges with both endpoints in are sampled to be retained.
In the second step, edges with both endpoints in are sampled to be retained.
Finally, in the third step, edges with one endpoint in and the other endpoint in
are sampled to be retained.
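The three-step view of the sampling can be made concrete with a small sketch. The Python below is illustrative only: the function name `three_step_sample` and the parameters `degree_threshold` (standing in for the degree cutoff that separates high from low degree vertices) and `p` are assumed names, not taken from the source. Because every edge is retained independently, splitting the sampling into three steps does not change the distribution of the resulting random graph.

```python
import random


def three_step_sample(edges, degree_threshold, p, rng):
    """Sample retained edges in three steps, grouped by endpoint degree class.

    edges            : list of undirected edges (u, v), each listed once.
    degree_threshold : vertices with degree >= this value count as high degree.
    p                : probability of retaining each edge independently.
    Returns (high, [step1_edges, step2_edges, step3_edges]).
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    high = {v for v, d in degree.items() if d >= degree_threshold}

    def high_endpoints(u, v):
        # Number of high degree endpoints of the edge: 0, 1, or 2.
        return (u in high) + (v in high)

    retained = []
    # Step 1: both endpoints high; step 2: both low; step 3: one of each.
    for wanted in (2, 0, 1):
        step_edges = [e for e in edges
                      if high_endpoints(*e) == wanted and rng.random() < p]
        retained.append(step_edges)
    return high, retained
```

Taking the union of the three returned edge lists recovers one sample of the random graph.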

In Lemma 6, we first show that with high probability the size of the largest connected component created by the first step and second step of the sampling process is at most 5 and 12, respectively. These connected components can then be connected together in the third step of the sampling to create larger connected components in . We show that with high probability, the third step would not connect more than 3 of the connected components of high degree vertices (which were created in the first step). This implies that the number of high degree vertices in any connected component of is at most with high probability.

We then show in Lemma 7 that for any , the expected number of vertices with at least edges in is at most . We then use the structural results of Lemmas 6 and 7 to show that with probability at least , the size of the connected component of a randomly chosen vertex in is at most .

###### Lemma 6.

Let be an equilibrium network over vertices with and . Suppose and is sufficiently large. Then, with probability at least , (1) the connected components generated in the first step and the second step of the sampling process of creating have size at most and , respectively, and (2) no component in has more than 15 vertices from the set .

###### Proof.

Recall that in the first two steps of creating the , we sample edges to retain independently with probability from the graphs induced by only and , respectively. Since the number of high degree vertices is at most (where the notation hides logarithmic dependencies on ). So any vertex in the graph induced by has degree at most . Moreover, by definition, the vertices in the graph induced by have degree at most . By Corollary 1 (in Appendix A), with high probability, the random components formed by the first step and second step of the sampling process have size at most 5 and 12, respectively. We refer to the set of the components of that are formed by step one and two of the sampling by and , respectively.

Consider a component . The probability that there is an edge in between any vertex in and a specific high degree vertex is bounded by . This means that the probability that the vertices in are connected to more than one high degree vertex is at most . Similarly the probability that the vertices in are connected to more than two high degree vertices is at most . Therefore, with high probability, there is no component that is connected to three high degree vertices. Moreover, the probability that there are three connected components in that are connected to high degree vertices is at most .

In the third step of creating the , components from and would become connected by sampling the edges in between and . As we showed there are at most two components in that will be connected in the by the edges sampled in the third step. This means, with high probability, no component in will include more than 3 components from ; so, with high probability, no component in has more than 15 high degree vertices. ∎

###### Lemma 7.

Let be an equilibrium network over vertices with and . For any , when , the expected number of vertices that have at least adjacent edges in is at most for sufficiently large .

###### Proof.

For a vertex with degree , the probability that she has edges in is at most

The first inequality is due to Stirling’s formula. Other inequalities are due to , and . Adding up the probabilities for all the vertices (using linearity of expectation) and using the fact that the sum of the degrees of all the vertices is will conclude the proof. ∎

We now have all the background to prove Lemma 5.

Proof of Lemma 5. If we randomly choose a vertex , the probability that the connected component of in has size at least is upper bounded by times the sum of the sizes of components with size at least in . This is in turn upper bounded by

where is the number of components with size at least .

Recall that we partitioned the vertices of into high and low degree vertex sets and based on the degree. We described a three step sampling process for creating and referred to the set of connected components of that are formed by step one and two of the sampling by and , respectively. Let be the event such that each has at most 12 vertices and each component in has at most high degree vertices. By Lemma 6, . Let be the number of vertices with at least edges in . By Lemma 7, .

Fix a component of . Conditioned on event , if all the high degree vertices in the component have degree at most , then each high degree vertex is connected to at most of the components in . So the size of this component is at most . Therefore, each component with size at least contains at least one vertex with at least adjacent edges in . This means that . So

So the probability that is in a component of size at least given is

So the overall probability that a vertex is in a component with size at least is at most . ∎

## 5 Social Welfare

In this section we provide a lower bound on the social welfare of equilibrium networks. Similar to other reachability games, the empty graph can form in equilibrium [2, 8]. Thus, without any further assumptions, no meaningful guarantee on the social welfare can be made. We therefore focus on non-trivial equilibrium networks defined as follows.

###### Definition 2.

An equilibrium network is non-trivial if it contains at least one edge.

Definition 2 rules out the empty network, but it is still possible that a non-trivial equilibrium network contains many small connected components or becomes highly fragmented after the attack. In this section we show that neither of these concerns materializes. In particular, in Theorem 5, we first show that any non-trivial equilibrium network contains at least one large connected component. We then show in Lemma 8 that when the network is not too dense, the equilibrium network cannot become highly fragmented after the attack. These two observations allow us to prove our social welfare lower bound as stated in Theorem 6.
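To make the quantity bounded in this section concrete, the following is a minimal Monte Carlo sketch of the social welfare of a fixed network: the expected sum of connectivity benefits after a random attack, minus the total expenditure on edges. It is an illustration under stated assumptions, not the paper's method. The names `p` (spread probability of the independent cascade), `edge_cost`, and all function names are hypothetical, and the sketch assumes that a killed vertex receives zero benefit and that each edge is paid for once.

```python
import random
from collections import deque


def simulate_attack(adj, p, rng):
    """Return the set of killed vertices for one attack.

    The attack starts at a uniformly random vertex and spreads by an
    independent cascade: each newly killed vertex gets one chance to kill
    each surviving neighbor, succeeding with probability p (assumed name).
    """
    start = rng.choice(sorted(adj))
    dead = {start}
    frontier = deque([start])
    while frontier:
        u = frontier.popleft()
        for w in adj[u]:
            if w not in dead and rng.random() < p:
                dead.add(w)
                frontier.append(w)
    return dead


def surviving_benefits(adj, dead):
    """Map each surviving vertex to the size of its component after the attack."""
    benefit = {}
    for s in adj:
        if s in dead or s in benefit:
            continue
        component = [s]
        seen = {s}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dead and w not in seen:
                    seen.add(w)
                    component.append(w)
                    queue.append(w)
        for v in component:
            benefit[v] = len(component)
    return benefit


def estimate_welfare(adj, p, edge_cost, trials=2000, seed=0):
    """Monte Carlo estimate of expected benefits minus total edge cost."""
    rng = random.Random(seed)
    num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    total = 0.0
    for _ in range(trials):
        dead = simulate_attack(adj, p, rng)
        total += sum(surviving_benefits(adj, dead).values())
    return total / trials - edge_cost * num_edges
```

Sampling is used here because, as noted in the introduction, computing the expected connectivity benefit exactly is #P-complete; the estimate converges to the true welfare only as `trials` grows.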

We start by showing that any non-trivial equilibrium network contains a large connected component.

###### Theorem 5.

Let be a non-trivial equilibrium network over vertices. Then, for sufficiently large , the largest connected component of has at least vertices.

###### Proof.

Throughout we require .
Consider two cases: (1) when contains no isolated vertices (an isolated vertex is a vertex with no incident edges) and (2) when contains at least one isolated vertex.

In the first case, assume by way of contradiction that has at least connected components. Let be a smallest connected component in , say of size . Also let be a second-smallest connected component in , say of size . By construction, . Since has at least one edge and this edge has cost , the size of is at least . Otherwise the vertex who bought this edge can improve her utility by dropping this edge.

Consider the deviation that a vertex in adds an edge to an arbitrary vertex in . We show that the increase in the connectivity benefit of is more than . If the attack does not start at or , which occurs with probability , then the connectivity benefit of increases by . If the attack starts at , then the only way for the attack to reach is through the newly added edge by . Hence, in this case the connectivity benefit of does not decrease. Therefore, the only scenario in which the connectivity benefit of can decrease is when the attack starts at (which occurs with probability ). In this case, the connectivity benefit of can decrease by at most . So the change in the connectivity benefit of is at least

after the deviation. We show that which is a contradiction to being an equilibrium network.

We consider two sub-cases based on the value of : (1a) and (1b) . First consider case (1a) where . Since has at least connected components, then , and . So

Next consider case (1b) where . Since and then

when .

Therefore, the deviation of adding an edge by will increase the connectivity benefit of by strictly more than which is a contradiction. So contains at most connected components in case (1) and hence the largest connected component of contains at least vertices.

In case (2), we show that the largest connected component of contains at least vertices. Therefore, the largest connected component of contains at least vertices when .

Let be an isolated vertex and let be a largest connected component of , say of size . Consider the deviation where adds an edge to an arbitrary vertex in . If the attack starts neither in nor at (which occurs with probability ), then the connectivity benefit of increases by . The only scenario in which the connectivity benefit of can decrease is when the attack starts at (which occurs with probability ). In this case, the connectivity benefit of can decrease by at most . So the change in the connectivity benefit of after the deviation is at least

Note that , since is an equilibrium network. This implies that . To show that , consider the function . is increasing when and decreasing when . Moreover,

when . So is always larger than when . Since we showed in equilibrium it must be the case that either or . The former cannot happen because we assumed is non-empty, so must have at least one edge and therefore . Hence, as claimed.

∎

We next present Lemma 8 that describes the relationship between the expected size of the largest connected component of and the connectivity benefits of the vertices in .

###### Lemma 8.

Let be an equilibrium network over vertices. Let be any connected component in of size . If the expected size of the largest component of is at most , then the expected sum of the connectivity benefits of the vertices in is at least .

###### Proof.

With probability , the attack starts at a vertex outside of . In this case the sum of the connectivity benefits of vertices in is (the first term in the lower bound). Otherwise, with probability , the attack starts at a vertex in . We claim that in this case, the sum of the connectivity benefits of the vertices in is at least (the second term in the lower bound).

To prove the claim, it suffices to show that if the largest component in has size , then the sum of connectivity benefits of the vertices in is at least . The claim would then follow by taking the expectation and using the assumption of the lemma that .

Assume by way of contradiction that the sum of connectivity benefits of vertices in is less than . Then there exists a connected component in such that if we delete the vertices in , then the sum of connectivity benefits of the vertices when the attack destroys is less than . Suppose has size and we know .

Let be the connected components in the subgraph of induced by and let denote their sizes, respectively (see Figure 2). Then by the assumption on the sum of connectivity benefits of the vertices after deleting , when the attack destroys we have that

(1)

If the attack starts at a vertex in a component , then the vertices in will still remain connected. This means the sum of connectivity benefits of the vertices in is at least

Since and by Equation (1), the sum of the connectivity benefits of the vertices in is at least , which is a contradiction. ∎
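The bookkeeping in the proof above rests on a simple identity: once a set of vertices is destroyed, each surviving vertex's connectivity benefit is the size of its remaining component, so the sum of benefits equals the sum of the squared sizes of the remaining components. The sketch below illustrates that identity on a small graph. The function name and the adjacency-list representation are assumptions made for illustration, and destroyed vertices are assumed to contribute zero.

```python
from collections import deque


def benefit_sum_after_removal(adj, destroyed):
    """Sum of connectivity benefits of survivors once `destroyed` is removed.

    Each remaining component of size k contributes k * k, because each of its
    k vertices has connectivity benefit k.
    """
    seen = set(destroyed)
    total = 0
    for s in adj:
        if s in seen:
            continue
        # Breadth-first search to measure the component containing s.
        seen.add(s)
        queue = deque([s])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        total += size * size
    return total


# A path on five vertices: 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
# Destroying the middle vertex leaves two components of size 2.
print(benefit_sum_after_removal(path, {2}))  # 8
# Destroying an endpoint leaves one component of size 4.
print(benefit_sum_after_removal(path, {0}))  # 16
```

The example shows why fragmentation matters: removing one central vertex halves the benefit sum relative to removing an endpoint, even though the same number of vertices survives.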

Theorem 5 and Lemma 8 allow us to prove a lower bound on the social welfare of non-trivial equilibrium networks.

###### Theorem 6.

Let be a non-trivial equilibrium network over vertices. For any and sufficiently large , if , then the social welfare of is at least .

###### Proof.

We show that the expected sum of the connectivity benefits of the vertices in is at least . Subtracting the cumulative expenditure for edge purchases, which is , then implies the statement of the theorem.

Suppose the largest connected component of say has size . By Theorem 5, . We consider two cases based on the size of : (1) and (2) .

In case (1), where , the sum of the connectivity benefits of the vertices in the largest connected component is at least . This is because with probability the attack starts outside of the component and all the vertices in the component survive; in that case the sum of connectivity benefits of the vertices in is . Moreover, the derivative of with respect to is , which is positive when and negative when . Since , the minimum value of must be attained at one of the endpoints, which correspond to values or , respectively. Both of these values are larger than when , which means the sum of connectivity benefits is at least in this case.

In case (2), where , the number of edges in the connected component is at most (which occurs when all the edges are in this component). Let us denote the vertices in by numbers from to . Let