 # Improved Distributed Approximation to Maximum Independent Set

We present improved results for approximating Maximum Independent Set (MaxIS) in the standard LOCAL and CONGEST models of distributed computing. Let $n$ and $\Delta$ be the number of nodes and the maximum degree in the input graph, respectively. Bar-Yehuda et al. [PODC 2017] showed that there is an algorithm in the CONGEST model that finds a $\Delta$-approximation to MaxIS in $O(\mathrm{MIS}(n,\Delta)\log W)$ rounds, where $\mathrm{MIS}(n,\Delta)$ is the running time for finding a maximal independent set, and $W$ is the maximum weight of a node in the network. Whether their algorithm is randomized or deterministic depends on the MIS algorithm that they use as a black box. Our results: (1) A deterministic $O(\mathrm{MIS}(n,\Delta))$-round algorithm for $O(\Delta)$-approximation to MaxIS in the CONGEST model. (2) A randomized $2^{O(\sqrt{\log\log n})}$-round algorithm that finds, with high probability, an $O(\Delta)$-approximation to MaxIS in the CONGEST model. (3) An $\Omega(\log^* n)$ lower bound for any randomized algorithm that finds an independent set of size $\Omega(n/\Delta)$ and succeeds with probability at least $1-1/\log n$, even in the LOCAL model. This hardness result applies to graphs of maximum degree $\Delta=O(n/\log^* n)$. One might wonder whether the same hardness result applies to low-degree graphs. We rule out this possibility with our next result. (4) An $O(1)$-round algorithm that finds an independent set of size $\Omega(n/\Delta)$ in graphs with maximum degree $\Delta\le n/\log n$, with high probability. Due to a lower bound of $\Omega(\sqrt{\log n/\log\log n})$ that was given by Kuhn, Moscibroda and Wattenhofer [JACM, 2016] on the number of rounds for finding a maximal independent set (MIS) in the LOCAL model, even for randomized algorithms, our second result implies that finding an $O(\Delta)$-approximation to MaxIS is strictly easier than MIS.

## 1 Introduction

One of the most fundamental problems in graph theory is the Maximal Independent Set problem (MIS), where given an input graph, we need to find a maximal subset of the nodes in which no two nodes are adjacent. In the sequential setting, the complexity of MIS is well understood, as a simple greedy algorithm finds a maximal independent set in linear time. In distributed computing, the complexity of MIS remains one of the central open questions, and it has received a tremendous amount of attention in various models (see, for example, [27, 25, 28, 10, 11, 12, 1, 42, 40, 38]). It is considered one of the four classic problems of local distributed algorithms, along with edge coloring, vertex coloring, and maximal matching [9, 24, 44].

Independent sets, and maximum independent sets in particular, have many applications in practical and theoretical computer science. (Footnote 1: While the definition of a maximal independent set implies that any node in the graph is either in the set or has a neighbor in the set, a maximal independent set is not necessarily a maximum independent set. For example, in a star graph, the center alone is a maximal independent set, but not a maximum independent set.) These include applications in economics, computational biology [17, 47], coding theory [15, 18], information retrieval, experimental design, signal transmission and computer vision.

In the sequential setting, finding a maximum independent set (MaxIS) is an NP-hard problem. Even finding an -approximation, where $\Delta$ is the maximum degree of the network, is hard, assuming the Unique Games Conjecture. For -approximation, simple linear-time greedy algorithms exist. This suggests that the right approximation factor to aim for in general is between and . Interestingly, for any graph, any maximal independent set constitutes a $\Delta$-approximation to MaxIS. This implies that MIS cannot be easier than $\Delta$-approx to MaxIS, regardless of the computational model. While in the sequential setting both MIS and $\Delta$-approx to MaxIS have the same complexity, in this work, we show that in the distributed setting, an $O(\Delta)$-approx to MaxIS is actually easier than MIS.

### 1.1 Distributed Computing and Our Results

The two major models of distributed graph algorithms are the well-known LOCAL and CONGEST models. In the LOCAL model, there is a synchronized communication network of computationally unbounded nodes, where each node has a unique identifier of $O(\log n)$ bits. In each communication round, each node can send a possibly different, unbounded-size message to each of its neighbors. The goal of the nodes is to perform some task (e.g., find a maximal independent set), while minimizing the number of communication rounds. The CONGEST model is similar to the LOCAL model. The only difference is that in the CONGEST model the message size is bounded by $O(\log n)$ bits.

In this work we study the complexity of finding an $O(\Delta)$-approximation to MaxIS in the LOCAL and CONGEST models. In unweighted graphs, any maximal independent set constitutes a $\Delta$-approximation to MaxIS. The best currently known algorithms for finding an MIS in the CONGEST model are the classic $O(\log n)$-round algorithms due to [1, 42], and the recent -round algorithm by Ghaffari. For the LOCAL model, Ghaffari presented an algorithm that takes rounds. All these algorithms are randomized and succeed with high probability. (Footnote 2: We say that an algorithm succeeds with high probability if it succeeds with probability $1-1/n^{c}$ for an arbitrary constant $c$.)

In weighted graphs, an MIS does not necessarily constitute a $\Delta$-approximation to MaxIS. (Footnote 3: In weighted graphs, we are interested in finding an independent set of maximum total weight. Furthermore, the weights are assumed to be at most polynomial in $n$.) For weighted graphs, Bar-Yehuda et al. showed an algorithm that takes $O(\mathrm{MIS}(n,\Delta)\log W)$ rounds, where $\mathrm{MIS}(n,\Delta)$ is the round complexity of finding a maximal independent set in a graph of $n$ nodes and maximum degree $\Delta$, and $W$ is the maximum weight of a node in the graph. Whether the algorithm of Bar-Yehuda et al. is randomized or deterministic depends on the MIS algorithm that they use as a black box.

In this work, we improve the running time given by the result of Bar-Yehuda et al., by paying a constant multiplicative overhead in the approximation factor, and we prove the following two theorems. We denote by $G=(V,E,w)$ the input weighted graph, where $V$ is the set of nodes, $E$ is the set of edges, and $w$ is the vertex-weight function. Let $W(V)=\sum_{v\in V}w(v)$.

###### Theorem 1.

Given a weighted graph $G=(V,E,w)$ of $n$ nodes and maximum degree $\Delta$, there is a simple $O(\mathrm{MIS}(n,\Delta))$-round algorithm that finds an independent set of total weight at least $\frac{W(V)}{4(\Delta+1)}$, in the CONGEST model. Whether the algorithm is deterministic or randomized depends on the MIS algorithm that is used as a black box.

###### Theorem 2.

Given a weighted graph of nodes and maximum degree , there is a rounds algorithm that finds an independent set in of total weight at least , with high probability, in the CONGEST model.

Given the lower bound of $\Omega(\sqrt{\log n/\log\log n})$ rounds for finding an MIS, even for randomized algorithms, by Kuhn et al., Theorem 2 implies that finding an $O(\Delta)$-approximation to MaxIS is strictly easier than MIS. Recently, Boppana et al. showed that running a single round of Boppana’s classic algorithm results in an independent set of expected size at least $n/(\Delta+1)$. (Footnote 4: Boppana’s classic algorithm permutes the vertices uniformly at random and outputs the set of vertices that precede all their neighbors in the permutation. This algorithm first appeared in the book of Alon and Spencer, and is due to Boppana.) However, algorithms that work well in expectation do not necessarily work well with good probability. In fact, for this algorithm it is not very hard to construct examples in which the variance of the solution is very high, in which case the algorithm does not return the expected value with high probability. We show the following stronger theorem, which holds for any algorithm.

###### Theorem 3.

Any algorithm that finds an independent set of size $\Omega(n/\Delta)$ in unweighted graphs, with success probability at least $1-1/\log n$, must spend $\Omega(\log^* n)$ rounds, even in the LOCAL model.

Interestingly, this hardness result applies to graphs of maximum degree $\Delta=O(n/\log^* n)$. One might wonder whether we can extend the lower bound to graphs of smaller maximum degree. We rule out this possibility with the following theorem.

###### Theorem 4.

Given an unweighted graph of maximum degree $\Delta\le n/\log n$, there is an $O(1)$-round algorithm that finds an independent set of size $\Omega(n/\Delta)$ with high probability, in the CONGEST model.

The proof of Theorem 4 relies on a novel way to analyze Boppana’s algorithm using martingales.
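The single round of Boppana’s algorithm referred to above (randomly permute the vertices and output those that precede all their neighbors) can be illustrated with a short sequential simulation. This is only a sketch under stated assumptions: the adjacency-dictionary input format and the names `boppana_one_round` and `is_independent` are hypothetical, and the random permutation is realized by giving every node an independent random real-valued rank.

```python
import random


def boppana_one_round(adj, rng):
    """Simulate one round of the permutation-based algorithm.

    adj maps each node to the set of its neighbors (hypothetical format).
    Every node draws an independent random rank; distinct ranks induce a
    uniformly random permutation. A node is output if its rank is smaller
    than the rank of each of its neighbors.
    """
    rank = {u: rng.random() for u in adj}
    return {u for u in adj if all(rank[u] < rank[v] for v in adj[u])}


def is_independent(adj, nodes):
    """Return True if no two nodes in `nodes` are adjacent."""
    return all(v not in nodes for u in nodes for v in adj[u])


# Example: a cycle on 10 nodes.
n = 10
cycle = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
selected = boppana_one_round(cycle, random.Random(0))
```

Two adjacent nodes can never both be output, since only one of them has the smaller rank, so the result is always independent. The size of the output, however, is a random quantity, which is why the text distinguishes guarantees in expectation from guarantees with high probability.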

##### Road-map:

In the following section we provide a technical overview for our main result. Section 1.3 contains further related work. Section 1.4 contains some preliminaries. Section 2 contains our main result (Theorem 2). The proof of Theorem 1 also appears in Section 2, as we use it as part of the proof of Theorem 2. Our result for low-degree graphs is presented in Section 3. Our lower bound result is presented in Section 4. Finally, we conclude the paper with a discussion and open questions in Section 5.

### 1.2 Technical Overview

In this section we give a technical overview for our main result (Theorem 2). We first provide the high level idea for the unweighted case.

##### Unweighted Graphs:

Recall that $n$ and $\Delta$ are the number of nodes and maximum degree of the input graph $G$, respectively. The idea is to sample a subgraph $H$ of the input graph $G$ with the following properties. (1) The maximum degree in $H$ is small ($O(\log n)$). (2) The ratio between the number of nodes in $H$ and the maximum degree in $H$ is at least as in $G$. Given such a subgraph $H$ of $G$, it suffices to find an MIS in $H$ to get the desired approximation. Since $H$ has a small maximum degree, in order to find an MIS in $H$, we use Ghaffari’s recent algorithm that finds an MIS in rounds in the CONGEST model. Plugging in the maximum degree of $H$ implies a running time of $2^{O(\sqrt{\log\log n})}$ rounds, as desired. The sampling procedure for the unweighted case is simple. For simplicity, let us assume that the nodes know the maximum degree $\Delta$. (Footnote 5: In the actual algorithm, the nodes do not need to know $\Delta$. It suffices that each node knows the maximum degree in its neighborhood, which is local information that the nodes can learn in one round in the CONGEST model.) Each node joins $H$ independently, with probability proportional to $\log n/\Delta$. It is not very hard to show, via standard Chernoff (Fact 1) and Union Bound arguments, that $H$ has the desired properties.

##### Weighted Graphs:

Perhaps the first thing that comes to mind when trying to extend the sampling technique to weighted graphs is to sample a subgraph $H$ of $G$ where the ratio between the total weight in $H$ and the maximum degree of $H$ is the same as in $G$. However, there are a few challenges that arise when trying to apply this technique to weighted graphs. First, in the weighted case, an MIS does not necessarily constitute a $\Delta$-approximation to MaxIS. Therefore, even if we are able to sample a subgraph $H$ with the desired properties, running an MIS algorithm on $H$ might result in an independent set of very small weight. For this, we first prove, in Theorem 1, that while an MIS does not imply a $\Delta$-approximation in weighted graphs, there is a simple distributed algorithm that takes $O(\mathrm{MIS}(n,\Delta))$ rounds and achieves the desired approximation.

Furthermore, the sampling procedure that was used for the unweighted case does not work for the weighted case. In particular, if we sample each node with the same probability as in the unweighted case, then low-weight nodes will have the same probability to join $H$ as high-weight nodes, which might result in a graph of very small total weight. Intuitively, we need to take the weights into account in the sampling procedure. In fact, we show that it is enough to boost the sampling probability of a node $u$ by an additive term proportional to $w(u)\log n/W(V)$, where $w(u)$ is the weight of $u$ and $W(V)$ is the total sum of weights of nodes in the graph. (Footnote 6: In the actual algorithm, the nodes do not need to know $W(V)$. It suffices that each node knows the total sum of weights of nodes in its neighborhood, which is local information that the nodes can learn in one round in the CONGEST model.) Due to this boosting of the sampling probability, and due to the fact that the total sum of weights of nodes in $H$ is not a sum of indicator random variables as in the unweighted case, it does not suffice to use standard Chernoff and Union Bound arguments. Instead, we present a similar sampling procedure, but with a more involved analysis that uses Bernstein’s inequality (Fact 2).

### 1.3 Further Related Work

##### Distributed algorithms:

For computing an MIS, for many years, the only known algorithms were the classic ones by [1, 42] that take $O(\log n)$ rounds, even in the CONGEST model. In recent breakthroughs, Barenboim et al. presented a LOCAL algorithm that takes rounds, which was then improved by Ghaffari to rounds. More recently, Ghaffari presented a CONGEST algorithm that takes rounds.

On the other hand, Kuhn et al. showed a lower bound of $\Omega(\sqrt{\log n/\log\log n})$ rounds, even for the LOCAL model. All the algorithms mentioned earlier for finding an MIS are randomized and succeed with high probability. For deterministic algorithms, in  a -round algorithm is given using network decomposition, in the LOCAL model. In  a coloring-based -round CONGEST algorithm is given.

Recently, Ghaffari et al. , showed that there is an algorithm for the LOCAL model that finds a -approximation to in rounds, for a constant . The results in [39, 22] give a lower bound of rounds for any deterministic algorithm returning an independent set of size at least on a cycle. Furthermore,  provide a deterministic algorithm, and a randomized rounds algorithm, for -approximations in planar graphs.

Censor-Hillel et al.  showed that solving exact requires in the CONGEST model. More recently, Bachrach et al.  showed that computing a -approximation to MaxIS requires rounds, and that computing a -approximation requires rounds, in the CONGEST model.

##### Distributed algorithm achieving results in expectation

In , an -round randomized algorithm for an expected -approximation is presented for the unweighted case, along with a matching lower bound. Recently,  presented a single-round algorithm for unweighted graphs achieving an approximation ratio of , where is the Caro-Wei bound on , in the Beeping model, among other results. The results in  provide a simple algorithm which achieves an expected -approximation for the weighted MaxIS in a single communication round in the CONGEST model.

##### Sequential algorithms

In the sequential setting, an excellent summary of the known results is given by , which we overview in what follows. For general graphs, the best known algorithm achieves an -approximation factor . Assuming ,  shows that there is no efficient -approximation algorithm for every constant .

When the degree is bounded by , a simple coloring based algorithm achieves a -approximation in linear time, even for weighted graphs. For unweighted graphs, a -approximation is achieved by greedily adding the node with minimal degree to the independent set and removing its neighbors . The best known approximation factor is [2, 30, 34, 31, 36]. Assuming the Unique Games Conjecture, there is no efficient algorithm that can achieve an approximation factor of . Assuming , a lower bound of on the approximation factor is given in .

### 1.4 Preliminaries

Some of our proofs use the following standard probabilistic tools. One great source for the following concentration bounds is the book by Alon and Spencer. These bounds can also be found in many lecture notes on basic tail and concentration bounds.

###### Fact 1.

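(Multiplicative Chernoff Bound). Let $X_1,\dots,X_n$ be independent random variables taking values in $\{0,1\}$. Let $X$ denote their sum and let $\mu=E[X]$ denote the sum’s expected value. Then for any $\epsilon\ge 0$, it holds that:

$$\Pr\big[|X-\mu|\ge\epsilon\mu\big]\le 2\exp\left(-\frac{\epsilon^2}{2+\epsilon}\,\mu\right)$$

###### Fact 2.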

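(Bernstein’s Inequality). Let $X_1,\dots,X_n$ be independent random variables such that $|X_i-E[X_i]|\le M$ for all $i$. Let $X$ denote their sum and let $\mu=E[X]$ denote the sum’s expected value. Then for any positive $t$, it holds that:

$$\Pr\big[|X-\mu|\ge t\big]\le 2\exp\left(-\frac{t^2/2}{Mt/3+\sum_{i=1}^{n}\mathrm{Var}(X_i)}\right)$$

###### Fact 3.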

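(One-sided Azuma’s Inequality). Suppose $\{X_k : k=0,1,2,\dots\}$ is a martingale and that $|X_k-X_{k-1}|\le c_k$ almost surely. Then, for all positive integers $N$ and all positive reals $t$,

$$\Pr\big[X_N-X_0\le -t\big]\le\exp\left(-\frac{t^2}{2\sum_{i=1}^{N}c_i^2}\right)$$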

To show our lower bound for randomized algorithms, we reduce from the following randomized lower bound for finding a maximal independent set in a cycle:

###### Theorem 5.

(Lower bound for the cycle ). Any randomized algorithm in the LOCAL model for maximal independent set that takes fewer than rounds succeeds with probability at most , even for a cycle of length .

##### Assumptions:

In all of our upper and lower bound proofs, we do not assume that the nodes have any global information. In particular, they do not know $n$ or $\Delta$. The only information that each node has before the algorithm starts is its own identifier, and some polynomial upper bound on $n$. (Since the nodes can send $O(\log n)$ bits in each round to each of their neighbors, naturally they know some polynomial upper bound on $n$.)

##### Notation:

We denote by $N^+(u)=N(u)\cup\{u\}$ the neighborhood of $u$ (containing $u$ itself), where $N(u)$ is the set of neighbors of $u$. We denote by $d(u)=|N(u)|$ the number of neighbors of a node $u$. We denote by $d_{\max}(N^+(u))$ the maximum degree of a node in the neighborhood of $u$. That is, $d_{\max}(N^+(u))=\max_{v\in N^+(u)}d(v)$. For a subset $U\subseteq V$, we denote by $W(U)$ the total weight of nodes in $U$. That is, $W(U)=\sum_{u\in U}w(u)$.

## 2 Upper Bound

In this section we present an algorithm that finds an $O(\Delta)$-approximation to Maximum Weighted Independent Set in $O(\mathrm{MIS}(n,\log n))$ rounds with high probability. We first show a very simple, but slower, algorithm that achieves the same approximation ratio in $O(\mathrm{MIS}(n,\Delta))$ rounds, where $\mathrm{MIS}(n,\Delta)$ is the complexity (in terms of number of rounds) of finding a maximal independent set in graphs of $n$ nodes and maximum degree $\Delta$. Then, we present an algorithm that uses the simple algorithm as a subroutine to achieve a running time of $O(\mathrm{MIS}(n,\log n))$ rounds. Plugging in the recent algorithm by Ghaffari that finds a maximal independent set in rounds, implies a running time of $2^{O(\sqrt{\log\log n})}$.

### 2.1 Algorithm in O(MIS(n,Δ)) Rounds

In this section we prove the following theorem.

• Given a weighted graph $G=(V,E,w)$ of $n$ nodes and maximum degree $\Delta$, there is a simple $O(\mathrm{MIS}(n,\Delta))$-round algorithm that finds an independent set of total weight at least $\frac{W(V)}{4(\Delta+1)}$, in the CONGEST model. Whether the algorithm is deterministic or randomized depends on the MIS algorithm that is used as a black box.

##### Algorithm

Let us first assume that the nodes know $\Delta$, and then we show how to remove this assumption. We say that a node $u$ is good if

$$w(u)\ge\frac{1}{2(\Delta+1)}\sum_{v\in N^+(u)}w(v).$$

Let $V_H$ be the set of good nodes, and let $H$ be the subgraph of $G$ induced by $V_H$. The algorithm simply computes a maximal independent set $U$ in $H$, and returns $U$. The claim is that the returned independent set has total weight at least $\frac{W(V)}{4(\Delta+1)}$. To prove this, we first prove, in Claim 1, that for any graph, the total weight of good nodes is at least half of the total weight in the graph. Then, in Lemma 1, we show that any maximal independent set in the subgraph induced by good nodes is of weight at least $\frac{W(V)}{4(\Delta+1)}$.

###### Claim 1.

Let $G=(V,E,w)$, and let $V_H$ be the set of good nodes in $G$. It holds that $W(V_H)\ge W(V)/2$.

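###### Proof.

Every node outside $V_H$ violates the condition of a good node, and each weight $w(v)$ is counted at most $|N^+(v)|\le\Delta+1$ times in the double sum below. Hence,

$$W(V\setminus V_H)=\sum_{u\in V\setminus V_H}w(u)\le\sum_{u\in V\setminus V_H}\frac{1}{2(\Delta+1)}\sum_{v\in N^+(u)}w(v)\le\frac{1}{2(\Delta+1)}\sum_{u\in V}(\Delta+1)\cdot w(u)=\frac{W(V)}{2}.$$

∎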

###### Lemma 1.

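Let $G=(V,E,w)$, let $V_H$ be the set of good nodes in $G$, and let $U$ be a maximal independent set in the subgraph $H$ induced by $V_H$. It holds that

$$\sum_{u\in U}w(u)\ge\frac{1}{2(\Delta+1)}W(V_H)\ge\frac{1}{4(\Delta+1)}W(V).$$

###### Proof.

$$\sum_{u\in U}w(u)\ge\sum_{u\in U}\frac{1}{2(\Delta+1)}\sum_{v\in N^+(u)}w(v)\ge\sum_{u\in U}\frac{1}{2(\Delta+1)}\sum_{v\in N^+(u)\cap V_H}w(v)\ge\frac{1}{2(\Delta+1)}\sum_{v\in V_H}w(v),$$

where the first inequality holds since every node in $U$ is good, and the last inequality holds since $U$ is a maximal independent set in $H$, so every node of $V_H$ belongs to $N^+(u)$ for some $u\in U$. Since we proved in Claim 1 that $W(V_H)\ge W(V)/2$, this completes the proof. ∎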

##### Removing the assumption that the nodes know Δ:

In order for the algorithm above to work, the nodes need to know $\Delta$, which is global information about the graph. In order to remove this assumption, we can modify the definition of a good node as follows. Recall that $d_{\max}(N^+(u))$ is the maximum degree of a node in the neighborhood of $u$. That is, $d_{\max}(N^+(u))=\max_{v\in N^+(u)}|N(v)|$, where $N^+(u)=N(u)\cup\{u\}$. Call a node $u$ good if it holds that

$$w(u)\ge\frac{1}{2(d_{\max}(N^+(u))+1)}\sum_{v\in N^+(u)}w(v).$$

One can easily verify that Claim 1 and Lemma 1 still hold under this definition of a good node. The main advantage of this definition is that the maximum degree in the neighborhood of a node is local information that can be learned in one round in the CONGEST model.
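The good-node algorithm of this subsection can be illustrated with a small sequential sketch. It makes several assumptions that are not part of the paper: the graph is given as an adjacency dictionary, the function names are hypothetical, and a sequential greedy procedure stands in for the black-box distributed MIS algorithm. It uses the local definition of a good node, based on $d_{\max}(N^+(u))$.

```python
def good_nodes(adj, w):
    """Return the nodes satisfying the local 'good node' condition.

    adj maps each node to the set of its neighbors; w maps each node to
    its weight (both are hypothetical input formats).
    """
    good = set()
    for u in adj:
        closed = adj[u] | {u}                      # N+(u)
        d_max = max(len(adj[v]) for v in closed)   # d_max(N+(u))
        if w[u] >= sum(w[v] for v in closed) / (2 * (d_max + 1)):
            good.add(u)
    return good


def greedy_mis(adj, nodes):
    """Sequential stand-in for an MIS algorithm on the subgraph induced by `nodes`."""
    mis = set()
    for u in sorted(nodes):
        if not (adj[u] & mis):
            mis.add(u)
    return mis


def weighted_independent_set(adj, w):
    """Compute an MIS among the good nodes, as in the algorithm above."""
    return greedy_mis(adj, good_nodes(adj, w))


# Example: a star whose center (node 0) is light and whose leaves are heavy.
adj = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
w = {0: 1, 1: 10, 2: 10, 3: 10, 4: 10}
chosen = weighted_independent_set(adj, w)
```

In this example the center is not good, so the MIS is computed among the leaves only and all four leaves are returned. An MIS computed on the whole star could instead return just the center, of weight 1, which is why restricting attention to good nodes matters in the weighted case.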

##### Success with high probability:

Given a graph of nodes, an algorithm that finds a maximal independent set in with high probability is an algorithm that succeeds with probability at least for some constant . In the algorithm above, we are running a maximal independent set algorithm on a subgraph of . Since is potentially smaller than , one might wonder whether the algorithm above actually succeeds with high probability with respect to . The main idea is to use an algorithm that is intended to work for graphs with nodes, rather than nodes. We prove the following lemma, which is helpful for the results achieved in the following subsections as well, when we deal with subgraphs of .

###### Lemma 2.

Let be an -rounds algorithm that finds a maximal independent set with success probability , in a graph of nodes. Let be a graph of nodes with -bit identifiers, for some constant , and let be the maximum degree in . There is an -round algorithm that finds a maximal independent set in with success probability .

###### Proof.

The idea is to pad with more vertices and then to run an algorithm for maximal independent set on the new graph. In fact, the easiest way to see this is to argue that finds a maximal independent set with high probability on the graph obtained by adding isolated nodes to with unique identifiers. Since any maximal independent set in induces a maximal independent set in , the claim follows. However, some of the algorithms in the CONGEST model assume that the input graph is connected. (Footnote 7: This assumption is usually made for global problems such as computing the diameter or all-pairs shortest paths. This is because global problems admit an $\Omega(D)$ lower bound, where $D$ is the diameter of the network, which is infinite for disconnected graphs. While assuming connectivity might not seem reasonable for the MIS problem, for completeness, we want our reduction to hold even for algorithms that make this assumption.) To get around the connectivity issue, we define the graph obtained by adding a path of nodes with unique -bit identifiers to each node that is a local minimum in (with respect to the identifiers). Each node that is added to a path connected to a local minimum , is given a unique identifier starting with the bits of the identifier of as the LSBs (least significant bits), followed by another bits to ensure that the identifier is unique with respect to the other nodes on the same path. Observe that is a graph of nodes, with unique identifiers of bits. Hence, is an appropriate input to the CONGEST model. Furthermore, given a maximal independent set of , one can easily find a maximal independent set in , as follows. Let . Each node that is a local minimum in joins if none of its neighbors in is in . It holds that (after adding the additional nodes) is a maximal independent set in .
Since the nodes in can easily simulate a maximal independent set algorithm in , without any additional communication cost, it follows that the total running time is , where and are the set of nodes and maximum degree in , respectively. Since , and , it holds that . Moreover, for any we know that for the specific problem of finding a maximal independent set it holds that . This is because the round-complexity of finding a maximal independent set is at most logarithmic in the number of nodes. Finally, since a maximal independent set algorithm in succeeds with probability , this completes the proof. ∎

Since the algorithm used to prove Theorem 1 is an -based algorithm, using Lemma 2, we can generalize Theorem 1 for graphs with number of nodes less than . Specifically, we obtain the following theorem as a corollary of Lemma 2 and Theorem 1, which we use as a black-box in the following subsections.

###### Theorem 6.

Given a weighted graph of nodes and maximum degree . There is an -rounds algorithm that finds an independent set in of total weight at least , with success probability .

### 2.2 Algorithm in O(MIS(n,logn)) Rounds

In this section we show an algorithm that finds an independent set of size in rounds. As a warm-up, in Section 2.2.1, we show the result for the unweighted case, and then, in Section 2.2.2 we show how to extend it for the weighted case as well. Both algorithms presented in sections 2.2.1 and  2.2.2, have the same following two-step structure.

1. First, we sample a subgraph $H$ of $G$ with the following two properties:

   1. The maximum degree in $H$ is at most $O(\log n)$.

   2. The ratio between the total weight and the maximum degree in $H$ is at least as in $G$.

2. Then, we use Theorem 6 to find an independent set in of size , in rounds, with success probability at least .

#### 2.2.1 Warm-Up: Unweighted Graphs

In this section we prove the following theorem.

###### Theorem 7.

Given an unweighted graph of nodes and maximum degree , there is a rounds algorithm that finds an independent set in of size at least , with probability .

##### Algorithm:

We start by sampling a subgraph $H$ of $G$, as follows. Recall that $d_{\max}(N^+(u))$ is the maximum degree of a node in the neighborhood of $u$. Let $c$ be a constant. Each node $u$ joins $H$ with probability

$$p(u)=\frac{c\log n}{d_{\max}(N^+(u))},$$

where each node $u$ with $p(u)\ge 1$ joins $H$ deterministically. Let $H$ denote the subgraph induced by the nodes that joined. The algorithm finds a maximal independent set in $H$, using Ghaffari’s algorithm, and returns it. We prove the following two lemmas about the properties of $H$.
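The two steps above (sample, then compute an MIS on the sample) can be sketched sequentially as follows. The sketch rests on assumptions that are not taken from the paper: $\log$ is taken to be the natural logarithm, the function names and adjacency-dictionary format are hypothetical, and a sequential greedy MIS stands in for Ghaffari’s distributed algorithm.

```python
import math
import random


def sample_subgraph_nodes(adj, c, rng):
    """Sample each node u with probability c*log(n)/d_max(N+(u)), capped at 1."""
    n = len(adj)
    target = c * math.log(n)  # natural logarithm is an assumption of this sketch
    sampled = set()
    for u in adj:
        d_max = max(len(adj[v]) for v in adj[u] | {u})
        # Nodes whose probability would be at least 1 join deterministically.
        if d_max <= target or rng.random() < target / d_max:
            sampled.add(u)
    return sampled


def greedy_mis(adj, nodes):
    """Sequential stand-in for a distributed MIS algorithm on the induced subgraph."""
    mis = set()
    for u in sorted(nodes):
        if not (adj[u] & mis):
            mis.add(u)
    return mis


rng = random.Random(1)

# A path has tiny degrees, so every node joins deterministically.
path = {i: {j for j in (i - 1, i + 1) if 0 <= j < 50} for i in range(50)}
path_sample = sample_subgraph_nodes(path, c=2.0, rng=rng)

# A clique has large degrees, so only a small random subset is expected to join.
clique = {i: set(range(60)) - {i} for i in range(60)}
clique_sample = sample_subgraph_nodes(clique, c=0.5, rng=rng)
clique_mis = greedy_mis(clique, clique_sample)
```

Since each node only needs the degrees of its neighbors to evaluate its own sampling probability, the sampling step corresponds to a single communication round; the sequential loop here is only a simulation of that local decision.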

###### Lemma 3.

The maximum degree in $H$ is $O(\log n)$, with high probability.

###### Proof.

For any node with degree at most $c\log n$ in $G$, the claim follows trivially. Let $u$ be a node with degree higher than $c\log n$ in $G$. It holds that for any neighbor $v$ of $u$, $d_{\max}(N^+(v))\ge|N(u)|$. This is because $u\in N^+(v)$, so $|N(u)|$ is a lower bound on $d_{\max}(N^+(v))$ for any neighbor $v$ of $u$. This implies that $p(v)\le\frac{c\log n}{|N(u)|}$ for any neighbor $v$ of $u$. Therefore, the expected number of neighbors of $u$ in $H$ is at most:

$$\sum_{v\in N(u)}p(v)=\sum_{v\in N(u)}\frac{c\log n}{d_{\max}(N^+(v))}\le\sum_{v\in N(u)}\frac{c\log n}{|N(u)|}=c\log n.$$

By applying Chernoff’s bound with a large enough constant $c$ (Fact 1), we conclude that the degree of a given node in $H$ is $O(\log n)$ with high probability, and by applying a standard Union-Bound argument we conclude that the maximum degree of $H$ is $O(\log n)$, with high probability. ∎
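As a worked instance of the Chernoff step, here is a sketch of the calculation under assumptions that the text does not fix: $\log$ is the natural logarithm, $\epsilon=1$ in Fact 1, and the mean is taken at its upper bound $\mu=c\log n$ (the standard upper-tail bound remains valid when $\mu$ is replaced by an upper bound on the mean). Writing $X_u$ for the number of neighbors of a node $u$ that join $H$:

$$\Pr\big[X_u\ge 2c\log n\big]\le 2\exp\left(-\frac{1}{3}\,c\log n\right)=2n^{-c/3},\qquad\Pr\big[\exists u:\ X_u\ge 2c\log n\big]\le n\cdot 2n^{-c/3}=2n^{1-c/3}.$$

For example, any constant $c\ge 6$ makes the overall failure probability at most $2/n$.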

###### Lemma 4.

The number of nodes in is , with high probability.

###### Proof.

Let . The proof is split into two cases: (1) : in this case, at least nodes join , as any node in joins deterministically. (2) : observe that the expected number of nodes in is . Furthermore, since the number of nodes in is a sum of independent random variables, one can apply Chernoff’s bound (Fact 1) to achieve that the number of nodes in concentrates around its expectation (up to constant factors) with high probability. ∎

###### Proof of Theorem 7.

Since both Lemma 3 and Lemma 4 above hold with high probability, we can apply another standard Union-Bound argument to conclude that both of them hold simultaneously with high probability. Therefore, by computing a maximal independent set in $H$, we get an independent set of size at least , as desired. To find a maximal independent set in $H$ with high probability, we use Lemma 2, plugging in Ghaffari’s algorithm. This implies a running time of rounds, as desired. ∎

#### 2.2.2 Weighted Graphs

• Given a weighted graph of nodes and maximum degree , there is a rounds algorithm that finds an independent set in of total weight at least , with high probability, in the CONGEST model.

##### Main idea of the sampling

Recall that in the unweighted case, each node $u$ joins a subgraph $H$ with probability $\frac{c\log n}{d_{\max}(N^+(u))}$, where $d_{\max}(N^+(u))$ is the maximum degree of a node in the neighborhood of $u$. It turns out that the same sampling method does not work for the weighted case. Here, we need a weighted analog of $d_{\max}(N^+(u))$. Recall that $W(N(u))$ is the sum of weights of the neighbors of $u$, which can be considered as the weighted degree of $u$. Hence, we define the weighted analog of $d_{\max}(N^+(u))$ as follows. Let $W_{\max}(N^+(u))=\max_{v\in N^+(u)}W(N(v))$. That is, $W_{\max}(N^+(u))$ is the maximum weighted degree of a node in the neighborhood of $u$. Now we are ready to present the algorithm for the weighted case.

##### Algorithm:

We start by sampling a subgraph $H$ of $G$, as follows. Let $c$ be a constant to be chosen later. Each node $u$ joins $H$ with probability

$$p(u)=c\log n\cdot\left(\frac{1}{d_{\max}(N^+(u))}+\frac{w(u)}{W_{\max}(N^+(u))}\right),$$

where each node $u$ with $p(u)\ge 1$ joins $H$ deterministically. (Footnote 8: Perhaps the first thing that comes to mind is to try to sample each node with probability , as this is the natural extension of , which works for the unweighted case. However, it turns out that sampling with probability might result in a subgraph of a total weight , which is too small for our purposes. It turns out that in order to get around this issue, it suffices to boost this sampling probability by an additive factor of .) We define $H$ to be the subgraph induced by the nodes that joined. Using Theorem 6, the algorithm finds an independent set in $H$ of total weight at least , in rounds, and returns it. It remains to show that . We show this in two separate lemmas. Lemma 5 shows that , and Lemma 7 shows that .
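The weighted sampling probability can be evaluated locally, as the following sketch illustrates. The assumptions here are not from the paper: $\log$ is the natural logarithm, the probability is capped at 1 to model nodes that join deterministically, nodes with no neighbors (or with zero-weight neighborhoods) are treated as joining with probability 1, and the function name and input format are hypothetical.

```python
import math


def sampling_probability(adj, w, u, c):
    """Evaluate p(u) = c*log(n) * (1/d_max(N+(u)) + w(u)/W_max(N+(u))), capped at 1."""
    n = len(adj)
    closed = adj[u] | {u}                                     # N+(u)
    d_max = max(len(adj[v]) for v in closed)                  # maximum degree in N+(u)
    w_max = max(sum(w[x] for x in adj[v]) for v in closed)    # maximum weighted degree in N+(u)
    if d_max == 0 or w_max == 0:
        return 1.0  # degenerate neighborhoods: an assumption of this sketch
    p = c * math.log(n) * (1.0 / d_max + w[u] / w_max)
    return min(1.0, p)


# Example: a clique on 100 nodes with one heavy node (node 0) and 99 light nodes.
n = 100
adj = {i: set(range(n)) - {i} for i in range(n)}
w = {i: 1 for i in range(n)}
w[0] = 50
p_heavy = sampling_probability(adj, w, 0, c=1.0)
p_light = sampling_probability(adj, w, 1, c=1.0)
```

In this example the weight term pushes the heavy node’s probability to 1, while a light node’s probability stays small but still exceeds the purely degree-based value $c\log n/d_{\max}(N^+(u))$.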

###### Lemma 5.

The maximum degree in $H$ is $O(\log n)$, with high probability.

###### Proof.

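Let $V^+=\{v\in V : p(v)\ge 1\}$ and $V^-=V\setminus V^+$. We show that each node $u$ has at most $2c\log n$ neighbors in $V^+$, and, with high probability, $O(\log n)$ neighbors in $V^-$ that join $H$. Hence, it implies that each node has $O(\log n)$ neighbors in total in $H$.

1. $N(u)\cap V^+$: We prove a stronger claim, that $|N(u)\cap V^+|\le 2c\log n$. Assume towards a contradiction that there are more than $2c\log n$ nodes in $N(u)\cap V^+$. Since each node $v\in V^+$ has $p(v)\ge 1$, it holds that

$$\sum_{v\in N(u)\cap V^+}p(v)>2c\log n.$$

On the other hand, it holds that

$$\sum_{v\in N(u)\cap V^+}p(v)\le\sum_{v\in N(u)}p(v)=\sum_{v\in N(u)}c\log n\cdot\left(\frac{1}{d_{\max}(N^+(v))}+\frac{w(v)}{W_{\max}(N^+(v))}\right).$$

Since $|N(u)|$ and $W(N(u))$ are lower bounds on $d_{\max}(N^+(v))$ and $W_{\max}(N^+(v))$, respectively, for any neighbor $v$ of $u$, it holds that

$$\sum_{v\in N(u)}c\log n\cdot\left(\frac{1}{d_{\max}(N^+(v))}+\frac{w(v)}{W_{\max}(N^+(v))}\right)\le\sum_{v\in N(u)}c\log n\cdot\left(\frac{1}{|N(u)|}+\frac{w(v)}{W(N(u))}\right)=2c\log n,$$

which is a contradiction.

2. $N(u)\cap V^-$: The proof for this case is similar to the one for the unweighted case. Observe that the expected number of neighbors of $u$ in $V^-$ that join $H$ is at most

$$\sum_{v\in N(u)}p(v)\le 2c\log n,$$

as we showed in the previous case. Since this number is a sum of independent random variables, one can apply Chernoff’s bound (Fact 1) to conclude that it concentrates around its expectation with high probability.

By applying a standard Union-Bound argument over all the nodes, we conclude that the maximum degree in $H$ is $O(\log n)$ with high probability. ∎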

The rest of this section is devoted to the task of proving that . First, we start by proving a slightly weaker lemma, that assumes that for all , . Later, we show how to remove this assumption in the proof of Lemma 7.

###### Lemma 6.

Assume , for any . It holds that , with high probability.

##### Road-map of the proof of Lemma 6:

Let be a sorting of the weights of nodes in in a decreasing order (where ties are broken arbitrarily). Let , and let . That is, contains the heaviest nodes, and contains all the other nodes. The proof is split into the following two cases that are proven separately in claims 2 and 3.

1. : In this case, at least a constant fraction of the total weight is distributed among high-weight nodes. Intuitively, we need to make sure that we get many of these high-weight nodes. Since the number of high-weight nodes that are sampled is a sum of independent random variables, we are able to use Chernoff’s bound to prove that many of them are sampled, with high probability. The full proof for this case is presented in Claim 2.

2. : In this case, at least half of the total weight is distributed among low-weight nodes. Therefore, it is sufficient to show that . The key property here is that we can bound the maximum weight of a node in by . We show how to use this property together with Bernstein’s inequality to prove Lemma 6 for this case. The full proof for this case is presented in Claim 3.

###### Claim 2.

Assume that for all , it holds that . Let . If , then , with high probability.

###### Proof.

Let . We start by showing that at least a constant fraction of the total weight in is distributed among nodes in . Let , we show that :

$$W(S^-)\le\sum_{u\in S^-}w(u)\le\sum_{u\in S^-}\frac{W(V)}{4\Delta}\le\frac{W(V)}{4},$$

where the last inequality holds since $|S^-|\le\Delta$. Therefore, $W(S^+)\ge W(V)/4$. Next, we show that many nodes of $S^+$ are sampled, by using Chernoff’s bound. (Footnote 9: One might wonder how can we actually show that , while in general the size of might be . The thing to note here is that if for all the nodes in the graph, then any subset of nodes with total weight must contain at least nodes. This is because if for all the nodes in the graph, it also implies that for all the nodes in the graph. Hence, in order for a subset to have a total weight , it must contain nodes. Indeed, the proof of this claim relies on this assumption that is stated in the claim.) Let $x_u$ be a random variable indicating whether $u$ joins $H$, and let $X=\sum_{u\in S^+}x_u$. We show that the expectation of $X$ is at least $\frac{c\log n}{4}$.

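$$\begin{aligned}E[X]&=\sum_{u\in S^+}E[x_u]=\sum_{u\in S^+}p(u)=\sum_{u\in S^+}c\log n\cdot\left(\frac{1}{d_{\max}(N^+(u))}+\frac{w(u)}{W_{\max}(N^+(u))}\right)\\&\ge\sum_{u\in S^+}\frac{w(u)\,c\log n}{W(V)}\ge\frac{c\log n}{W(V)}\cdot\sum_{u\in S^+}w(u)=\frac{W(S^+)\,c\log n}{W(V)}\ge\frac{c\log n}{4}.\end{aligned}$$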

Furthermore, since $X$ is a sum of independent random variables with expectation at least $\frac{c\log n}{4}$, by applying Chernoff’s bound (Fact 1), we conclude that there are $\Omega(\log n)$ nodes of $S^+$ in $H$, with high probability. Since each node in $S^+$ has weight at least $\frac{W(V)}{4\Delta}$, this implies that the total weight in $H$ is $\Omega\left(\frac{W(V)\log n}{\Delta}\right)$, with high probability, as desired. ∎

###### Claim 3.

Assume that for all , it holds that . Let . If , then , with high probability.

###### Proof.

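Let $x_u$ be a random variable indicating whether $u$ joins $H$, let $y_u=x_u\cdot w(u)$, and let $Y=\sum_{u\in V_{low}}y_u$. We prove the following 3 properties:

1. $E[Y]\ge\frac{W(V)\,c\log n}{2\Delta}$: This is because

$$\begin{aligned}E[Y]&=\sum_{u\in V_{low}}p(u)\cdot w(u)=\sum_{u\in V_{low}}c\log n\cdot\left(\frac{1}{d_{\max}(N^+(u))}+\frac{w(u)}{W_{\max}(N^+(u))}\right)\cdot w(u)\\&\ge\sum_{u\in V_{low}}\frac{w(u)\,c\log n}{\Delta}=\frac{W(V_{low})\,c\log n}{\Delta}\ge\frac{W(V)\,c\log n}{2\Delta},\end{aligned}$$

where the last inequality holds since $W(V_{low})\ge W(V)/2$.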

2. For any $j$, it holds that $w_j\le W(V)/j$: This is because for any $j$, it holds that

$$w_j\cdot j\le\sum_{i=1}^{j}w_i\le\sum_{v\in V}w(v)=W(V),$$

where the first inequality holds since $w_j$ is the minimum among $w_1,\dots,w_j$. Hence, since each node in has weight where , we have that for any .

3. It holds that : First, observe that

$$\sum_{u\in V_{low}}E[y_u^2]\le w_{\max}(V_{low})\cdot\sum_{u\in V_{low}}E[y_u]=w_{\max}(V_{low})\cdot E[Y],$$

where $w_{\max}(V_{low})$ is the maximum weight of a node in $V_{low}$ (i.e.,