Computing a maximal independent set (MIS) in a network is one of the central problems in distributed computing. About 35 years ago, Alon, Babai, and Itai and Luby presented a randomized distributed algorithm for MIS, running on $n$-node graphs in $O(\log n)$ rounds with high probability. (Throughout, we use “with high probability (whp)” to mean with probability at least $1 - 1/n^{c}$, for some constant $c \geq 1$.) Since then the MIS problem has been studied extensively, and recently there has been some exciting progress in designing faster distributed MIS algorithms. For $n$-node graphs with maximum degree $\Delta$, Ghaffari presented a randomized MIS algorithm running in
$O(\log \Delta) + 2^{O(\sqrt{\log\log n})}$ rounds,
improving over the algorithm of Barenboim et al. that runs in
$O(\log^2 \Delta) + 2^{O(\sqrt{\log\log n})}$ rounds.
It was further improved by Rozhon and Ghaffari to
$O(\log \Delta + \mathrm{poly}(\log\log n))$
rounds [29, Corollary ].
While the above results constitute a significant improvement in our understanding of the round complexity of the MIS problem, it should be noted that in general graphs, the best-known running time is still $O(\log n)$ (even for randomized algorithms). Furthermore, there is a lower bound of $\Omega\left(\min\left\{\frac{\log \Delta}{\log\log \Delta}, \sqrt{\frac{\log n}{\log\log n}}\right\}\right)$ rounds
due to Kuhn et al.  that also applies to randomized algorithms. Thus, for example, say, when , it follows that one cannot hope for algorithms faster than rounds. Balliu et al. showed recently that one cannot hope for algorithms that run within rounds for the regimes where (for randomized algorithms) [4, Corollary ] and (for deterministic algorithms) [4, Corollary ].
1.1 Energy considerations, Sleeping model, and Node-averaged round complexity
It is important to note that all prior works on MIS, including the ones mentioned above, are focused on measuring the worst-case number of rounds for nodes to finish. In other words, the time complexity is measured as the time (number of rounds) needed for the last (slowest) node(s) to finish. As mentioned above, the best-known bound for this measure is still $O(\log n)$ for general graphs (even for randomized algorithms). In this paper, we take an alternative approach to designing MIS algorithms motivated by two main considerations.
The first consideration is the motivation of designing energy-efficient algorithms for ad hoc wireless and sensor networks. In such networks, a node’s energy consumption depends on the amount of time it is actively communicating with nodes; more importantly, significant energy is spent by a node even when it is just idle, i.e., waiting to hear from a neighbor. Experimental results show that the energy consumption in an idle state is only slightly smaller than that in a transmitting or receiving state [32, 10]. Thus, even though there might be no messages exchanged between a node and its sender, a node might be spending quite a bit of energy if it is just waiting to receive a message.
On the other hand, the energy consumption in the “sleeping” state, i.e., when it has switched off its communication devices and is not sending, receiving or listening, is significantly less than in the transmitting/receiving/idle (listening) state (see e.g., [32, 10, 17, 30, 31]). A node may cleverly enter and exit sleeping mode to save energy during the course of an algorithm. In fact, this has been exploited by protocols to save power in ad hoc wireless networks by judiciously switching between two states — sleeping and awake — as needed (the MAC layer provides support for switching between states [32, 31, 24]).
The second consideration, motivated by the first, is saving the total amount of energy spent by the nodes during the course of an algorithm. Note that in sleeping mode, we assume that there is no energy spent. Thus the total energy is measured as proportional to the total time (number of rounds) that nodes have spent in the “awake” or “normal” mode (i.e., non-sleeping mode). In this paper, we thus focus on minimizing the total number of rounds — or equivalently the average number of rounds — spent by all nodes in their awake state during an algorithm. Our goal is to design distributed algorithms with low node-averaged awake complexity (see Section 1.2).
Motivated by the above considerations, we posit the sleeping model for distributed algorithms which is a generalization of the traditional model (a more detailed description is given in Section 1.2). In the sleeping model, a node can be in either of the two states — sleeping or awake (or normal). While in the traditional model nodes are only in the awake state, in the sleeping model nodes have the option of entering sleeping state at any round as well as exiting the sleeping state and entering the awake state at a later round. In the sleeping state, a node does not send or receive messages and messages sent to it by other nodes are lost; it also does not do any local computation. If a node enters a sleeping state, then it is assumed that it does not incur any time or message cost (or other resource costs, such as energy). Some previous models (see e.g.,  and the references therein) assumed that nodes incur little or no energy only when they are not sending/receiving messages; however, this is not true in real-world ad hoc wireless and sensor networks, where considerable energy is spent even when nodes are “idle” or “listening” for messages. The sleeping model is more realistic, since in the sleeping state nodes turn off their communication (e.g., wireless) devices fully. However, it becomes more challenging to design efficient algorithms under this model.
1.2 Model and Complexity Measures
Before we define the sleeping model, we will recall the traditional model used in distributed algorithms.
We consider the standard synchronous Congest model, where nodes are always “awake” from the start of the algorithm (i.e., round zero). We are given a distributed network of $n$ nodes, modeled as an undirected graph $G = (V, E)$. Each node hosts a processor with limited initial knowledge. We assume that nodes have unique IDs (this assumption is not essential, but it simplifies presentation), and at the beginning of the computation each node is provided its ID as input. We assume that each node has ports, each with a unique port number, and that each incident edge is connected to one distinct port. We also assume that nodes know $n$, the number of nodes in the network. Thus, a node has only local knowledge.
Nodes are allowed to communicate through the edges of the graph and it is assumed that communication is synchronous and occurs in rounds. In particular, we assume that each node knows the current round number (starting from round 0). In each round, each node can perform some local computation (which finishes in the same round), including accessing a private source of randomness, and can exchange (possibly distinct) $O(\log n)$-bit messages with each of its neighboring nodes.
This model of distributed computation is called the Congest$(\log n)$ model, or simply the Congest model. We note that our algorithms also apply to the Local model, another well-studied model in which there is no restriction on the size of the messages sent per edge per round. The Local (resp. Congest) model does not put any constraint on the computational power of the nodes, but we do not abuse this aspect: our algorithms perform only light-weight computations.
We augment the traditional Congest (or Local) model by allowing nodes to enter a sleeping state at any round. In the sleeping model, a node can be in either of two states, sleeping or awake, before it finishes executing the algorithm locally (in other words, before it enters a final “termination” state). That is, any node $v$ can decide to sleep starting at any (specified) round of its choice, and it can wake up again later at any specified round, at which point it is back in the awake state. We assume all nodes know the correct round number whenever they are awake. (One way to implement the sleeping model is to assume that a node’s local (synchronized) clock is always running; before it enters the sleeping state, a node sets an “interrupt” (alarm) to wake at a specified later round. In practice, the IEEE 802.11 MAC provides low-level support for power management and synchronizing nodes to wake up for data delivery [32, 24, 31].) In the sleeping state, a node can be considered “dead”, so to speak: it does not send or receive messages, nor does it do any local computation. Messages sent to it by other nodes while it is sleeping are lost. Note that the traditional model, in which nodes are always in the awake state, can be considered a special case of the sleeping model. In the sleeping model, a node can potentially conserve its resources by judiciously determining if, when, and how long to sleep.
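The message-loss semantics of the sleeping model can be illustrated with a small round-based simulation. This is only a sketch under stated assumptions: the names `SleepyNode`, `run_round`, `wake_at`, and `payload` are hypothetical and do not come from the paper, and a message is modelled as delivered only when both endpoints are awake in the round in which it is sent.

```python
from dataclasses import dataclass, field


@dataclass
class SleepyNode:
    """Hypothetical node record for illustrating the sleeping model."""
    node_id: int
    wake_at: int = 0        # first round in which the node is awake again
    awake_rounds: int = 0   # only these rounds count towards awake complexity
    inbox: list = field(default_factory=list)

    def is_awake(self, rnd):
        return rnd >= self.wake_at

    def sleep_until(self, rnd):
        # Models setting an alarm on the always-running local clock.
        self.wake_at = rnd


def run_round(rnd, nodes, edges, payload):
    """Run one synchronous round on an undirected edge list."""
    awake = {v for v, node in nodes.items() if node.is_awake(rnd)}
    for v in awake:
        nodes[v].awake_rounds += 1
    for u, v in edges:
        for src, dst in ((u, v), (v, u)):
            # A sleeping node neither sends nor receives; a message
            # addressed to a sleeping node is lost, not queued.
            if src in awake and dst in awake:
                nodes[dst].inbox.append((rnd, src, payload(src)))
```

On a three-node path where one endpoint sleeps through the first two rounds, that endpoint receives only the message sent in the round in which it is awake, and it is charged a single awake round.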
Node-averaged Round Complexity.
For a distributed algorithm on a network in the sleeping model, we are primarily interested in the “node-averaged awake round complexity” or simply “node-averaged awake complexity”. For a deterministic algorithm, for a node $v$, let $a_v$ be the number of “awake” rounds needed for $v$ to finish, i.e., $a_v$ only counts the number of rounds in the awake state of $v$. Then we define the node-averaged awake complexity to be $\frac{1}{n}\sum_{v \in V} a_v$.
For a randomized algorithm, for a node $v$, let $A_v$ be the random variable denoting the number of awake rounds needed for $v$ to finish. Then let the random variable $A$ be defined as $A = \frac{1}{n}\sum_{v \in V} A_v$, i.e., the average of the random variables $A_v$. Then the expected node-averaged awake complexity of the randomized algorithm is
$\mathbb{E}[A] = \frac{1}{n}\sum_{v \in V} \mathbb{E}[A_v]$.
In this paper, we are mainly focused on this measure. However, one can also study other properties of $A$, e.g., high probability bounds on $A$.
Note that analogous definitions also naturally apply to the node-averaged round complexity in the traditional model, where all rounds are counted (since nodes are always awake). (Henceforth, when we do not specify “awake” in the complexity measure, we are referring to the traditional model; when we do, we are referring to the sleeping model.)
Worst-case Round Complexity.
We measure the “worst-case awake round complexity” (or simply the “worst-case awake complexity”) in the sleeping model as the worst-case number of awake rounds (from the start) taken by a node to finish the algorithm. That is, if $a_v$ is the number of awake rounds of node $v$ before it terminates, then the worst-case awake complexity is $\max_{v \in V} a_v$.
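The two complexity measures defined above can be computed directly from per-node awake-round counts. The following is a minimal sketch; the function names and the example counts are hypothetical and serve only to show how the average and the maximum can differ.

```python
def node_averaged_awake_complexity(awake_rounds):
    """Average number of awake rounds over all nodes."""
    return sum(awake_rounds.values()) / len(awake_rounds)


def worst_case_awake_complexity(awake_rounds):
    """Largest number of awake rounds spent by any single node."""
    return max(awake_rounds.values())


# Hypothetical counts: most nodes finish quickly, one node stays awake long.
example = {"a": 1, "b": 1, "c": 1, "d": 9}
average = node_averaged_awake_complexity(example)
worst = worst_case_awake_complexity(example)
```

In this example a single slow node dominates the worst-case measure while the average stays small, which is the gap the node-averaged measure is meant to capture.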
While our goal is to design distributed algorithms that are efficient with respect to node-averaged awake complexity, we would also like them to be efficient (as much as possible) with respect to the worst-case awake complexity, as well as the traditional worst-case round complexity, where all rounds (including rounds spent in sleeping state) are counted.
1.3 MIS in $O(1)$-rounds node-averaged complexity?
In light of the difficulty in breaking the $O(\log n)$-round (traditional) worst-case barrier and the lower bound for worst-case round complexity, as well as motivated by energy considerations discussed above, a fundamental question that we seek to answer is this:
Can we design a distributed MIS algorithm with $O(1)$-rounds node-averaged awake complexity?
Before we answer this question, it is worth studying the node-averaged round complexity of some well-known distributed MIS algorithms in the traditional model. It is not clear whether Luby’s algorithms (both versions of it [27, 20]) give -round node-averaged complexity, or even -round node-averaged complexity. The same is the situation with the algorithms of Alon et al.  and Karp et al.  as well as known deterministic MIS algorithms [3, 26] (see also [5, 29]). The algorithm of  (also of Barenboim et al. ) does not seem to give (or even ) node-averaged complexity. For example, take Ghaffari’s algorithm , which is well-suited for analyzing the node-averaged complexity since it is “node centric”: for any node , it gives a probabilistic bound on when will finish. More precisely, it shows that for each node , the probability that has not finished (i.e., its status has not been determined) after rounds is at most . Using this it is easy to compute the (expected) node-averaged complexity of Ghaffari’s algorithm. However, this is still only , as can be for most nodes.
Recently Barenboim and Tzur  showed that MIS can be solved in rounds under node-averaged complexity deterministically, where is the arboricity of the graph. It is a open question whether one can design an algorithm with (or even ) node-averaged round complexity in the traditional model for general graphs (which can have arboricity as high as ). Hence a new approach is needed to show , in particular, node-averaged round complexity.
1.4 Our Contributions
Our main contributions are positing the sleeping model and designing algorithms for MIS with constant-rounds node-averaged awake complexity in this model.
[Table 1: comparison of prior MIS algorithms (e.g., Luby’s [20, 2], CRT [9, 7, 12], etc.) with our algorithms; table body not recovered.]
Our main result is a randomized distributed MIS algorithm in the sleeping model whose (expected) node-averaged awake complexity is $O(1)$. In particular, we present a randomized distributed algorithm (Algorithm 2) that has $O(1)$-rounds expected node-averaged awake complexity and, with high probability, $O(\log n)$-rounds worst-case awake complexity and polylogarithmic worst-case (traditional) round complexity (cf. Theorem 2). We refer to Table 1 for a comparison of the results; see also Theorem 1 and Theorem 2.
Our work is also a step towards understanding whether an $O(1)$-round node-averaged algorithm is possible in the traditional model (without sleeping).
1.5 Comparison with Related Work
Much of the research on the design and analysis of efficient distributed algorithms, and on proving lower bounds for such algorithms, over the last three decades has focused on worst-case round complexity. Recently, a few works have studied various fundamental distributed computing problems under node-averaged round complexity (in the traditional model).
The notion of node-averaged (or vertex-averaged) complexity for the traditional model was proposed by Feuilloley  and further studied (with slight modifications) in Barenboim and Tzur . The motivation for node-averaged complexity — which also applies here — is that it can better capture the performance of distributed algorithms vis-a-vis the resources expended by individual nodes .
In Feuilloley’s notion , a node’s running time is counted only till it outputs (or commits its output); the node may still participate later (e.g., can forward messages etc.) but the time after it decides its output is not counted. In other words, the node-averaged complexity is the average of the runtimes of the nodes, where a node’s runtime is till it decides its output (though it may not have terminated). The average running time is the average of the running times under the above notion. This work  studies the average time complexity of leader election and coloring algorithms on cycles and other specific sparse graphs. For leader election, the paper shows gives an algorithm with node-average complexity (we note that the worst-case time complexity has a lower bound of , even for randomized algorithms ). For 3-coloring a cycle, the paper  shows that the node-averaged complexity cannot be improved over the worst-case, i.e., it is .
Following the work of Feuilloley , Barenboim and Tzur  address several fundamental problems under node-averaged round complexity. Their notion is somewhat different from that of Feuilloley — in , as soon as a node decides its output, it sends its output to its neighbors and terminates (does not take any further part in the algorithm). This is arguably a more suitable version for real-world networks in light of what was discussed in the context of saving energy and other node resources. Our notion is the same as that of Barenboim and Tzur, but extended to apply to the more general sleeping model where time spent by nodes in the sleeping state (if any) is not counted.
Barenboim and Tzur show a number of results for node-coloring as well as for MIS, -edge-coloring and maximal matching. Their bounds apply to general graphs, but depend on the arboricity of the graph. In particular, for MIS, -edge-coloring and maximal matching, they show a deterministic algorithm with node-average complexity, where is the arboricity of the graph (which can be in general).
It is still not known whether one can obtain (or even ) round node-averaged complexity for MIS in general graphs in the traditional model (without sleeping). In this paper, we answer this question in the affirmative in the sleeping model. Note that -coloring can be solved in round node-averaged complexity in general graphs by using Luby’s -coloring algorithm , e.g., see the paper of Barenboim and Tzur [6, Section ]; however, this does not imply any such bound for MIS.
There is another important distinction between the algorithms of this paper and those of the two works discussed above [11, 6]: some of the algorithms in those works assume the Local model (unbounded messages), whereas the Congest model (small-sized messages) is assumed here.
Finally, we point out that in a dynamic network model the work of  analyzed the average time complexity of algorithms using amortized analysis. This model is dynamic where nodes and edges may be added or deleted from the graph and is different compared to the static setting studied in this paper which is the case with almost all prior works on MIS mentioned here.
The work of King et al.  uses a sleeping model similar to ours — nodes can be in two states sleeping or awake (listening and/or sending) — but their setting is different. They present an algorithm in this model to solve a reliable broadcast problem in an energy-efficient way where nodes are awake only a fraction of the time. Another different model studied in the literature for problems such as MIS is the beeping model (see, e.g., ) where nodes can communicate (broadcast) to their neighbors by either beeping or not. Sleeping is orthogonal to beeping and one can study a model that uses both.
2 Challenges and High-level overview
Before we go to an overview of our algorithm, we discuss some of the challenges in obtaining an -round node-averaged complexity in the traditional model. As mentioned in Section 1.3, either prior distributed MIS algorithms have node-average complexity or it is not clear if their node-averaged complexity is (even) for arbitrary graphs. A straightforward way to show constant node-averaged round complexity is to argue that a constant fraction (on expectation, for randomized algorithms) of the nodes finish in every round; this can be shown to imply node-averaged complexity. Indeed, this is the reason why Luby’s (randomized) algorithm for )-coloring  gives node-averaged complexity (see [6, Section ]). It is not clear whether this kind of property can be shown for existing MIS algorithms (see Section 1.3).
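The implication mentioned above, that a constant fraction of nodes finishing per round yields constant node-averaged complexity, can be made explicit with a short calculation. This is a sketch under an assumed formalization that is not taken from the paper: $n_t$ denotes the number of nodes still unfinished at the start of round $t$ (so $n_0 = n$), $T_v$ denotes the number of rounds in which node $v$ is unfinished, and $p \in (0, 1]$ is a hypothetical constant with $\mathbb{E}[n_{t+1} \mid n_t] \le (1-p)\, n_t$.

```latex
\mathbb{E}\Big[\frac{1}{n}\sum_{v \in V} T_v\Big]
  = \frac{1}{n}\sum_{t \ge 0} \mathbb{E}[n_t]
  \le \frac{1}{n}\sum_{t \ge 0} (1-p)^t\, n
  = \frac{1}{p}
  = O(1).
```

The first equality holds because every unfinished node contributes exactly one unit to $\sum_v T_v$ in each round, and the inequality follows by iterating the assumed bound on $\mathbb{E}[n_{t+1} \mid n_t]$.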
Our main contribution is to show how one can design an algorithm with $O(1)$-rounds node-averaged awake complexity for MIS in the sleeping model while still having small worst-case running times (both in the sleeping and traditional models). This is non-trivial since it is not obvious how to take advantage of the sleeping model properly. A key difficulty in showing an $O(1)$ node-averaged awake complexity for the MIS problem in the sleeping model is that messages sent to a sleeping node are simply lost; they are not received at a later point. Hence a sleeping node is unable to know the status of its neighbors (even after it wakes up, since the neighbors could then be sleeping or even finished). Another difficulty is that it is not clear when to wake up a sleeping node; when it wakes up, its neighbors might be sleeping and it will not know their status. It could be costly (in terms of node-averaged awake complexity) to keep all of its neighbors awake for many rounds. Hence a new approach is needed to get constant node-averaged awake complexity.
The high-level idea of our (randomized) algorithm is quite simple and can be explained by a simple recursive procedure (see Figure 1). Consider a graph $G$. Every node flips a fair coin. If the coin comes up heads, the node falls asleep. Otherwise, it stays awake. Let $S$ be the set of sleeping nodes and $W$ be the set of awake nodes. The procedure is invoked recursively on the subgraph induced by $W$ to compute an MIS $M_W$ of that subgraph.
Once $M_W$ is determined, the nodes in $S$ wake up at an appropriate time, synchronized to the time when the recursive call on $W$ returns, and every node in $M_W$ informs its neighbors that it is in the MIS. At this point, the status (i.e., whether a node is in the MIS or not) of the nodes in $M_W$ and their neighbors (including those in $S$) is fixed, so all of these nodes terminate.
It remains to fix the status of the remaining nodes, which form a subset $R \subseteq S$. To do so, we recursively invoke the procedure on the subgraph induced by $R$, which gives us an MIS $M_R$ of that subgraph. The overall MIS is then given by $M_W \cup M_R$. The recursion bottoms out when the procedure is invoked on an empty subgraph or a subgraph containing only a single node. In the latter case, the node joins the MIS.
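The recursive procedure described above can be sketched as a sequential (non-distributed) simulation. This is an illustration under stated assumptions rather than the paper's algorithm: the names `sleeping_mis_sketch`, `is_mis`, and `adj` are hypothetical, the graph is assumed to be given as a dictionary mapping each node to its set of neighbors (no self-loops), isolated nodes are handled up front (mirroring the isolated node detection of Section 3), and no rounds or wake-up times are modelled.

```python
import random


def sleeping_mis_sketch(nodes, adj, rng):
    """Return an MIS of the subgraph induced by `nodes` (sequential sketch)."""
    nodes = set(nodes)
    if len(nodes) <= 1:
        return nodes  # empty subgraph, or a single node that joins the MIS
    # Nodes without neighbors inside the current subgraph join directly.
    isolated = {v for v in nodes if not (adj[v] & nodes)}
    candidates = nodes - isolated
    # Fair coin per node: heads means the node falls asleep.
    asleep = {v for v in candidates if rng.random() < 0.5}
    awake = candidates - asleep
    # First recursive call: MIS of the subgraph induced by the awake nodes.
    mis_awake = sleeping_mis_sketch(awake, adj, rng)
    # Sleeping nodes with a neighbor in that MIS are eliminated.
    remaining = {v for v in asleep if not (adj[v] & mis_awake)}
    # Second recursive call on the sleeping nodes that were not eliminated.
    mis_remaining = sleeping_mis_sketch(remaining, adj, rng)
    return isolated | mis_awake | mis_remaining


def is_mis(nodes, adj, mis):
    """Check independence and maximality (domination) of `mis`."""
    independent = all(not (adj[v] & mis) for v in mis)
    dominating = all(v in mis or (adj[v] & mis) for v in nodes)
    return independent and dominating


# Example: a 5-cycle plus one isolated node.
adj = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}, 5: set()}
mis = sleeping_mis_sketch(adj.keys(), adj, random.Random(0))
```

A call may recurse on the same node set when all coins land on the same side, so this sketch terminates with probability 1 rather than within a fixed depth; the paper's algorithm instead bounds the recursion depth by a parameter.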
The main observation for the analysis of this procedure is the following. With $W$ and $S$ denoting the sets of awake and sleeping nodes, respectively, let $M_W$ be the MIS computed on the subgraph induced by $W$. By definition, $M_W$ dominates $W$ and therefore all nodes in $W$ terminate after the first recursive call is finished. On top of that, the nodes in $M_W$ might also dominate some of the nodes in $S$. In fact, one can show that, in expectation, at least a constant fraction of the nodes in $S$ have a neighbor in $M_W$ and hence will be eliminated when the recursive call finishes. This is shown in the key technical lemma called the Pruning Lemma (see Lemma 3).
A main challenge in proving Lemma 3 is that $S$ is fixed by sampling (coin tosses) and the MIS of $W$ does not depend on $S$, the set of sleeping nodes. Yet we would like to show that a constant fraction of nodes in $S$ have a neighbor in the MIS of $W$, i.e., in $M_W$. Note that given $W$, the set $M_W$ can possibly be such that the number of its neighbors in $S$ is very small; this will not eliminate many nodes in $S$. We avoid this by coupling the process of sampling $S$ with the process of finding an MIS and show that, despite choosing $S$ first (by random sampling), the MIS computed on $W$ will eliminate a constant fraction of $S$. However, choosing $S$ first introduces dependencies, which makes it non-trivial to prove the Pruning Lemma; we overcome this by using the principle of deferred decisions (cf. proof of Lemma 3).
The Pruning Lemma (see Lemma 3) guarantees that a constant fraction of nodes in the graph terminate without being included in either of the two recursive calls. So the status of these nodes is fixed by being awake for only a constant number of rounds (only three rounds). As a consequence, the two recursive calls together only operate on at most a $3/4$-fraction of the given nodes in expectation. This saving propagates down the tree of recursive calls such that at level $i$ of the tree the overall number of nodes on which the calls at that level operate is at most $(3/4)^i n$ in expectation. It is not hard to see that the number of rounds per vertex required by the procedure outside of the recursive calls is constant. Therefore, the overall expected vertex-averaged complexity of the procedure in the sleeping model is $\frac{1}{n}\sum_{i \geq 0} O(1) \cdot (3/4)^i n = O(1)$.
The above algorithm (cf. Section 3) has constant-round node-averaged awake complexity and $O(\log n)$-rounds worst-case awake complexity, but it has polynomial worst-case (traditional) round complexity. We then show that our MIS algorithm can be combined with a variant of Luby’s algorithm so that the worst-case (traditional) complexity can be improved to polylogarithmic rounds, while still having $O(1)$-rounds node-averaged awake complexity and $O(\log n)$-rounds worst-case awake complexity.
3 The Sleeping MIS Algorithm
We consider the algorithm given in Figure 1. To compute an MIS for a given graph , each node in calls the function SleepingMIS at the same time. The function SleepingMIS is called with the function parameter , where is the network size. We show in the analysis (see Lemma 1 and Theorem 1) that our algorithm is correct with high probability.
After initializing some variables, the function calls the recursive function SleepingMISRecursive. This function computes an MIS on the subgraph induced by the set of nodes that call it. In the initial call, all nodes in $G$ call the function. In later calls, however, the function operates on proper subgraphs of $G$. Each (non-trivial) call of SleepingMISRecursive partitions the set of nodes participating in the call into two subsets. The function uses a recursive call to compute an MIS on the subgraph induced by the first set, updates the nodes in the second set about the result of the first recursive call, and finally uses a second recursive call to finish the computation of the MIS. SleepingMISRecursive takes an integer parameter $k$. This parameter starts with $k = K$ in the initial call. The function parameter is then decremented from one level of the recursion to the next until the recursion base with $k = 0$ is reached.
During the algorithm, each node $v$ in $G$ stores a variable inMIS. Initially, this variable is set to unknown to signify that it has not yet been determined whether $v$ is in the MIS or not. Over the course of the algorithm the variable is set to true or false. Once inMIS has been set to one of these values, it is never changed again. The MIS computed by the algorithm is given by the set of nodes with inMIS = true after termination.
Consider a call of the function SleepingMISRecursive by a node set $U$ and with a parameter $k$. The goal of such a call is to compute an MIS in the induced subgraph $G[U]$. The function consists of six parts as indicated by the comments on the right side in Algorithm 1. The first part (Lines 9–12) is the base case of the recursion. Once the base case of the recursion is reached it holds, with high probability, that $|U| \leq 1$. Therefore, the function has to compute an MIS on a graph consisting of at most one node. Such a node simply joins the MIS.
The second part (Lines 13–16) detects isolated nodes and adds them to the MIS. We refer to this part as the first isolated node detection. Note that on the top level of the recursion, which operates on the entire graph , the nodes do not have to communicate in order to determine whether a node is isolated or not. On lower levels of the recursion, however, the function generally operates on a proper subgraph of . The given instructions make sure that a node correctly determines its neighborhood in .
The third part (Lines 17–21) uses a recursive call to compute a partial MIS. We call this part the left recursion due to the intuition of organizing the recursive calls of the function into a binary tree that is traversed in a left-to-right order (see Figure 1). Every non-isolated node with participates in this recursive call, where is a variable that is set to a random bit during initialization. The recursive call computes an MIS in the subgraph induced by these nodes. In doing so, it fixes the value of for every node with . The nodes with and all isolated nodes in sleep for the duration of the left recursive call, see Line 20. Thereby, all nodes in start executing the next part of the function (which begins in Line 22) at the same time.
The purpose of the fourth and fifth part of the function is to update the nodes with about the decisions made in the left recursive call. We call the fourth part (Lines 22–25) the elimination step. In this part, every node with checks whether it has a neighbor in that is in the MIS. If that is the case, sets to false. The fifth part (Lines 26 – 29) is the second isolated node detection. In this part, a node checks whether the variable is false for every neighbor of in . If so, sets to true.
The sixth and final part of the function (Lines 30–34) uses a second recursive call to complete the computation of the MIS in . We refer to this part as the right recursion. As in the left recursion, only a subset of the nodes participates in the recursive call while the other nodes sleep. Specifically, every node for which the value of is still unknown participates in the right recursive call.
One important technical issue is synchronization. Sleeping nodes wake up at a round chosen so that they are synchronized with the awake nodes at the end of their recursive calls (Lines 18 and 31), so that messages can be exchanged between neighbors. A node that sleeps during a call of SleepingMISRecursive($k$) wakes up at a later round given by the function $T(k)$, where $T(k)$ is the (worst-case) time taken to complete this recursive call. The function $T$ is computed in the proof of Lemma 10.
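The wake-up time used for synchronization is the worst-case duration of a recursive call, so it can be thought of as the solution of a recurrence over the recursion depth. The sketch below assumes a hypothetical recurrence, with a call at depth `k` consisting of two sequential sub-calls at depth `k - 1` plus a constant number of extra rounds; the names `call_duration`, `overhead`, and `base` and their default values are illustrative assumptions, not the function computed in the paper.

```python
def call_duration(k, overhead=3, base=1):
    """Rounds for a depth-k call, assuming T(0)=base and T(k)=2*T(k-1)+overhead."""
    if k == 0:
        return base
    return 2 * call_duration(k - 1, overhead, base) + overhead


def call_duration_closed_form(k, overhead=3, base=1):
    """Closed form of the same assumed recurrence."""
    return (base + overhead) * 2 ** k - overhead
```

Under this assumed recurrence the duration grows like $2^k$, so a recursion depth that is logarithmic in $n$ gives a duration polynomial in $n$. A node that skips a sub-call only needs to evaluate this function locally to know when to set its alarm.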
We now turn to the analysis of Algorithm 1.
On an -node graph , the set computed by the algorithm SleepingMIS(K), with , is an MIS with high probability.
We show the statement by induction. We use the following induction hypothesis: If all nodes in a set $U$ simultaneously call the function SleepingMISRecursive with parameter $k$, then
1. all nodes in $U$ return from the call in the same round,
2. after the call, the variable inMIS is true or false for all nodes in $U$, and
3. the set of nodes in $U$ with inMIS true is an MIS of the subgraph induced by $U$.
We begin by showing the induction step, i.e., we show that the above statement holds for node set $U$ and parameter $k$ under the assumption that it holds for any subset of $U$ and parameter $k - 1$.
For Condition 1 of the statement observe that outside of the recursive parts, all nodes perform the same instructions and, therefore, spend the same number of rounds. In the recursive parts, the nodes that participate in the recursive calls all return in the same round according to the induction hypothesis, and the nodes that do not participate in the recursive call sleep for the exact number of rounds required for the recursive call. Therefore, Condition 1 holds.
To see that Condition 2 holds, observe that in Line 30 every node in with inMIS unknown performs a recursive call so that the value of inMIS is set to true or false for all of these nodes according to the induction hypothesis.
It remains to show that Condition 3 holds, i.e., the algorithm actually computes an MIS on the subgraph induced by . Let be the set of isolated nodes in . The algorithm explicitly takes care of these nodes during the (first) isolated node detection (Lines 13 - 16). The remaining nodes are partitioned into two sets and where is the set of nodes that participate in the first recursive call, and is the set of remaining non-isolated nodes.
Finally, let be the set of nodes that participate in the second recursive call. To show that is an MIS we first show that is a dominating set in and then show that is an independent set in .
To show that is a dominating set in , we show that for every node either is in or a neighbor of is in . If then joins the MIS during the first isolated node detection. If then, according to the induction hypothesis, either joins the MIS during the first recursive call or some neighbor of that is also in joins the MIS.
If we have to distinguish between two subcases:
If then we have two possibilities: (i) was set to false during the elimination step (Lines 22 – 25) and, therefore, must have a neighbor that is in and joined the MIS; (ii) was set to true during the second isolated node detection (Lines 26–29), and therefore is in the MIS (and dominated).
To show that is an independent set in , we prove that for any edge in if then . If then the statement must hold according to the induction hypothesis. If and then is set to false during the elimination step (Lines 22 -25) and the statement holds. If and then was not set to false (i.e., set to true) during second isolated node detection (Lines 26 - 29) and, therefore, we must have .
For the case we have to consider two subcases:
Otherwise (i.e..both and are in ), the statement holds by the induction hypothesis.
Finally, we consider the base case of the induction which corresponds to the base case of the recursion where . Recall that at the top level of the recursion, the function SleepingMISRecursive is called with the parameter set to , and then is decremented from one level of the recursion to the next. Consider the recursion tree corresponding to the execution of the algorithm. Each tree node corresponds to a call of the function SleepingMISRecursive by a subset of the nodes in . The root of the tree corresponds to the initial call made by all nodes in . At an internal tree node, the probability that a non-isolated node in participates in the first recursive call is , and the probability that a node in participates in the second recursive call is at most . An isolated node does not participate in any recursive calls. Therefore, the probability that a node in participates in a specific call of SleepingMISRecursive corresponding to a leaf of the recursion tree is at most . So for any pair of nodes in the probability that both nodes participate in the same recursive call at the bottom level of the recursion is at most . Applying the union bound over all pairs of nodes implies that, with high probability, at most one node participates in any given recursive call at the bottom level of the recursion. Therefore, if a node reaches the base case of the recursion then the function SleepingMISRecursive operates on a subgraph of containing only , w.h.p. The node simply joins the MIS (see Lines 9–12). It is easy to check that, thereby, all three conditions of induction hypothesis are satisfied. ∎
4.2 Time (or Round) Complexity
The intuition behind the node-averaged running time analysis is that in each recursive call of SleepingMISRecursive, a constant fraction (on expectation) of the nodes calling the function participate in neither the left nor the right recursive call. Thereby, these nodes sleep for almost the entire duration of the call. This effect propagates through the levels of recursion and ultimately leads to a low node-averaged awake complexity.
To formalize this intuition, consider the execution of SleepingMISRecursive by a set of nodes and with parameter . Let be the set of nodes that participate in the left recursive call, and let be the set of nodes that participate in the right recursive call. The following lemma bounds the number of nodes participating in the left recursive call.
Consider a node . If is isolated in , it joins the MIS in the first isolated node detection and, thereby, does not participate in the left recursive call. Otherwise, participates in the left recursive call if and only if , which holds with probability . The lemma follows by the linearity of expectation. ∎
The next lemma is the key lemma: it bounds the number of nodes participating in the right recursive call. Its proof requires a series of definitions and auxiliary lemmas.
Lemma 3 (Pruning Lemma).
Thereby, on expectation, at least one-fourth of the nodes in do not participate in either of the recursive calls.
We now establish the definitions and lemmas necessary to prove Lemma 3. We recall from the algorithm that is the maximum recursion depth.
For any such that and any node we define the -rank of as the sequence
For two nodes we write if and only if is lexicographically strictly less than . (Note that , is simply a sentinel value for the base case.)
Let denote the neighborhood of in the (sub)graph . The following lemma provides us with a condition under which a node joins the MIS based on ranks.
Consider a call of SleepingMISRecursive by a node set with parameter during the execution of the algorithm. If is such that every with sets to false at some point during the algorithm, then joins the MIS.
We prove the lemma by induction on .
For the induction step we assume that the statement holds for and show that it holds for . Consider a call of SleepingMISRecursive by a node set with parameter , and let be such that every with sets to false at some point during the algorithm. If is isolated in then joins the MIS in the first isolated node detection. Otherwise, we distinguish two cases based on the value of .
If then participates in the left recursive call with parameter . Let be the set of ’s neighbors in that also participate in the left recursive call. We show that each with sets to false at some point.
Consider a node . If then it holds since . If then sets to false at some point by definition. Thereby, the induction hypothesis implies that joins the MIS.
If then sleeps during the left recursive call. A node with must have . Thereby, also sleeps during the left recursive call, and it cannot cause to set to false in the synchronization step. A node with never sets to true by definition. Thereby, such a node also cannot cause to set to false in the synchronization step. If after the synchronization step it holds for every , then joins the MIS in the second isolated node detection.
Otherwise, participates in the right recursive call. Let be the set of ’s neighbors in that also participate in the right recursive call. We show that each with sets to false at some point.
Consider a node . If then it holds since . If then sets to false at some point by definition. Thereby, the induction hypothesis implies that joins the MIS. ∎
Consider a call of SleepingMISRecursive by a node set with parameter during the execution of the algorithm. Let be the sequence of the nodes in sorted by lexicographically decreasing -rank. We refer to this sequence as the evaluation sequence.
Note that the order of the nodes in the evaluation sequence is defined based on the -rank of the nodes and not the -rank. Intuitively, the -rank of a node is the sequence of random decisions used in future recursions. The following lemma shows that the evaluation sequence is well defined.
For every call of SleepingMISRecursive by a node set with parameter during the execution of the algorithm and for any , it holds with high probability that .
For two nodes to participate in the same call with node set and parameter , we must have for all such that . For both nodes to have the same -rank, we must have for all such that . Therefore, we have
The statement holds by applying the union bound over all pairs of nodes. ∎
Using the principle of “deferred decisions”.
The proof of Lemma 3 is based on the principle of deferred decisions (for an introduction to the principle, see e.g., ). We again focus on a call of SleepingMISRecursive by a node set with parameter during the execution of the algorithm. We assume for the sake of analysis that the values of the variables are not fixed at the beginning of the algorithm. However, the remaining for are fixed, which means that the set , the -ranks of the nodes, and the evaluation sequence have all been determined. We then fix the values of the using the following deferred decision process.
Note that the above process does not change the behavior of the algorithm. It merely serves as a useful tool that allows us to analyze the algorithm as demonstrated in the following lemma.
The following statements hold for every node .
If is sequence-fixed and , then sets to true before the synchronization step.
If is neighbor-fixed, then sets to false before the second isolated node detection.
We show the lemma by complete induction on the evaluation sequence . The node is sequence-fixed by definition. If , none of the statements apply and the lemma holds. If , we have to show that Statement 1 holds. By definition of the evaluation sequence we have for all . Together with this implies that for all . Therefore, Lemma 4 implies that joins the MIS. Since , must set to true in the first isolated node detection or the left recursive call. Thereby, Statement 1 and the induction base hold.
For the induction step we consider an index and show that the lemma holds for under the assumption that it holds for all with . We first show that Statement 1 holds. Suppose that is sequence-fixed and . We have to show that sets to true before the synchronization step. Consider a node . If then we must also have because . If then the lemma holds for according to the induction hypothesis. We distinguish two cases. If is sequence-fixed then we must have because otherwise would be neighbor-fixed, which is a contradiction. Therefore, we have in this case. If is neighbor-fixed then sets to false before the second isolated node detection according to Statement 2. In summary, we have that for every node it holds or sets to false at some point during the algorithm. Thereby, Lemma 4 implies that joins the MIS. Since , must set to true in the first isolated node detection or the left recursive call. Therefore, Statement 1 holds.
Finally, we show that Statement 2 holds. Suppose that is neighbor-fixed. We have to show that sets to false before the second isolated node detection. Since is neighbor-fixed, it must have a neighbor such that , is sequence-fixed, and . Since the lemma holds for according to the induction hypothesis. Because is sequence-fixed and , Statement 1 implies that sets to true before the synchronization step. Since the algorithm is correct (see Lemma 1), this implies that must eventually set to false. If then must set to false in the left recursive call. Otherwise, learns during the synchronization step that set to true and, therefore, sets to false in the synchronization step. Thereby, Statement 2 holds. ∎
We are finally ready to prove Lemma 3.
Proof of Lemma 3.
Consider a call of SleepingMISRecursive by a node set with parameter during the execution of the algorithm. We show that for each it holds . The linearity of expectation then implies , so the lemma holds.
If is isolated in , then joins the MIS in the first isolated node detection and we have . Thereby, the statement holds for isolated nodes. If is not isolated, we can apply the law of total probability to obtain
By definition, it holds and . Therefore, we have
We define as the event that all neighbors in are neighbor-fixed. Recall that is not isolated. We apply the law of total probability again to obtain
We first determine the probability . If , cannot cause any of its neighbors to be neighbor-fixed. Hence, if the event occurs then every node in must have been neighbor-fixed by some other node. In this case, Lemma 6 implies that every node sets to false before the second isolated node detection. Hence, sets to true during the second isolated node detection. This implies that .
Next, we establish an upper bound on . If occurs, then there is a node that is sequence-fixed. With probability , we have . In this case, Lemma 6 implies that sets to true before the synchronization step. Thereby, sets to false during the synchronization step. This implies .
By combining the equations above, we get and, therefore, . ∎
Consider all calls of SleepingMISRecursive with the same parameter . Let be the set of nodes participating in the -th of these calls. Note that all the are pairwise disjoint by definition. We define
to be the number of nodes participating in the recursive calls with parameter . Then the following holds for all such that :
We show the lemma by induction on . For the induction base consider . This case corresponds to the initial call of the recursive function by all nodes. Thus, we have by definition, which shows the induction base.
For the induction step we assume that the equation holds for an such that and show that it also holds for . Consider the calls of SleepingMISRecursive with parameter during the algorithm. Let be the number of these calls, and let be the set of nodes participating in the -th call. By definition, we have
Each of the calls creates a left and a right recursive call at the next level of recursion. For the -th call, let be the node sets that participate in the left and right recursive call, respectively. By definition, we have
This implies that
The expected node-averaged awake complexity of Algorithm 1 in the sleeping model is .
For each node , we define a cost which is the number of rounds is awake during the execution of the algorithm. We define to be the total cost of the algorithm. The expected node-average complexity is then .
The initialization at the beginning of the algorithm incurs a cost of for each node. To compute the cost of the recursive part of the algorithm, we use the following observation: Consider a node and a call of SleepingMISRecursive that participates in. If we ignore the cost of the recursive calls, the participation of in the call incurs a cost of . This implies that we can compute the total cost of the recursive part by computing the number of nodes that participate in the recursive calls at the different levels of the recursion. Recall that we defined the number of nodes participating in the calls with common parameter as . Formally, we have
Taking the expectation on both sides and applying Lemma 7 gives us
Therefore, the expected node-average awake complexity of the algorithm is . ∎
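The summation in the proof above can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes that each participation in a call costs a constant number of awake rounds (the hypothetical parameter `per_call_cost`) and that the expected number of participating nodes shrinks by a factor of at most 3/4 per level, which is what the earlier remark that at least one-fourth of the nodes drop out of both recursive calls implies. The names `expected_cost_bound`, `levels`, and `shrink` are illustrative only.

```python
def expected_cost_bound(n, levels, per_call_cost=3.0, shrink=0.75):
    """Upper bound on the expected total awake cost of the recursive part.

    n             -- number of nodes taking part in the top-level call
    levels        -- number of recursion levels below the top level
    per_call_cost -- assumed constant awake cost of one participation
    shrink        -- assumed per-level bound on the expected fraction of
                     nodes that continue into a recursive call
    """
    total = 0.0
    participants = float(n)           # all n nodes take part at the top level
    for _ in range(levels + 1):
        total += per_call_cost * participants
        participants *= shrink        # at most 3/4 continue, in expectation
    return total


if __name__ == "__main__":
    n = 1000
    bound = expected_cost_bound(n, levels=50)
    # Geometric series: the sum stays below per_call_cost * n / (1 - shrink).
    print(bound / n)
```

Because the per-level terms form a geometric series, the total stays below `per_call_cost * n / (1 - shrink)` regardless of the number of levels, so the per-node average is bounded by a constant under these assumptions.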
4.3 Analysis of the worst-case awake complexity of Algorithm 1
The worst-case awake round complexity of Algorithm 1 is .
The worst-case awake round complexity is proportional to the depth of the recursion tree of Algorithm SleepingMISRecursive(). We note that the activation of nodes proceeds in a depth-first, left-to-right fashion. Also, each node is awake for only a constant number of rounds per level of the recursion tree, possibly at every level. Thus, any node can be awake for at most a number of rounds proportional to the depth of the recursion, which is rounds. ∎
The worst-case round complexity of Algorithm 1 is .
The worst-case round complexity of Algorithm SleepingMISRecursive() can be computed recursively. Let be the worst-case number of rounds taken by SleepingMISRecursive(). Then
since SleepingMISRecursive() calls SleepingMISRecursive() (at most) two times and takes 3 extra rounds. The base case is . The above recurrence gives
As an immediate corollary to Lemma 10, we have
The node-averaged round complexity of Algorithm 1 is .
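The recurrence used in the proof of the worst-case round complexity (two recursive calls plus 3 extra rounds per call) can be unrolled as a quick sanity check. The sketch below is illustrative: the base-case cost `base_rounds` is an assumed placeholder value, and the function names are hypothetical. Under that assumption the recurrence solves to `(base_rounds + 3) * 2**k - 3`, which is exponential in the recursion depth `k`.

```python
def worst_case_rounds(k, base_rounds=1):
    """Unroll T(k) = 2 * T(k - 1) + 3 with T(0) = base_rounds (assumed)."""
    rounds = base_rounds
    for _ in range(k):
        rounds = 2 * rounds + 3   # two recursive calls plus 3 extra rounds
    return rounds


def closed_form(k, base_rounds=1):
    """Closed form of the same recurrence: (T(0) + 3) * 2^k - 3."""
    return (base_rounds + 3) * 2 ** k - 3


if __name__ == "__main__":
    for k in range(6):
        print(k, worst_case_rounds(k), closed_form(k))
```

The doubling per level is what makes the worst-case round complexity large even though each node is awake for only a constant number of rounds per level.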
There is a randomized Monte Carlo distributed MIS algorithm, described in Algorithm 1, that is correct with high probability and has the following performance measures (recall that a Monte Carlo randomized algorithm is one that may sometimes produce an incorrect solution, whereas Las Vegas algorithms are randomized algorithms that always produce the correct solution; see [23, Section ] for a detailed discussion of these two classes):
node-averaged awake complexity (on expectation),
worst-case awake complexity (always),
node-averaged round complexity (always), and
worst-case round complexity (always).
4.4 Improving the Worst-case Round Complexity
We now show how to reduce the worst-case round complexity significantly, to polylogarithmic rounds, while keeping the node-averaged awake complexity constant and the worst-case awake complexity at rounds. In particular, we show the following theorem, which is the main result of this section.
There is a randomized Monte Carlo distributed MIS algorithm, described in Algorithm 2, that is correct with high probability and has the following performance measures:
node-averaged awake complexity (on expectation),
worst-case awake complexity (with high probability),
node-averaged round complexity (with high probability), and
worst-case round complexity (with high probability).
In the rest of the section, we prove Theorem 2.
Modifications to Algorithm SleepingMISRecursive().
The main idea is to truncate the recursion tree of Algorithm SleepingMISRecursive() earlier (see Figure 2) and use the parallel/distributed randomized greedy Maximal Independent Set (MIS) algorithm (described below) to solve the base cases.
The parallel/distributed randomized greedy MIS algorithm works as follows. An order (also called ranking) of the vertices is chosen uniformly at random. Then, in each round, all vertices having the highest rank among their respective neighbors are added to the independent set and removed from the graph along with their neighbors. This process continues iteratively until the resulting graph is empty.
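The round structure described above can be sketched in a few lines of Python. This is a centralized simulation written for illustration, not the authors' distributed implementation. It assumes the graph is given as a dictionary `adj` mapping each vertex to the set of its neighbours, and it encodes "highest rank" as the smallest position in a random permutation; both choices, and all function names, are assumptions of this sketch.

```python
import random


def parallel_greedy_mis(adj, seed=None):
    """Round-based randomized greedy MIS; returns (mis, number_of_rounds)."""
    rng = random.Random(seed)
    order = list(adj)
    rng.shuffle(order)                        # uniformly random ranking
    rank = {v: i for i, v in enumerate(order)}  # smaller index = higher rank
    alive, mis, rounds = set(adj), set(), 0
    while alive:
        rounds += 1
        # Vertices that outrank all of their remaining neighbours.
        winners = {v for v in alive
                   if all(rank[v] < rank[u] for u in adj[v] if u in alive)}
        mis |= winners
        removed = set(winners)
        for v in winners:
            removed |= adj[v] & alive         # neighbours leave with the winner
        alive -= removed
    return mis, rounds


def is_mis(adj, mis):
    """Check independence and maximality of `mis` in the graph `adj`."""
    independent = all(not (adj[v] & mis) for v in mis)
    maximal = all(v in mis or (adj[v] & mis) for v in adj)
    return independent and maximal


if __name__ == "__main__":
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
    mis, rounds = parallel_greedy_mis(path, seed=1)
    print(sorted(mis), rounds, is_mis(path, mis))
```

Each round removes at least the top-ranked remaining vertex, so the loop always terminates.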
The parallel/distributed randomized greedy MIS algorithm was first introduced by Coppersmith et al. . They used this algorithm to find an MIS for random graphs and showed that it runs in expected rounds and that this holds true for all values of . Blelloch et al.  extended this result to general (arbitrary) graphs and showed an run-time with high probability. Finally, Fischer and Noever  improved the analysis further and showed that the parallel/distributed randomized greedy MIS algorithm runs in rounds with high probability and also that this bound is tight. Fischer and Noever’s work shows that the parallel/distributed randomized greedy MIS algorithm is (asymptotically) as fast as the famous algorithm by Luby [20, 2], where the random ranking of the nodes is chosen afresh for each iteration.
It is a well-known fact (see, e.g., ) that the parallel/distributed randomized greedy MIS algorithm always produces what is known as the lexicographically first MIS , i.e., the same MIS output by the sequential greedy algorithm. In other words, it always produces the same result once an ordering of the vertices is fixed. We observe that Algorithm SleepingMISRecursive() produces a lexicographically first MIS as well. This follows as an immediate corollary to Lemma 4. Formally,
Algorithm SleepingMISRecursive() and the parallel/distributed randomized greedy MIS algorithm produce the same MIS.
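To make the notion of a lexicographically first MIS concrete, the following sketch computes it with the sequential greedy algorithm and checks the local rule that characterizes it: a vertex is in the MIS exactly when none of its higher-ranked neighbours is. The graph representation (`adj` as a dictionary of neighbour sets), the convention that earlier positions in `order` mean higher rank, and the function names are assumptions made for this illustration.

```python
def sequential_greedy_mis(adj, order):
    """Lexicographically first MIS with respect to `order` (highest rank first)."""
    mis = set()
    for v in order:
        if not (adj[v] & mis):     # no higher-ranked neighbour was chosen
            mis.add(v)
    return mis


def satisfies_local_rule(adj, order, mis):
    """True iff each vertex is in `mis` exactly when no higher-ranked
    neighbour of it is in `mis`."""
    rank = {v: i for i, v in enumerate(order)}
    return all(
        (v in mis) == all(u not in mis for u in adj[v] if rank[u] < rank[v])
        for v in adj
    )


if __name__ == "__main__":
    cycle = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
    order = [0, 1, 2, 3, 4]
    mis = sequential_greedy_mis(cycle, order)
    print(sorted(mis), satisfies_local_rule(cycle, order, mis))
```

For a fixed ordering, this local rule determines the set uniquely, which is why any procedure that respects it on the same ordering outputs the same MIS.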
This means that at any given level , the recursion tree corresponding to Algorithm SleepingMISRecursive() (see Figure 2) would look exactly the same, no matter which of Algorithm SleepingMISRecursive() or the parallel/distributed randomized greedy MIS algorithm was used to solve the subsequently lower levels. In particular, Lemma 7 would still hold if the level (for any ) of the recursion tree was considered as the base case (instead of the level as in Algorithm SleepingMISRecursive()) and the parallel/distributed randomized greedy MIS algorithm was used to solve each of the base cases. We note that for synchronization at higher levels of the recursion, we will require that the greedy algorithm runs for (exactly) rounds for some large (but fixed) constant . The constant is chosen such that the greedy algorithm finishes with high probability, and the existence of this constant follows from the analysis of Fischer and Noever . However, this makes the algorithm Monte Carlo, since there is a small probability that some base case might not finish within the required time, which might affect the correctness of the overall algorithm.
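The effect of running the greedy algorithm for a fixed number of rounds can be sketched as follows. This is a centralized illustration under stated assumptions, not the authors' procedure: `rank` is a given dictionary of distinct priorities (smaller value means higher rank), `budget` stands in for the fixed round count, and the returned flag `finished` is a hypothetical diagnostic showing when the Monte Carlo failure case occurs.

```python
def greedy_mis_fixed_rounds(adj, rank, budget):
    """Run rank-based greedy for exactly `budget` rounds.

    Returns (mis, finished). If `finished` is False, the budget ran out
    before every vertex was decided, so `mis` need not be maximal.
    """
    alive, mis = set(adj), set()
    for _ in range(budget):
        if not alive:
            continue               # idle round: the call's duration stays fixed
        winners = {v for v in alive
                   if all(rank[v] < rank[u] for u in adj[v] if u in alive)}
        mis |= winners
        for v in winners:
            alive -= adj[v]        # neighbours of a winner are eliminated
        alive -= winners
    return mis, not alive


if __name__ == "__main__":
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
    rank = {v: v for v in path}    # worst case for a path: one winner per round
    print(greedy_mis_fixed_rounds(path, rank, budget=2))
    print(greedy_mis_fixed_rounds(path, rank, budget=5))
```

Padding with idle rounds keeps the duration of every base case identical, which is what lets the higher recursion levels stay synchronized; the price is that an unlucky ranking can exhaust the budget.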
We are now ready to present our modified algorithm which we call Fast-SleepingMISRecursive(). The recursion tree for this algorithm is shown in Figure 2. Note that we slightly modify the parallel/distributed randomized greedy MIS algorithm as follows: