Fully Dynamic Maximal Independent Set with Sublinear in n Update Time

06/26/2018 ∙ by Sepehr Assadi, et al. ∙ IBM ∙ University of Pennsylvania

The first fully dynamic algorithm for maintaining a maximal independent set (MIS) with update time that is sublinear in the number of edges was presented recently by the authors of this paper [Assadi et al., STOC'18]. The algorithm is deterministic and its update time is O(m^3/4), where m is the (dynamically changing) number of edges. Subsequently, Gupta and Khan and, independently, Du and Zhang [arXiv, April 2018] presented deterministic algorithms for dynamic MIS with update times of O(m^2/3) and O(m^2/3 √(log m)), respectively. Du and Zhang also gave a randomized algorithm with update time O(√m). Moreover, they provided some partial (conditional) hardness results hinting that an update time of m^1/2-ϵ, and in particular n^1-ϵ for n-vertex dense graphs, is a natural barrier for this problem for any constant ϵ > 0, for both deterministic and randomized algorithms that satisfy a certain natural property. In this paper, we break this natural barrier and present the first fully dynamic (randomized) algorithm for maintaining an MIS with update time that is always sublinear in the number of vertices, namely, an O(√n) expected amortized update time algorithm. We also show that a simpler variant of our algorithm already achieves an O(m^1/3) expected amortized update time, which results in improved performance over our O(√n) update time algorithm for sufficiently sparse graphs, and breaks the m^1/2 barrier of Du and Zhang for all values of m.


1 Introduction

The maximal independent set (MIS) problem is of utmost practical and theoretical importance, primarily since MIS algorithms provide a useful subroutine for locally breaking symmetry between multiple choices. MIS is often used in the context of graph coloring, as all vertices in an independent set can be assigned the same color. As another example, Hopcroft and Karp [10] gave an algorithm to compute a large bipartite matching (approximating the maximum matching to within a factor arbitrarily close to 1) by finding maximal sets of vertex-disjoint augmenting paths of increasing lengths. In general, the MIS problem has natural connections to various important combinatorial optimization problems; see the celebrated papers of Luby [18] and Linial [17] for some of the most basic applications of MIS. Additional applications of MIS include leader election [6], resource allocation [24], network backbone construction [14, 11], and sublinear-time approximation algorithms [21].

The MIS problem has been extensively studied in the parallel and distributed settings, following the seminal works of [18, 2, 17]. Surprisingly, however, the fundamental problem of maintaining an MIS in dynamic graphs received no attention in the literature until the pioneering PODC'16 paper of Censor-Hillel, Haramaty, and Karnin [5], who developed a randomized algorithm for this problem in distributed dynamic networks under the oblivious adversarial model. (In the standard oblivious adversarial model (cf. [4], [12]), the adversary knows all the edges in the graph and their arrival order, as well as the algorithm to be used, but is not aware of the random bits used by the algorithm, and so cannot choose updates adaptively in response to the randomly guided choices of the algorithm.) Implementing the distributed algorithm of [5] in the sequential setting requires Ω(Δ) update time in expectation, where Δ is a fixed upper bound on the maximum degree in the graph, which may be as large as Ω(n) even in sparse graphs. Furthermore, it is unclear whether O(Δ) time is also sufficient for this algorithm, and a naive implementation may incur an update time of Ω(m), even in expectation, where m is the (dynamically changing) number of edges; see Section 6 of [5] for further details.

We study the MIS problem in the (sequential) dynamic setting, where the underlying graph evolves over time via edge updates. A dynamic graph is a graph sequence G = (G_0, G_1, G_2, …) on a fixed set of n vertices, where the initial graph is G_0 = (V, ∅) and each graph G_i is obtained from the previous graph G_{i-1} in the sequence by either adding or deleting a single edge. The work of Censor-Hillel et al. [5] left the following question open: Can one dynamically maintain an MIS in time significantly lower than it takes to recompute it from scratch following every edge update?

The authors of this paper [3] answered this question in the affirmative, presenting the first fully dynamic algorithm for maintaining an MIS with (amortized) update time that is sublinear in the number of edges, namely, O(m^3/4). Achieving an update time of O(Δ) is simple, and the main contribution of [3] is in further reducing the update time to O(m^3/4). Note that O(m^3/4) improves over the simple O(Δ) bound only for sufficiently sparse graphs.

Onak et al. [22] studied "uniformly sparse" graphs, as opposed to the work by Assadi et al. [3] that applies to unrestricted graphs. The "uniform sparsity" of a graph is often measured by its arboricity [19, 20, 23]: the arboricity of a graph G = (V, E) is defined as α(G) := max_U ⌈|E(U)|/(|U| - 1)⌉, where the maximum is taken over all subsets U ⊆ V of size at least 2 and E(U) := {(u, v) ∈ E : u, v ∈ U}. A dynamic graph of arboricity α is a dynamic graph such that all graphs G_i in the sequence have arboricity bounded by α. Onak et al. [22] showed that for any dynamic n-vertex graph of arboricity α, an MIS can be maintained with amortized update time O(α^2 · log^2 n), which reduces to O(log^2 n) in bounded-arboricity graphs, such as planar graphs and, more generally, all minor-closed graph classes. The result of [22] improves that of [3] for all graphs with arboricity bounded by m^3/8-ϵ, for any constant ϵ > 0. Since the arboricity of a general graph cannot exceed O(√m), this result covers much of the range of possible values for the arboricity. Nonetheless, for general graphs this update time may be as high as Ω(m), which is no better than the naive O(n + m) time needed to recompute an MIS from scratch following every edge update.

Recently, the O(m^3/4) bound of Assadi et al. [3] for general graphs was improved to O(m^2/3) by Gupta and Khan [8] and independently to O(m^2/3 √(log m)) by Du and Zhang [7]. All the aforementioned algorithms (besides the distributed algorithm of [5]) are deterministic. Du and Zhang [7] also presented a randomized algorithm under the oblivious adversarial model with an expected update time of O(√m); for dense graphs, this bound is O(n), which is no better than the simple deterministic O(Δ)-update time algorithm for this problem.

None of the known algorithms for dynamically maintaining an MIS achieves an update time of o(n) in dense graphs. A recent result of Du and Zhang [7] partially addresses this lack of progress: they presented an "imperfect reduction" from the Online Boolean Matrix-Vector Multiplication problem to prove a conditional hardness result for the dynamic MIS problem (see, e.g., [9] for the role of this problem in proving conditional hardness results for dynamic problems). This result hints that an update time of m^1/2-ϵ, or n^1-ϵ in dense graphs, for any constant ϵ > 0, may be a natural barrier for a large class of deterministic and randomized algorithms for dynamic MIS that satisfy a certain natural property (see [7] for the exact definition of this property and further details).

This state of affairs, namely, the lack of progress on obtaining an update time of o(n) for dynamic MIS in general graphs on one hand, and the partial hardness results hinting that an (essentially) √m update time might be a natural barrier for a large class (but not all) of algorithms on the other hand, raises the following fundamental question:

Question 1.

Can one maintain a maximal independent set in a dynamically changing graph with update time that is always o(n)?

1.1 Our contribution

Our main result is a positive resolution of Question 1 in a strong sense:


Theorem 1.

Starting from an empty graph on n fixed vertices, an MIS can be maintained over any sequence of edge insertions and deletions in O(min{√n, m^1/3}) amortized update time (up to polylogarithmic factors), where m denotes the dynamic number of edges; the update time bound holds both in expectation and with high probability. (We remark that the high probability guarantee holds when the number of updates is sufficiently large; see the formal statements of the results in later sections.)

The proof of Theorem 1 is carried out in three stages. In the first stage, we provide a simple randomized algorithm for maintaining an MIS with update time O(n^2/3); although we view this as a "warmup" result, it already resolves Question 1. In the second stage, we generalize this simple algorithm to obtain an update time of O(m^1/3). Achieving the O(√n) bound is more intricate; we reach this goal by carefully building on the ideas behind the O(n^2/3)- and O(m^1/3)-time algorithms.

Finding a maximal independent set is one of the most studied problems in distributed computing. It is thus important to provide an efficient distributed implementation of the proposed sequential dynamic algorithms. While the underlying distributed network is subject to topological updates (particularly edge updates) as in the sequential setting, the goal in the distributed setting is quite different: optimizing the (amortized) round complexity, adjustment complexity, and message complexity of the distributed algorithm (see, e.g., [5, 3] for definitions). Achieving low amortized round and adjustment complexities is typically rather simple, and so the goal is to devise a distributed algorithm whose amortized message complexity matches the update time of the proposed sequential algorithm. This goal was achieved by [3] and [8]. Similarly to [3, 8], our sequential algorithm can also be distributed, achieving an expected amortized message complexity that matches our sequential update time, in addition to constant amortized round and adjustment complexities per update. We omit the details of the distributed implementation of our algorithm, as it follows more or less in a straightforward way from our sequential algorithm using the ideas in [3].

2 Preliminaries

Notation.

For a graph G = (V, E), n := |V| denotes the number of vertices and m := |E| denotes the number of edges of G. For a set S ⊆ V, we define G[S] as the subgraph of G induced on the vertices of S. We further define N_G(S) to be the set of vertices that are neighbors of at least one vertex of S in G (we may drop the subscript G when it is clear from the context). For a vertex v ∈ V, we define deg_G(v) as the degree of v in G. Finally, Δ(G) denotes the maximum degree of G.

Greedy MIS.

The maximal independent set problem admits a simple sequential greedy algorithm. Let G = (V, E) be a graph and π be any ordering of the vertices of V. GreedyMIS(G, π) iterates over the vertices of V according to the ordering π and adds each vertex to the MIS iff none of its neighbors has already been chosen. It is immediate to verify that this algorithm indeed computes an MIS of G for any ordering π. Throughout this paper, we always assume that π is the lexicographically-first ordering of the vertices, and hence simply write GreedyMIS(G) instead of GreedyMIS(G, π).
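The following is a minimal Python sketch of GreedyMIS; the adjacency-set representation and the function name are our own illustration, not notation from the paper.

```python
def greedy_mis(vertices, adj):
    """Compute the greedy MIS of a graph.

    vertices: iterable of vertices in the desired ordering pi
              (in this paper, the lexicographically-first order).
    adj:      dict mapping each vertex to the set of its neighbors.
    """
    mis = set()
    for v in vertices:  # scan vertices in the order pi
        # add v iff none of its neighbors was already chosen
        if not any(u in mis for u in adj[v]):
            mis.add(v)
    return mis
```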

2.1 A Deterministic O(Δ)-Update Time Algorithm

We use the following simple algorithm for maintaining an MIS deterministically: every vertex maintains a counter of the number of its neighbors in the MIS and, after any update to the graph, decides whether it should join or leave the MIS based on this information. Moreover, any vertex that joins or leaves the MIS spends O(Δ) time updating the counters of its neighbors, where Δ is a fixed upper bound on the maximum degree. While the worst-case update time of this algorithm can be quite large for some updates, one can easily prove that on average only O(Δ) time is needed to process each update, as was first shown in [3] and further strengthened in [8].

Lemma 2.1 ([3, 8]).

Starting from any graph G_0, a maximal independent set can be maintained deterministically over any sequence of K vertex or edge insertions and deletions in O(K·Δ + m_0) time, where Δ is a fixed bound on the maximum degree in the graph and m_0 is the original number of edges of G_0.
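Below is a minimal, self-contained Python sketch of this counter-based algorithm, restricted to edge updates on a fixed vertex set for brevity (the class name and representation are ours; the actual algorithms of [3, 8] also support vertex updates and use a more careful analysis):

```python
class DynamicMIS:
    """Counter-based dynamic MIS over a fixed vertex set (sketch).

    Invariant: cnt[v] equals the number of v's neighbors in the MIS,
    and v is in the MIS iff cnt[v] == 0.
    """

    def __init__(self, vertices):
        self.adj = {v: set() for v in vertices}
        self.cnt = {v: 0 for v in vertices}
        self.mis = set(vertices)        # empty graph: every vertex is in

    def _join(self, v):                 # called only when cnt[v] == 0
        self.mis.add(v)
        for w in self.adj[v]:
            self.cnt[w] += 1

    def _leave(self, v):                # v leaves; neighbors may now join
        self.mis.discard(v)
        for w in self.adj[v]:
            self.cnt[w] -= 1
            if self.cnt[w] == 0 and w not in self.mis:
                self._join(w)

    def insert_edge(self, u, v):
        u_in, v_in = u in self.mis, v in self.mis
        self.adj[u].add(v); self.adj[v].add(u)
        if u_in: self.cnt[v] += 1
        if v_in: self.cnt[u] += 1
        if u_in and v_in:               # conflict: evict one endpoint
            self._leave(v)

    def delete_edge(self, u, v):
        u_in, v_in = u in self.mis, v in self.mis
        self.adj[u].discard(v); self.adj[v].discard(u)
        if u_in:
            self.cnt[v] -= 1
            if self.cnt[v] == 0 and v not in self.mis:
                self._join(v)
        if v_in:
            self.cnt[u] -= 1
            if self.cnt[u] == 0 and u not in self.mis:
                self._join(u)
```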

2.2 Sample-and-Prune Technique for Computing an MIS

We also use a simple application of the sample-and-prune technique of [15] (see also [16]), originally introduced in the context of streaming and MapReduce algorithms. To our knowledge, the following lemma was first proved in [13], following an approach in [1]. Intuitively speaking, it asserts that if we sample each vertex of the graph with probability roughly (log n)/Δ, compute an MIS of the sampled graph, and remove all vertices that are incident to this MIS, then the degree of the remaining vertices is at most Δ. For completeness, we present a self-contained proof of this lemma here (we note that our formulation is somewhat different from that of [13] and is tailored to our application).

Lemma 2.2 (cf. [13, 1]).

Fix any n-vertex graph G(V, E) and a parameter Δ ∈ [n]. Let S ⊆ V be a collection of vertices chosen by picking each vertex of V independently and with probability p := (10 ln n)/Δ. Suppose M_S := GreedyMIS(G[S]) and V_new := V \ (M_S ∪ N(M_S)). Then, with probability at least 1 − 1/n^9, Δ(G[V_new]) ≤ Δ.

Proof.

Define V_new := V \ (M_S ∪ N(M_S)) and fix any vertex v in the original graph G. We prove that with high probability either the degree of v in G[V_new] is less than Δ or v ∉ V_new, and then take a union bound over all vertices to conclude the proof.

We note that the process of computing M_S = GreedyMIS(G[S]) can be seen as iterating over the vertices of G in the lexicographically-first order, skipping a vertex if it is incident to the M_S computed so far, and otherwise picking it with probability p and including it in S (and hence in M_S). Let u_1, …, u_d be the neighbors of v in G, ordered accordingly. When processing the vertex u_i, if u_i is not already incident to the M_S computed so far, the probability that we pick u_i to join M_S is exactly p. As such, if we encounter at least Δ such vertices in this process, the probability that we do not pick any of them is at most:

(1 − p)^Δ ≤ exp(−p·Δ) = exp(−10 ln n) = 1/n^10.

As such, either we did not encounter Δ neighbors of v that were not incident to M_S, which implies that the degree of v in G[V_new] is less than Δ (as every neighbor of v that is incident to M_S does not belong to V_new), or we did, in which case, with probability at least 1 − 1/n^10, v itself is a neighbor of some vertex of M_S (as by the calculation above, we pick at least one of those Δ vertices) and hence does not belong to V_new. Taking a union bound over all n vertices finalizes the proof.

3 Warmup: An O(n^2/3)-Update Time Algorithm

We shall start with a simpler version of our algorithm as a warm-up.

Theorem 2.

Starting from an empty graph on n vertices, a maximal independent set can be maintained via a randomized algorithm over any sequence of edge insertions and deletions in O(n^2/3 · log^2 n) amortized time, both in expectation and with high probability. (The high probability bound holds when the length of the update sequence is sufficiently large; see the end of this section.)

The algorithm in Theorem 2 works in phases. Each phase starts with a preprocessing step in which we initiate the data structure for the algorithm and in particular compute a partial MIS of the underlying graph with some useful properties (to be specified later). Next, during each phase, we have the update step which processes the updates to the graph until a certain condition (to be defined later) is met, upon which we terminate this phase and start the next one. We now introduce each step of our algorithm during one phase.

The Preprocessing Step

The goal in this step is to find a partial MIS of the current graph with the following (informal) properties: it should be "hard" for a non-adaptive oblivious adversary to "touch" the vertices of this independent set, and maintaining an MIS in the remainder of the graph, i.e., after excluding these vertices and their neighbors from consideration, should be distinctly "easier".

In the following, we show that the sample-and-prune technique introduced in Section 2 can be used to achieve this task (we will fix the exact value of the parameter Δ later; it will be roughly n^2/3):

PreProcess(G, Δ):

  1. Let S be a set chosen by picking each vertex of V(G) independently with probability p := (10 ln n)/Δ.

  2. Compute M_S := GreedyMIS(G[S]).

  3. Return M_S.
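A Python sketch of PreProcess, reusing greedy_mis from Section 2 (the signature and the set-based representation are our own illustration); for convenience it also returns the induced partition (V_out, V_new) defined next:

```python
import random

def preprocess(vertices, adj, p):
    """Sample-and-prune preprocessing step (sketch).

    vertices: set of all vertices; adj: dict of neighbor sets;
    p: sampling probability, roughly 10 * ln(n) / delta.
    Returns (S, M_S, V_out, V_new).
    """
    S = {v for v in vertices if random.random() < p}
    # greedy MIS restricted to the sampled subgraph G[S]
    M_S = greedy_mis(sorted(S), {v: adj[v] & S for v in S})
    # vertices dominated by M_S, and the residual vertices
    V_out = {v for v in vertices - M_S
             if any(u in M_S for u in adj[v])}
    V_new = vertices - M_S - V_out
    return S, M_S, V_out, V_new
```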

Throughout this section, we use t_start to denote the time step at which M_S is computed (hence M_S = GreedyMIS(G_{t_start}[S])). We define a partitioning of the vertices of G_t at any time step t ≥ t_start:

  • M_S: the independent set computed by PreProcess (a fixed subset of S).

  • V_out: the set of vertices incident on M_S in the graph G_t that are not in M_S themselves.

  • V_new: the set of vertices neither in M_S nor incident to M_S in the graph G_t.

It is easy to see that at any time t, (M_S, V_out, V_new) partitions the vertices of the graph. We emphasize that the definition of M_S is with respect to the time step t_start and the graph G_{t_start}, while V_out and V_new are defined with respect to the graph G_t for t ≥ t_start. This means that across the time steps of a phase, the set M_S is fixed, but the remaining vertices may move between V_out and V_new. We use this partitioning to define the following key time steps in the execution of the algorithm:

  • t_1: the first time step in which M_S ≠ GreedyMIS(G_t[S]) (recall that S and M_S were computed with respect to G_{t_start} and not G_t).

  • t_2: the first time step in which the total number of times (since t_start) that vertices have moved from V_out to V_new reaches k := 20·p·L (L is the phase-length parameter fixed below).

  • t_3: the first time step in which Δ(G_t[V_new]) > Δ.

  • t_end := min{t_1, t_2, t_3, t_start + L}, where L := Δ^2/(c·log^2 n) for a sufficiently large constant c: the time step at which we terminate this phase (in other words, if any of the conditions above occurs, the phase finishes and the next phase starts).

By the definitions above, each phase starts at time step t_start, ends at time step t_end, and has length at most L. We say that a phase is successful iff t_end = t_start + L.

In the following, we prove that every phase is successful with at least a constant probability (this fact will be used later to argue that the cost of preprocessing steps can be amortized over the large number of updates between them).

Lemma 3.1.

Any given phase is successful, i.e., has t_end = t_start + L, with probability at least 3/4.

Proof.

The lemma is proved via the following three claims, which bound the probabilities of the events t_1 < t_start + L, t_2 < t_start + L, and t_3 < t_start + L, respectively. All claims crucially use the fact that the adversary is non-adaptive and oblivious, and hence we can fix its updates beforehand.

Claim 3.2.

Pr[t_1 < t_start + L] ≤ 1/10.

Proof.

For any τ ∈ (t_start, t_start + L), let e_τ = (u_τ, v_τ) denote the edge updated by the adversary at time step τ. We consider the randomness in the choice of S. The probability that both u_τ and v_τ belong to S is exactly p^2. For any such τ, define an indicator random variable X_τ which is 1 iff e_τ has both endpoints in S, and let X := Σ_τ X_τ, so E[X] = L·p^2. In order for GreedyMIS(G_t[S]) to no longer be equal to M_S for some t < t_start + L, at least one of these updates needs to have both endpoints in S. As such,

Pr[t_1 < t_start + L] ≤ Pr[X ≥ 1] ≤ E[X] = L·p^2 ≤ 1/10,

where the second inequality is by Markov's bound and the last one holds by the choice of L (for a sufficiently large constant c).

Claim 3.3.

Pr[t_2 < t_start + L] ≤ 1/10.

Proof.

For any τ ∈ (t_start, t_start + L), let e_τ = (u_τ, v_τ) denote the edge updated by the adversary at time step τ. By the randomness in the choice of S, the probability that at least one endpoint of e_τ belongs to S is at most 2p. For any such τ, define an indicator random variable Y_τ which is 1 iff at least one of u_τ or v_τ belongs to S, and let Y := Σ_τ Y_τ, so E[Y] ≤ 2·p·L.

The only way a vertex moves from V_out to V_new is that an edge incident to this vertex with its other endpoint in M_S is deleted (and this vertex has no other edge to M_S either). For this to happen k times (as in the definition of t_2), we need at least k updates in the range (t_start, t_start + L) with at least one endpoint in S (recall that M_S ⊆ S). As such,

Pr[t_2 < t_start + L] ≤ Pr[Y ≥ k] ≤ E[Y]/k ≤ 2·p·L/k = 1/10,

where the second inequality is by Markov's bound and the last step is by the choice of k = 20·p·L.

Claim 3.4.

Pr[t_3 < t_start + L] ≤ 1/n.

Proof.

Fix the graphs G_t for t ∈ (t_start, t_start + L). Recall that S is a subset of the vertices, each chosen independently with probability p. Moreover, for any t < t_1 we have M_S = GreedyMIS(G_t[S]), and hence V_new is indeed equal to V \ (M_S ∪ N(M_S)), as in Lemma 2.2 (for t ≥ t_1 the phase has already ended). As such, by Lemma 2.2 with the choice of p = (10 ln n)/Δ, for any fixed graph G_t, with probability at least 1 − 1/n^9 we have Δ(G_t[V_new]) ≤ Δ. Taking a union bound over these L ≤ n^2 graphs finalizes the proof.

By applying a union bound to Claims 3.2, 3.3, and 3.4, the probability that t_end < t_start + L is at most 1/10 + 1/10 + 1/n ≤ 1/4, finalizing the proof of Lemma 3.1.

We conclude this section with the following straightforward lemma.

Lemma 3.5.

PreProcess takes O(n + m) time, where m is the number of edges of the graph at time t_start.

The Update Algorithm

We now describe the update process during each phase. As argued before, each phase spans the time steps between t_start and t_end, where the latter is at most min{t_1, t_2, t_3}. As such, by the definition of these time steps, we have the following invariant.


Invariant 1.

At any time step t inside one phase:

  (i) M_S is an MIS of the graph G_t[S];

  (ii) Δ(G_t[V_new]) ≤ Δ.

Moreover, throughout the phase, at most k vertices are moved from V_out to V_new.

We note that the first property holds simply because t < t_1, and hence M_S = GreedyMIS(G_t[S]) is in particular an MIS of G_t[S]. The second property is by the definition of t_3, and the last one is by the definition of t_2.

Our update algorithm simply maintains the graph G_new := G_t[V_new] at all times and runs the basic deterministic algorithm of Lemma 2.1 on G_new to maintain an MIS M_new of G_new. The full MIS maintained by the dynamic algorithm is then M := M_S ∪ M_new.

We now describe the update algorithm in more detail. For any vertex v, we maintain whether it currently belongs to M_S, V_out, or V_new. Additionally, for any vertex in V_out, we maintain a list of its neighbors in M_S. Finally, we also maintain the graph G_new, which involves storing, for each vertex of V_new, the set of all of its neighbors inside V_new. Note that both edges and vertices (as opposed to only edges) may be inserted to or deleted from G_new by the algorithm (and as such, we crucially use the fact that the algorithm of Lemma 2.1 can process vertex updates as well). Fix a time step t and let e = (u, v) be the updated edge. We consider the following cases (a code sketch of this case analysis appears after the list):


  • Case 1. Updates that cannot impact the partitioning of the vertices:

    • Case 1-a. Both u and v belong to S. Such an update may change GreedyMIS(G_t[S]), and hence it concludes the current phase (the update itself is processed in the next phase).

    • Case 1-b. Both u and v belong to V_out. There is nothing to do in this case.

    • Case 1-c. Both u and v belong to V_new. We need to update the edge (u, v) in the graph G_new and pass this edge-update to the algorithm of Lemma 2.1 running on G_new.

    • Case 1-d. u belongs to V_out and v belongs to V_new (or vice versa). There is nothing to do in this case.

  • Case 2. Updates that can (potentially) change the partitioning of the vertices:

    • Case 2-a. u is in V_out and v is in M_S (or vice versa). If (u, v) is inserted, the partitioning remains the same and there is nothing to do except for updating the list of M_S-neighbors of u. However, if (u, v) is deleted, it might be that u needs to be removed from V_out and inserted into V_new instead (if u is no longer incident to M_S). If so, we iterate over all neighbors of u to find the ones in V_new. We then insert u, with all these incident edges, into G_new and pass this vertex-update to the algorithm of Lemma 2.1 on G_new.

    • Case 2-b. u is in V_new and v is in M_S (or vice versa). If (u, v) is deleted, the partitioning remains the same and there is nothing to do. However, if (u, v) is inserted, u needs to leave V_new and join V_out (as v belongs to M_S). In this case, we delete u, with all its incident edges in G_new, from G_new and run the algorithm of Lemma 2.1 to process this vertex-update in G_new.
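The following Python sketch summarizes this case analysis. It is illustrative only: the `state` object bundling the phase data, the helper names, and the vertex-update methods insert_vertex/delete_vertex (extending the earlier DynamicMIS sketch) are all our own assumptions; st.adj is assumed to be updated by the caller before dispatch.

```python
def move_out_to_new(st, w):
    """w lost its last M_S-neighbor: V_out -> V_new (Case 2-a, deletion).
    Scanning w's full neighbor list costs O(n) time."""
    st.part[w] = 'NEW'
    nbrs_in_new = {x for x in st.adj[w] if st.part[x] == 'NEW'}
    st.mis_new.insert_vertex(w, nbrs_in_new)   # vertex-update of Lemma 2.1

def move_new_to_out(st, w):
    """w gained an M_S-neighbor: V_new -> V_out (Case 2-b, insertion)."""
    st.part[w] = 'OUT'
    st.mis_new.delete_vertex(w)                # vertex-update of Lemma 2.1

def process_update(st, u, v, inserted):
    """Dispatch one edge update (u, v) following the case analysis above."""
    if u in st.S and v in st.S:                # Case 1-a: conservatively
        st.phase_over = True                   # terminate the phase
        return
    pu, pv = st.part[u], st.part[v]
    if pu == 'NEW' and pv == 'NEW':            # Case 1-c: update inside G_new
        if inserted: st.mis_new.insert_edge(u, v)
        else:        st.mis_new.delete_edge(u, v)
    elif {pu, pv} == {'OUT', 'MIS'}:           # Case 2-a
        w, x = (u, v) if pu == 'OUT' else (v, u)
        if inserted:
            st.out_nbrs[w].add(x)              # x: new M_S-neighbor of w
        else:
            st.out_nbrs[w].discard(x)
            if not st.out_nbrs[w]:             # w no longer dominated by M_S
                move_out_to_new(st, w)
    elif {pu, pv} == {'NEW', 'MIS'} and inserted:  # Case 2-b
        w, x = (u, v) if pu == 'NEW' else (v, u)
        st.out_nbrs[w] = {x}                   # start w's M_S-neighbor list
        move_new_to_out(st, w)
    # Cases 1-b (OUT, OUT) and 1-d (OUT, NEW): nothing to do.
```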

The cases above cover all possible updates. By the correctness of the deterministic algorithm of Lemma 2.1, M_new is a valid MIS of G_new. Since all vertices of V_out are incident to some vertex of M_S, it is immediate to verify that M = M_S ∪ M_new is an MIS of the graph G_t at any time step t of the phase, by Invariant 1. It only remains to analyze the running time of the update algorithm.

Lemma 3.6.

Let T denote the number of updates in a particular phase. The update algorithm maintains an MIS of the graph during this phase in O(T·Δ + k·n + n·Δ) time.

Proof.

The cost of bookkeeping the data structures in the update algorithm is O(1) per update. The two main time-consuming tasks are hence maintaining an MIS M_new of the graph G_new and maintaining the graph G_new itself.

The former task, by Lemma 2.1, requires O((T + k)·Δ(G_new) + m_new) time in total, where m_new ≤ n·Δ is the number of edges of G_new at the start of the phase and Δ(G_new), by Invariant 1, is at most Δ. Hence, this part takes O(T·Δ + k·Δ + n·Δ) time in total.

For the latter task, performing edge updates (in Case 1-c) can be done in O(1) time per update. Performing vertex-deletion updates (in Case 2-b) can also be done in O(Δ) time per update, as we only need to iterate over the neighbors of the updated vertex inside G_new. However, performing vertex-insertion updates (in Case 2-a) requires iterating over all neighbors of the inserted vertex (in G, not only in G_new) and hence takes O(n) time per such update. Nevertheless, by Invariant 1, the total number of such vertex-updates is at most k, and hence their total running time is O(k·n).

Proof of Theorem 2

We are now ready to prove Theorem 2. The correctness of the algorithm immediately follows from the case analysis above and Lemma 3.6; hence, it only remains to bound the amortized update time of the algorithm.

Fix a sequence of K updates, and let P_1, …, P_N denote the different phases of the algorithm over this sequence (i.e., each P_i corresponds to the updates inside one phase). The time spent by the overall algorithm in each phase is O(n + m) = O(n^2) in the preprocessing step (by Lemma 3.5) and O(|P_i|·Δ + k·n + n·Δ) inside the phase (by Lemma 3.6). As such, the total running time is O(K·Δ + N·(n^2 + k·n)) (since Σ_i |P_i| = K). So to finalize the proof, we only need to bound the number of phases N, which is done in the following two lemmas.

Lemma 3.7.

E[N] = O(K/L + 1) (the randomness is taken over the coin tosses of PreProcess).

Proof.

Recall that a phase is called successful iff t_end = t_start + L. The probability that any given phase is successful is at least 3/4 by Lemma 3.1. Moreover, since the randomness of PreProcess is independent between any two phases, the event that a phase is successful is independent of all previous phases (unless there are no updates left, in which case it is the last phase).

Notice that any successful phase spans exactly L updates, and hence we can have at most K/L successful phases (even if we assume that unsuccessful phases contain no updates). Consider the following randomized process: we have a coin with probability at least 3/4 of landing heads; how many times in expectation do we need to toss this coin (independently) to see K/L heads? It is immediate to verify that E[N] is at most this number plus one. It is also a standard fact that the expected number of coin tosses in this process is at most (4/3)·(K/L) (see the sanity-check simulation below). Hence, E[N] = O(K/L + 1).
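As a quick sanity check of this standard fact (the expected number of independent tosses of a coin with heads probability q needed to see k heads is k/q), one can run a short simulation; this snippet is our own illustration and not part of the paper:

```python
import random

def tosses_until(k, q, trials=10_000):
    """Average number of q-biased coin tosses needed to see k heads."""
    total = 0
    for _ in range(trials):
        heads = tosses = 0
        while heads < k:
            tosses += 1
            heads += random.random() < q
        total += tosses
    return total / trials

# e.g., tosses_until(100, 0.75) concentrates around 100 / 0.75 ~ 133.3
```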

By Lemma 3.7, the expected running time of the algorithm is O(K·Δ + (K/L + 1)·(n^2 + k·n)). By picking Δ := n^2/3, so that L = n^4/3/(c·log^2 n) and k = O(n^2/3/log n), we obtain an expected running time of O(K·n^2/3·log^2 n + n^2); the additive O(n^2) term is dominated by the first term once K ≥ n^4/3 and is avoided altogether for shorter sequences by the modification described at the end of this section. This proves the bound on the expected amortized update time in Theorem 2.

We now prove the high probability bound on the running time.

Lemma 3.8.

With probability 1 − exp(−Ω(K/L)), we have N = O(K/L) (the randomness is taken over the coin tosses of PreProcess).

Proof.

Recall the coin-tossing process described in the proof of Lemma 3.7. Consider the event that among the first 8·(K/L) coin tosses there are at most K/L heads. Since each toss is heads with probability at least 3/4, the probability of this event is at most exp(−Ω(K/L)) by a simple application of the Chernoff bound. On the other hand, the probability of this event is at least the probability that the algorithm runs for more than 8·(K/L) phases, as we cannot have more than K/L successful phases among K updates (each successful phase "consumes" L updates). This concludes the proof.

By the choice of Δ, if K ≥ n^4/3, then K/L = Ω(log^2 n), and hence by Lemma 3.8 the running time of the algorithm is O(K·Δ + (K/L)·(n^2 + k·n)) = O(K·n^2/3·log^2 n) with high probability, finalizing the proof of this part.

If however K < n^4/3, we only need one successful phase to process all the updates. In this case, since every phase is successful with constant probability, with high probability we only need to consider O(log n) phases before we are done. Moreover, note that when the number of updates is at most K, the total number of edges in the graph is also only O(K) (as we start from an empty graph), and hence the preprocessing time is O(K + n) per phase, as opposed to O(n^2). This means that the total running time in this case is at most O((K + n)·log n) (for the preprocessing steps) plus O(K·n^2/3·log^2 n) (time spent inside the phases). This concludes the proof of Theorem 2.

4 An Improved O(m^1/3)-Update Time Algorithm

We now show that one can alter the algorithm in Theorem 2 to obtain improved performance for sparser graphs. Formally,

Theorem 3.

Starting from an empty graph, a maximal independent set can be maintained via a randomized algorithm over any sequence of edge insertions and deletions in O(m^1/3 · log^2 n) amortized update time, both in expectation and with high probability, where m denotes the dynamic number of edges.

The following lemma is a somewhat weaker-looking version of Theorem 3. However, we prove next that this lemma is all we need to prove Theorem 3.

Lemma 4.1.

Starting with an arbitrary graph on m̄ edges, a maximal independent set can be maintained via a randomized algorithm over any sequence of K edge insertions and deletions in O(K · m̄^1/3 · log^2 n + m̄) time, in expectation and with high probability, as long as the number of edges in the graph remains within a factor 2 of m̄.

We first prove that this lemma implies Theorem 3. The proof of this part is standard (see, e.g. [3]) and is only provided for completeness.

Proof of Theorem 3.

For simplicity, we define m̄ := 1 in the case of empty graphs. The idea is to run the algorithm of Lemma 4.1 until the number of edges deviates from m̄ by a factor of more than 2, at which point we terminate the algorithm and restart the process with the current number of edges as the new value of m̄. Between two consecutive restarts, the number of updates is at least m̄/2, so we can apply Lemma 4.1 and obtain a bound of O(m^1/3 · log^2 n) on the expected amortized update time (as the number of edges stays within a constant factor of m̄ throughout this period). Moreover, we can "charge" the O(m̄) time needed to restart the process to the at least m̄/2 updates happening in this period and obtain the final bound.
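A Python sketch of this standard restart wrapper; make_algorithm and process_update are hypothetical stand-ins for the algorithm of Lemma 4.1:

```python
def run_with_restarts(updates, make_algorithm):
    """Process edge updates, restarting the fixed-size algorithm of
    Lemma 4.1 whenever the edge count m leaves [m_bar/2, 2*m_bar]."""
    edges = set()
    m_bar = 1                                    # convention for empty graphs
    algo = make_algorithm(edges, m_bar)
    for (u, v, inserted) in updates:
        e = (min(u, v), max(u, v))               # normalize the edge key
        edges.add(e) if inserted else edges.discard(e)
        m = max(len(edges), 1)
        if not (m_bar / 2 <= m <= 2 * m_bar):
            m_bar = m                            # re-center the estimate;
            algo = make_algorithm(edges, m_bar)  # O(m_bar) restart charged to
        else:                                    # the >= m_bar/2 updates of
            algo.process_update(u, v, inserted)  # the finished period
    return algo
```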

The rest of this section is devoted to the proof of Lemma 4.1. The algorithm of Lemma 4.1 is similar to the one in Theorem 2; in particular, it again executes multiple phases, each starting with the same preprocessing step (although with a different choice of parameters), followed by the update algorithm throughout the phase. We now describe the preprocessing step and the update algorithm inside each phase. Recall that throughout this proof, m̄ denotes a fixed 2-approximation of the number of edges in the graph.

The Preprocessing Step

Let t_start again denote the first time step of this phase. The preprocessing step of the new algorithm is exactly as before, running PreProcess(G_{t_start}, Δ) for Δ := m̄^1/3 (this value of Δ is different from the one in Section 3, which was n^2/3). We define the partitioning (M_S, V_out, V_new) of the vertices as before. However, we change the stopping criteria of the phase and the definitions of the key time steps as follows:

  • t_1: the first time step in which M_S ≠ GreedyMIS(G_t[S]) (recall that S and M_S were computed with respect to G_{t_start} and not G_t).

  • t_2: the first time step in which the total number of times (since t_start) that vertices have moved from V_out to V_new reaches k := 20·p·L.

  • t_3: the first time step in which Δ(G_t[V_new]) > Δ.

  • t_end := min{t_1, t_2, t_3, t_start + L}, where now L := Δ^2/(c·log^2 n) = m̄^2/3/(c·log^2 n): the time step at which we terminate this phase.

We again say that a phase is successful if t_end = t_start + L, i.e., if we process L updates in the phase before terminating it. Similarly to Lemma 3.1, we prove that each phase is successful with at least a constant probability.

Lemma 4.2.

Any given phase is successful with probability at least 3/4.

Proof.

The proof is quite similar to that of Lemma 3.1 and is again based on the fact that the adversary is non-adaptive and oblivious.

Claim 4.3.

Pr[t_1 < t_start + L] ≤ 1/10.

Proof.

The proof is identical to that of Claim 3.2, after substituting the new values of p and L.

Claim 4.4.

Pr[t_2 < t_start + L] ≤ 1/10.

Proof.

Again, the proof is identical to that of Claim 3.3, after substituting the new values of p, L, and k.

Claim 4.5.

Pr[t_3 < t_start + L] ≤ 1/n.

Proof.

Fix the graphs G_t for t ∈ (t_start, t_start + L) and note that each G_t has at most 4·m̄ vertices of non-zero degree (as the number of edges in G_t is at most 2·m̄); we can ignore vertices of degree zero, as they do not affect the following calculation. By Lemma 2.2, with the choice of p = (10 ln n)/Δ, for any fixed graph G_t (with O(m̄) vertices of non-zero degree), with probability at least 1 − 1/n^9 we have Δ(G_t[V_new]) ≤ Δ. Taking a union bound over these L graphs finalizes the proof.

By applying a union bound to Claims 4.3, 4.4, and 4.5, the probability that t_end < t_start + L is at most 1/10 + 1/10 + 1/n ≤ 1/4, finalizing the proof of Lemma 4.2.

We conclude this part by noting that, by Lemma 3.5, the preprocessing step of this algorithm takes O(n + m̄) time. However, a simple trick reduces this running time to O(m̄), as follows.

Lemma 4.6.

The preprocessing step of the new algorithm can be implemented in O(m̄) time.

Proof.

Initially, there are at most O(m̄) vertices with non-zero degree at the time of the preprocessing step. Hence, instead of picking the set S from all of V, we only pick it from the vertices with non-zero degree, which can be done in O(m̄) time. Later in the algorithm, whenever a previously isolated vertex gains its first edge in this phase, we toss a coin and add this vertex to S with probability p, which takes O(1) time; we then process the update as before, as if this vertex had always belonged to S (or to V \ S). It is immediate to verify that this does not change any part of the algorithm.
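A Python sketch of this lazy-sampling trick (the class and method names are ours): membership in S is decided up front only for vertices that currently have an edge, and on first appearance for everyone else.

```python
import random

class LazySampler:
    """Decide membership in the random set S lazily (sketch).

    Vertices with non-zero degree at the start of the phase are sampled
    up front; an isolated vertex is sampled the first time it gains an
    edge, which is distributionally identical to sampling all vertices.
    """

    def __init__(self, active_vertices, p):
        self.p = p
        self.decided = {v: (random.random() < p) for v in active_vertices}

    def in_S(self, v):
        if v not in self.decided:        # first time v appears: toss now
            self.decided[v] = random.random() < self.p
        return self.decided[v]
```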

The Update Algorithm

We now describe the new update algorithm. First, similarly to Invariant 1 in the previous section, here too, by the definition of each phase, we have:


Invariant 2.

At any time step t inside one phase:

  (i) M_S is an MIS of the graph G_t[S];

  (ii) Δ(G_t[V_new]) ≤ Δ = m̄^1/3.

Moreover, throughout the phase, at most k vertices are moved from V_out to V_new.

The update algorithm is similar to the one in the previous section: we maintain the graph G_new and use the algorithm of Lemma 2.1 to maintain an MIS M_new of G_new. The main difference is in how we maintain the graph G_new (the rest is exactly as before). To this end, we present a simple data structure.

The Data Structure.

As before, for every vertex we maintain the list of all its neighbors, as well as which of the sets M_S, V_out, or V_new it belongs to. Clearly, this information can be updated in O(1) time per update. In addition to the partition (M_S, V_out, V_new), we also partition the vertices based on their degree in the original graph at the beginning of the phase, i.e., in G_{t_start}. Specifically, we define V_high to be the set of vertices with degree at least m̄^2/3 in G_{t_start}, and V_low to be the remaining vertices. Note that this partitioning is defined with respect to the graph G_{t_start} and does not change throughout the phase. We have the following simple claim.

Claim 4.7.

Throughout one phase:

  1. |V_high| = O(m̄^1/3).

  2. For any vertex v ∈ V_low and any graph G_t during the phase, the degree of v in G_t is O(m̄^2/3).

Proof.

The first part holds simply because each vertex of V_high has degree at least m̄^2/3 in G_{t_start}, and the total number of edges is at most 2·m̄. The second part holds because the total number of updates inside a phase is at most L ≤ m̄^2/3 by the definition of t_end; hence, even if all of them are incident to the same vertex of V_low, the degree of this vertex is at most m̄^2/3 + L = O(m̄^2/3), finalizing the proof.

Finally, for any vertex v ∈ V_high, we maintain a list of all of its neighbors in V_new, as follows: whenever a vertex moves between V_new and V_out, it iterates over all vertices of V_high and informs them of this update. This way, the vertices of V_high are always aware of their neighborhood inside V_new. The remaining vertices (those in V_low) have a relatively small degree, and hence, whenever needed, we can simply iterate over all their neighbors and find the ones in V_new. As a result, we have the following invariant (a code sketch of this data structure is given below).
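A Python sketch of this data structure (names are ours; adjacency is assumed to be stored as sets, and the 'NEW' markers match the earlier dispatch sketch):

```python
class NewNeighborIndex:
    """Per-vertex access to neighbors inside V_new (sketch).

    high: vertices of degree >= m_bar^(2/3) at the start of the phase
          (|high| = O(m_bar^(1/3)) by Claim 4.7); each keeps an explicit
          list of its V_new-neighbors.
    low:  all other vertices; their degree stays O(m_bar^(2/3)) during
          the phase, so their V_new-neighbors are found by a direct scan.
    """

    def __init__(self, adj, part, high):
        self.adj, self.part, self.high = adj, part, high
        self.new_nbrs = {v: {u for u in adj[v] if part[u] == 'NEW'}
                         for v in high}

    def on_move(self, w, now_in_new):
        """w moved between V_new and V_out: inform all high vertices.
        Costs O(|high|) = O(m_bar^(1/3)) per move."""
        for v in self.high:
            if w in self.adj[v]:
                (self.new_nbrs[v].add if now_in_new
                 else self.new_nbrs[v].discard)(w)

    def neighbors_in_new(self, v):
        if v in self.high:
            return self.new_nbrs[v]            # stored explicitly
        return {u for u in self.adj[v]         # O(m_bar^(2/3)) scan
                if self.part[u] == 'NEW'}
```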


Invariant 3.

At any time step t inside one phase, after updating the edge (u, v):

  (i) We can find the list of all neighbors of u and of v that belong to V_new in O(m̄^2/3) time.

  (ii) Updating the data structure after the update takes O(m̄^1/3) time.

Proof.

For vertices in V_high, we maintain the list of their V_new-neighbors explicitly, and hence we can return this list directly. For vertices in V_low, we can simply iterate over their O(m̄^2/3) neighbors (by Claim 4.7), check which of them belong to V_new, and create the list in O(m̄^2/3) time. Finally, the update time is O(m̄^1/3), as there are only O(m̄^1/3) vertices in V_high (by Claim 4.7), and a vertex that moves between V_new and V_out only needs to inform these vertices upon the update.

Processing Each Update.

We process each update exactly as in the previous section, except that we use Invariant 3 for maintaining the graph G_new. To be more specific, in Case 2-a, where a vertex may be inserted into V_new, we use the list of Invariant 3 to find all neighbors of this vertex in V_new, and then pass this vertex-update to the algorithm of Lemma 2.1 on G_new. The remaining cases are handled exactly as before.

The correctness of the algorithm follows as before and we only analyze the running time of the update algorithm.

Lemma 4.8.

Fix any phase and let T denote the number of updates inside this phase. The update algorithm maintains an MIS of the input graph (deterministically) in O(T·m̄^1/3 + k·m̄^2/3 + m̄) time.

Proof.

By Invariant 3, updating the data structure takes O(m̄^1/3) time per update. Maintaining the MIS M_new of the graph G_new requires O((T + k)·Δ + m̄) = O(T·m̄^1/3 + k·m̄^1/3 + m̄) time by Lemma 2.1, as Δ(G_new) ≤ Δ = m̄^1/3 throughout the phase by Invariant 2. Finally, by Invariant 3, we can find the V_new-neighbors of any updated vertex in O(m̄^2/3) time. Since the total number of times we need this operation is at most k by Invariant 2 (we only need it when a vertex moves from V_out to V_new), the total time needed for this part is O(k·m̄^2/3), finalizing the proof.

Proof of Lemma 4.1

The correctness of the algorithm immediately follows from Lemma 4.8; hence, it only remains to bound the amortized update time of the algorithm. Fix a sequence of K updates, and let P_1, …, P_N denote the different phases of the algorithm over this sequence (i.e., each P_i corresponds to the updates inside one phase). The time spent by the overall algorithm in each phase is O(m̄) in the preprocessing step (by Lemma 4.6) and O(|P_i|·m̄^1/3 + k·m̄^2/3 + m̄) inside the phase (by Lemma 4.8). As such, the total running time is O(K·m̄^1/3 + N·(k·m̄^2/3 + m̄)) (since Σ_i |P_i| = K). So to finalize the proof, we only need to bound the number of phases N, which we do in the following lemma.

Lemma 4.9.

E[N] = O(K/L + 1) (the randomness is taken over the coin tosses of PreProcess).

Proof.

Recall that a phase is called successful iff t_end = t_start + L. The probability that any given phase is successful is at least 3/4 by Lemma 4.2. Moreover, since the randomness of PreProcess is independent between any two phases, the event that a phase is successful is independent of all previous phases (unless there are no updates left, in which case it is the last phase).

Notice that any successful phase spans exactly L updates, and hence we can have at most K/L successful phases (even if we assume the other phases contain no updates). Consider the following randomized process: we have a coin with probability at least 3/4 of landing heads; how many times in expectation do we need to toss this coin (independently) to see K/L heads? It is immediate to verify that E[N] is at most this number plus one. It is also a standard fact that the expected number of coin tosses in this process is at most (4/3)·(K/L). Hence, E[N] = O(K/L + 1).

By Lemma 4.9, the expected running time of the algorithm is O(K·m̄^1/3 + (K/L + 1)·(k·m̄^2/3 + m̄)) = O(K·m̄^1/3·log^2 n + m̄), concluding the proof of the expectation bound in Lemma 4.1. The extension to the high probability result is now exactly the same as in Lemma 3.8. This concludes the proof of Lemma 4.1.

5 Main Algorithm: An O(√n)-Update Time Algorithm

We now present our main algorithm for maintaining an MIS in a dynamic graph with O(√n) expected amortized update time (up to polylogarithmic factors).

Theorem 4.

Starting from an empty graph on n vertices, a maximal independent set can be maintained via a randomized algorithm over any sequence of edge insertions and deletions in O(√n) amortized time (up to polylogarithmic factors), both in expectation and with high probability.

The improvement in Theorem 4 over our previous algorithm in Theorem 2 is obtained by using a nested collection of phases instead of just one phase. Let ℓ denote the number of levels (a parameter to be fixed later). We maintain ℓ subgraphs of the input graph at any time step of the algorithm, referred to as level graphs. For any level i ∈ [ℓ], we compute and maintain the subgraph G^i at level i in a level-i phase. A phase, as before, consists of a preprocessing step followed by update steps during the phase, together with a termination criterion for the phase. Moreover, the phases across different levels are nested in the sense that a level-1 phase consists of multiple level-2 phases, a level-2 phase contains multiple level-3 phases, and so on. We now describe our algorithm in more detail, starting with the nested family of level graphs.

Level Graphs

Our approach is based on computing and maintaining a collection of graphs G^1, …, G^ℓ, referred to as level graphs, which are subgraphs of G, together with a collection of independent sets M^1, …, M^ℓ. We maintain the following main invariant in our algorithm (we prove different parts of this invariant in this and the next two sections).


Invariant 4 (Main Invariant).

At any time step t and for any i ∈ [ℓ]:

  1. M^{≤ i} := M^1 ∪ ⋯ ∪ M^i is a maximal independent set of the induced subgraph of G_t on V \ V^{i+1}.

  2. Δ(G^i) ≤ Δ_i (for parameters Δ_1 > Δ_2 > ⋯ > Δ_ℓ to be determined later).

  3. G^i is maintained explicitly by the algorithm, with adjacency-list access for every vertex.

We start by defining the three main collections of vertex sets M^i, V_out^i, and V^i used in our algorithm (when clear from the context, or irrelevant, we may drop the time subscript t from these sets). For simplicity of notation, we also define M^{≤ i} := M^1 ∪ ⋯ ∪ M^i and V_out^{≤ i} := V_out^1 ∪ ⋯ ∪ V_out^i for all i ∈ [ℓ]. We design these sets carefully in the next section so as to satisfy the properties below.

Proposition 5.1.

At any time step t:

  1. The sets M^1, …, M^ℓ are all pairwise disjoint.

  2. The sets V^1, …, V^ℓ are nested, i.e., V^1 ⊇ V^2 ⊇ ⋯ ⊇ V^ℓ.

  3. For any fixed i ∈ [ℓ], the sets M^i, V_out^i, and V^{i+1} partition V^i.

For any i ∈ [ℓ], the level-i graph G^i is defined as the induced subgraph of G on V^i, i.e., G^i := G_t[V^i] (where V^1 := V, hence G^1 = G_t). Moreover, M^i is chosen carefully from the graph G^i so that M^i is an independent set of G^i, and V_out^i consists of the vertices of V^i \ M^i that are incident to M^i; M^i will in fact be an MIS of an appropriately sampled subgraph of G^i, computed by the preprocessing step of the level-i phase. We further have,

Proposition 5.2.

At any time step t:

  1. For any i ∈ [ℓ], the independent set M^i is an MIS of G_t[M^i ∪ V_out^i].

  2. For any i ∈ [ℓ], every vertex of V_out^i is incident to some vertex of M^i and has no neighbor in M^{≤ i−1}.

Before we move on from this section, we show that Propositions 5.1 and 5.2 imply Part (1) of Invariant 4.

Proof of Part-(1) of Invariant 4. Fix i ∈ [ℓ]. By Proposition 5.1, V \ V^{i+1} is the disjoint union of M^1, V_out^1, …, M^i, V_out^i. The set M^{≤ i} is independent: each M^j is independent, and for j < j′, every vertex of M^{j′} ⊆ V^{j′} ⊆ V^{j+1} is not incident to M^j (by Part (3) of Proposition 5.1 and the definition of V_out^j). It is also maximal in the induced subgraph of G_t on V \ V^{i+1}: every vertex of this subgraph belongs to some M^j or V_out^j with j ≤ i, and in the latter case it is incident to a vertex of M^j by Part (2) of Proposition 5.2.