# Adapting Local Sequential Algorithms to the Distributed Setting

It is a well-known fact that sequential algorithms which exhibit a strong "local" nature can be adapted to the distributed setting given a legal graph coloring. The running time of the distributed algorithm will then be at least the number of colors. Surprisingly, this well-known idea was never formally stated as a unified framework. In this paper we aim to define a robust family of local sequential algorithms which can be easily adapted to the distributed setting. We then develop new tools to further enhance these algorithms, achieving state-of-the-art results for fundamental problems. We define a simple class of greedy-like algorithms which we call orderless-local algorithms. We show that given a legal $c$-coloring of the graph, every algorithm in this family can be converted into a distributed algorithm running in $O(c)$ communication rounds in the CONGEST model. We show that this family is indeed robust, as both the method of conditional expectations and the unconstrained submodular maximization algorithm of Buchbinder et al. [9] can be expressed as orderless-local algorithms for local utility functions, that is, utility functions which have a strong local nature to them. We use the above algorithms as a base for new distributed approximation algorithms for the weighted variants of some fundamental problems: Max $k$-Cut, Max-DiCut, Max 2-SAT and correlation clustering. We develop algorithms which have the same approximation guarantees as their sequential counterparts, up to a constant additive $\epsilon$ factor, while achieving an $O(\log^* n)$ running time for deterministic algorithms and an $O(\epsilon^{-1})$ running time for randomized ones. This improves exponentially upon the currently best known algorithms.

## 1 Introduction

A large part of research in the distributed environment aims to develop fast distributed algorithms for problems which have already been studied in the sequential setting. Ideally, we would like to use the power of the distributed environment to achieve a substantial improvement in the running time over the sequential algorithm, and indeed, for many problems distributed algorithms achieve an exponential improvement over the sequential case. One approach to designing distributed algorithms is to use the sequential algorithm as a natural starting point [11, 6, 5, 17, 7] and then make certain adjustments for the distributed environment in order to achieve a faster running time.

There is a well-known folklore result in distributed computing, which roughly says that if a sequential graph algorithm works by traversing nodes in any order (perhaps adversarial), and for every node makes a local decision, then given a legal $c$-coloring of the graph, the algorithm can be adapted to the distributed setting by going over all color classes, and for each class executing all of its nodes simultaneously. Surprisingly, there is no formal framework describing the above. In this paper we provide such a framework for a specific class of algorithms (defined later).

We note that for general graphs a legal coloring may require $\Delta + 1$ colors, where $\Delta$ is the maximal degree of the graph. Using the above framework we aim to answer the following question: are there certain classes of algorithms for which the above can result in a running time sublinear in $\Delta$? We show that for certain approximation problems the answer is quite surprising, as we are able to achieve an almost constant running time.

More precisely, we show that for the problems of Max $k$-Cut, Max-DiCut, Max 2-SAT and correlation clustering we can adapt the sequential algorithms for these problems in such a way that the running time is $O(\log^* n)$ rounds for deterministic algorithms and $O(\epsilon^{-1})$ rounds for randomized ones, while losing only an additive $\epsilon$-factor in the approximation ratio. For the problems of Max-Cut and Max-DiCut this greatly improves upon the previous best known results, which required a number of rounds linear in $\Delta$. A summary of our results appears in Table 1.

### 1.1 Tools and results

In this paper we focus our attention on approximation algorithms for unconstrained optimization problems on graphs. We are given some graph $G=(V,E)$, where each vertex $v$ is assigned a variable $X_v$ taking values in some set $A$. We aim to maximize some utility function $f$ over these variables (for a formal definition see Section 2). Our distributed model is the CONGEST model of distributed computation, where the network is represented by a graph such that nodes are computational units and edges are communication links. Nodes communicate in synchronous communication rounds, where at each round a node sends and receives messages from all of its neighbors. In the CONGEST model the size of messages sent between nodes is limited to $O(\log n)$ bits, where $n = |V|$. This is more restrictive than the LOCAL model, where message size is unbounded. Our complexity measure is the number of communication rounds of the algorithm.

Adapting a sequential algorithm of the type we describe above to the distributed setting means that we wish each node $v$ in the communication graph to output an assignment to $X_v$ such that the approximation guarantee is close to that of the sequential algorithm, while minimizing the number of communication rounds of the distributed algorithm. Our goal is to formally define a family of sequential algorithms which can be easily converted to distributed algorithms, and then develop tools that allow these algorithms to run exponentially faster, while achieving almost the same approximation ratio. To achieve this we focus our attention on a family of sequential algorithms which exhibit a very strong local nature.

We define a family of utility functions, which we call local utility functions (formally defined in Section 2). We say that a utility function is a local utility function if the change to the value of the function upon setting one variable can be computed locally. Intuitively, while optimizing a general, global utility function in the distributed setting might be difficult, the local nature of local utility functions makes them a natural candidate for distributed optimization.

We focus on adapting a large family of potentially randomized local algorithms to the distributed setting. We consider orderless-local algorithms: algorithms that can traverse the variables in any order and in each iteration apply some local function to decide the value of the variable. By local we mean that the decision depends only on the local environment of the node in the graph, the variables of nodes adjacent to that variable, and some randomness used only by that node. This is similar to the family of Priority algorithms first defined in [8]. The goal of [8] was to formally define the notion of a greedy algorithm, and then to explore the limits of these algorithms. Our definition is similar (and can be expressed as a special case of priority algorithms), but the goal is different. While [8] aims to prove lower bounds, we provide some sufficient conditions that allow us to easily transform local sequential algorithms into fast distributed algorithms.

Our definitions are also similar to the SLOCAL model [20], which also shows that sequential algorithms which traverse the graph vertices in any order and make local decisions can be adapted to the distributed LOCAL model in polylogarithmic rounds using randomization. While the results of [20] are much broader, our transformation does not require any randomization and works in the CONGEST model. Finally, we should also mention the field of local computation algorithms [34], whose aim is developing efficient local sequential algorithms. We refer the reader to an excellent survey by Levi and Medina [29].

One might expect that, due to the locality of this family of algorithms, it can be distributed if the graph is provided with a legal coloring. The distributed algorithm goes over the color classes one after another and executes all nodes in the color class simultaneously. This resolves any conflicts that may arise from executing two neighboring nodes at the same time, while the orderless property guarantees that this execution is valid. In a sense this argument was already used for specific algorithms (Coloring to MIS [31], MaxIS of [5], Max-Cut of [11]). We provide a more general result, using this classical argument. Specifically, we show that given a legal $c$-coloring, any orderless-local algorithm can be distributed in $O(c)$ communication rounds in the CONGEST model.

To show that this definition is indeed robust, we show two general applications. The first is adapting the method of conditional expectations (formally defined in Section 2) to the distributed setting. This method is inherently sequential, but we show that if the utility function being optimized is a local utility function, then the algorithm is an orderless-local algorithm. A classical application of this technique is for Max $k$-Cut, where a $(1-1/k)$-approximation is achieved in expectation when every node chooses a cut side at random. This can be derandomized using the method of conditional expectations, and adapted to the distributed setting, as the cut function is a local utility function. We note that the exact same approach results in a 1/2-approximation for max-agree correlation clustering on general graphs (see Section 2 for a definition). Because the tools used for Max-Cut directly translate to correlation clustering, we focus on Max-Cut for the rest of the paper, and only mention correlation clustering at the very end.

The second application is the unconstrained submodular maximization algorithms of [9], where a deterministic 1/3-approximation algorithm and a randomized expected 1/2-approximation algorithm are presented. We show that both are orderless-local algorithms when provided with a local utility function. This can be applied to the problem of Max-DiCut, as its utility function is both an unconstrained submodular function and a local utility function. The algorithms of [9] were already adapted to the distributed setting for the specific problem of Max-DiCut by [11] using similar ideas. The main benefit of our definition is the convenience and generality of adapting these algorithms without the need to consider their analysis or reprove correctness. We conclude that the family of orderless-local algorithms indeed contains robust algorithms for fundamental problems, and especially the method of conditional expectations.

At the time this paper was first made public, there was no distributed equivalent for the method of conditional expectations. We have since learned that, independently and simultaneously, an adaptation of the method of conditional expectations to the distributed setting was also presented in [19]. Their results show how the method of conditional expectations combined with a legal coloring can be used to convert any randomized LOCAL -round algorithm for a locally checkable problem to a deterministic one, running in . (Footnote: They actually show that the running time is either or , achieving the latter via network decomposition. We focus on the first bound, as the second is less relevant for the comparison which follows.) This is done via a transformation to an SLOCAL algorithm, where the derandomization is applied, and then transforming back to a LOCAL algorithm.

Although it is not stated for the CONGEST model, we believe that their application of the method of conditional expectations also works in the CONGEST model, and is then equivalent to our results. Another difference, apart from the different model of communication, is that they focus on derandomizing locally checkable problems, while we focus on local utility functions. These two families of problems are different, as the approximation guaranteed for a certain local utility function need not be locally checkable. This last point highlights the different goals of the two papers. While [19] skillfully show that a large family of LOCAL algorithms can be derandomized, we aim to adapt sequential algorithms to the distributed setting while achieving as fast a running time as possible in the more restrictive CONGEST model; hence we focus on local utility functions, which capture the locality of the optimization process.

Next, we wish to consider the running time of these algorithms. Recall that we expressed the running time of orderless-local algorithms in terms of the number of colors of some legal coloring of the graph. For a general graph, we cannot hope for a legal coloring using fewer than $\Delta + 1$ colors, where $\Delta$ is the maximum degree in the graph. This means that using the distributed version of an orderless-local algorithm unchanged will have a running time linear in $\Delta$. We show how to overcome this obstacle for Max $k$-Cut and Max-DiCut. The general idea is to compute a defective coloring of the graph which uses few colors, drop all monochromatic edges, and call the algorithm on the new graph, which now has a legal coloring.

A key tool in our algorithms is a new type of defective coloring we call a weighted $\epsilon$-defective coloring. The classical defective coloring allows each vertex to have at most $d$ monochromatic edges, for some defect parameter $d$. We consider positively edge-weighted graphs and require a weighted fractional defect: for every vertex the total weight of monochromatic edges is at most an $\epsilon$-fraction of the total weight of all edges of that vertex. We show that a weighted $\epsilon$-defective coloring using $O(\epsilon^{-2})$ colors can be computed deterministically in $O(\log^* n)$ rounds using the defective coloring algorithm of [28]. The classical algorithm of Kuhn was found useful in the adaptation of sequential algorithms to the distributed setting [16, 20], thus its effectiveness for weighted $\epsilon$-defective coloring might be of further use.

Although we cannot guarantee a legal coloring with a small number of colors for any graph $G$, we may remove some subset of $E$, which results in a new graph $G'$ with a low chromatic number. We wish to do so while decreasing the total sum of edge weights in $G$ only slightly, which we prove guarantees that the approximation will only be mildly affected for our cut problems. Formally, we show that if we only decrease the total edge weight by an $\epsilon$-fraction, we will incur an additive $\epsilon$-loss in the approximation ratio of the cut algorithms for $G$. For the randomized algorithm this is easy: simply color each vertex randomly with a color in $\{1, \dots, \lceil \epsilon^{-1} \rceil\}$ and drop all monochromatic edges. For the deterministic case, we execute our weighted $\epsilon$-defective coloring algorithm, and then remove all monochromatic edges. We then execute the relevant cut algorithm on the resulting graph, which now has a legal coloring using a small number of colors. The above results in extremely fast approximation algorithms for weighted Max $k$-Cut and weighted Max-DiCut, while having almost the same approximation ratio as their sequential counterparts.
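The randomized pruning step described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `random_color_and_prune`, the edge-dictionary representation and the `seed` parameter are hypothetical and not taken from the paper. Each vertex picks one of $\lceil 1/\epsilon \rceil$ colors uniformly at random, so every edge is monochromatic with probability at most $\epsilon$, and the surviving graph is legally colored by construction.

```python
import math
import random
from typing import Dict, Hashable, Iterable, Optional, Tuple

Node = Hashable
Edges = Dict[Tuple[Node, Node], float]  # weighted edges keyed by endpoint pair


def random_color_and_prune(
    nodes: Iterable[Node], edges: Edges, eps: float, seed: Optional[int] = None
) -> Tuple[Dict[Node, int], Edges]:
    """Color every node uniformly at random and drop monochromatic edges.

    Hypothetical helper: returns the coloring and the surviving edges.
    """
    rng = random.Random(seed)
    num_colors = math.ceil(1 / eps)  # about 1/eps colors in total
    color = {v: rng.randrange(num_colors) for v in nodes}
    # An edge survives only if its endpoints received different colors,
    # so `color` is a legal coloring of the surviving graph.
    kept = {(u, v): w for (u, v), w in edges.items() if color[u] != color[v]}
    return color, kept


if __name__ == "__main__":
    nodes = ["a", "b", "c", "d"]
    edges = {("a", "b"): 1.0, ("b", "c"): 2.0, ("c", "d"): 3.0, ("a", "d"): 4.0}
    color, kept = random_color_and_prune(nodes, edges, eps=0.25, seed=7)
    removed_weight = sum(edges.values()) - sum(kept.values())
    print(color, kept, removed_weight)
```

In a distributed implementation each node would draw its own color locally, which is why this step needs no communication beyond telling neighbors the chosen color.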

Finally, our techniques can also be applied to the problem of weighted Max 2-SAT. To do so we may use the randomized expected 3/4-approximation algorithm presented in [33]. It is based on the algorithm of [9], and thus is almost identical to the unconstrained submodular maximization algorithm. Because the techniques we use are very similar to the above, we defer the entire proof to the appendix.

### 1.2 Previous research

Cut problems: An excellent overview of the Max-Cut and Max-DiCut problems appears in [11], which we follow in this section. Computing Max-Cut exactly is NP-hard, as shown by Karp [26] for the weighted version, and by [18] for the unweighted case. As for approximations, it is impossible to improve upon a 16/17-approximation for Max-Cut and a 12/13-approximation for Max-DiCut unless $P = NP$ [37, 23]. If every node chooses a cut side randomly, an expected 1/2-approximation for Max-Cut, a 1/4-approximation for Max-DiCut and a $(1-1/k)$-approximation for Max $k$-Cut are achieved. This can be derandomized using the method of conditional expectations. In the breakthrough paper of Goemans and Williamson [22] a 0.878-approximation for Max-Cut is achieved using semidefinite programming. This is optimal under the unique games conjecture [27]. In the same paper a 0.796-approximation for Max-DiCut was presented. This was later improved to 0.863 in [MatuuraM01]. Other results using different techniques are presented in [25, 36].

In the distributed setting the problem has not received much attention. A node may choose a cut side at random, achieving the same guarantees as above in constant time. In [24] a distributed algorithm for -regular triangle-free graphs which achieves a -approximation ratio in a single communication round is presented. The only results for general graphs in the distributed setting are due to [11]. In the CONGEST model they present a deterministic 1/2-approximation for Max-Cut, a deterministic 1/3-approximation for Max-DiCut, and a randomized expected 1/2-approximation for Max-DiCut running in communication rounds. The results for Max-DiCut follow from adapting the unconstrained submodular maximization algorithm of [9] to the distributed setting. Better results are presented for the LOCAL model; we refer the reader to [11] for the full details.

Max 2-SAT: The decision version of Max 2-SAT is NP-complete [18], and there exist several approximation algorithms [22, 15, 30, 32], of which the best currently known approximation ratio is 0.9401 [30]. In [3] it is shown that, assuming the unique games conjecture, the approximation factor of [30] cannot be improved. Assuming only that $P \neq NP$, it cannot be approximated to within a 21/22-factor [23]. To the best of our knowledge the problem of Max 2-SAT (or Max-SAT) was not studied in the distributed model.

Correlation clustering: An excellent overview of correlation clustering (see Section 2 for a definition) appears in [1], which we follow in this section. Correlation clustering was first defined by [4]. Solving the problem exactly is NP-hard, thus we are left with designing approximation algorithms for the problem. Here one can try to approximate max-agree or min-disagree. If the graph is a clique, there exists a PTAS for max-agree [4, 21], and a 2.06-approximation for min-disagree [13]. For general (even weighted) graphs there exists a 0.7666-approximation for max-agree [12, 35], and an $O(\log n)$-approximation for min-disagree [14]. A trivial 1/2-approximation for max-agree on general graphs can be achieved by considering putting every node in a separate cluster, then considering putting all nodes in a single cluster, and taking the more profitable of the two.

In the distributed setting little is known about correlation clustering. In [10] a dynamic distributed MIS algorithm is provided, and it is stated that this achieves a 3-approximation for min-disagree correlation clustering, as it simulates the classical algorithm of Ailon et al. [2]. We note that the algorithm of Ailon et al. assumes the graph to be a clique, thus the above result is limited to complete graphs where the edges of the communication graph are taken to be the positive edges, and the non-edges are taken as the negative edges (as indeed for general graphs the problem is APX-hard, and difficult to approximate better than $O(\log n)$ [14]). We also note that using only two clusters, where each node chooses a cluster at random, guarantees an expected 1/2-approximation for max-agree on weighted general graphs. We derandomize this approach in this paper.

## 2 Preliminaries

Sequential algorithms: The main goal of this paper is converting (local) sequential graph algorithms for unconstrained maximization (or minimization) to distributed graph algorithms. Let us first formally define this family of algorithms. The sequential algorithm receives as input a graph $G=(V,E)$, and we associate each vertex $v \in V$ with a variable $X_v$ taking values in some finite set $A$. The algorithm outputs a set of assignments $\overline{X} = \{X_v = \alpha_v\}$. The goal of the algorithm is to maximize some utility function $f(G, \overline{X})$, taking in a graph and the set of assignments and outputting some value in $\mathbb{R}$. For simplicity we assume that the order of the variables in $\overline{X}$ does not affect $f$, so we use set notation instead of vector notation. We somewhat abuse notation, and when assigning a variable we write $\overline{X} \cup \{X_v = \alpha\}$, meaning that any other assignment to $X_v$ is removed from the set $\overline{X}$. We also omit $G$ as a parameter when it is clear from context.

When considering randomized algorithms we assume the algorithm takes in a vector of random bits denoted by $\vec{r}$. This way of representing randomized algorithms is identical to having the algorithm generate random coins, and we use these two definitions interchangeably. The randomized algorithm aims to maximize the expectation of $f$, where the expectation is taken over the random bits of the algorithm.

Max $k$-Cut, Max-DiCut: In this paper we provide fast distributed approximation algorithms for some fundamental problems, which we now define formally. In the Max $k$-Cut problem we wish to divide the vertices into $k$ disjoint sets, such that the weight of edges between different sets is maximized. In the Max-DiCut problem the edges are directed and we wish to divide the vertices into two disjoint sets, denoted $S, T$, such that the weight of edges directed from $S$ to $T$ is maximized.

Max 2-SAT: In the Max 2-SAT problem we are given a set of unique weighted clauses over some set of variables, where each clause contains at most two literals. Our goal is to maximize the weight of satisfied clauses. This problem is more general than the cut problems, so we must define what it means in the distributed context. First, the variables will be node variables as defined before. Second, each node knows all of the clauses it appears in as a literal.

Correlation clustering: We are given an edge-weighted graph $G$, such that each edge is also assigned a value from $\{+, -\}$ (referred to as positive and negative edges). Given some partition, $\mathcal{C}$, of the graph into disjoint clusters, we say that an edge agrees with $\mathcal{C}$ if it is positive and both endpoints are in the same cluster, or it is negative and its endpoints are in different clusters. Otherwise we say it disagrees with $\mathcal{C}$. We aim to find a partition $\mathcal{C}$, using any number of clusters, such that the weight of edges that agree with $\mathcal{C}$ (agreements) is maximized (max-agree), or equivalently the weight of edges that disagree with $\mathcal{C}$ is minimized (min-disagree).

The problem is usually expressed as an LP using edge variables, where each variable indicates whether the nodes are in the same cluster. This allows a solution to use any number of clusters. In this paper we only aim to achieve a 1/2-approximation for the problem. This can be done rather simply without employing the full power of correlation clustering. Specifically, two clusters are enough for our case, as we show that we can deterministically achieve agreements totaling at least half of the total edge weight, which results in the desired approximation ratio.
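The claim that two clusters suffice can be checked with a short expectation calculation. The following is a sketch under stated assumptions: $E^+$ and $E^-$ denote the positive and negative edges, $w(e)$ is the edge weight, and each node independently joins one of two clusters with probability 1/2. A positive edge agrees when its endpoints land in the same cluster, a negative edge agrees when they land in different clusters, and each of these events has probability 1/2.

```latex
\begin{aligned}
\mathbb{E}[\text{agreements}]
  &= \sum_{e \in E^+} w(e) \cdot \Pr[\text{endpoints in the same cluster}]
   + \sum_{e \in E^-} w(e) \cdot \Pr[\text{endpoints in different clusters}] \\
  &= \frac{1}{2} \sum_{e \in E^+} w(e) + \frac{1}{2} \sum_{e \in E^-} w(e)
   = \frac{1}{2} \sum_{e \in E} w(e)
   \;\ge\; \frac{1}{2} \cdot \mathrm{OPT}.
\end{aligned}
```

The last inequality holds because no clustering can collect more agreement weight than the total edge weight.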

Local utility functions: We are interested in a type of utility function which we call a local utility function. Before we continue with the definition, let us define an operator on assignments $\overline{X}$: we define $L_v[\overline{X}]$ to be the set of assignments in $\overline{X}$ to the variables in the 1-hop neighborhood of $v$. For convenience, when we pass $L_v[\overline{X}]$ as a parameter to a function, we assume that the function also receives the 1-hop neighborhood of $v$, which we do not write explicitly. We say that a utility function $f$, as defined above, is a local utility function if for every $v \in V$ there exists a function $g_v$ such that $f(\overline{X} \cup \{X_v = \alpha'\}) - f(\overline{X} \cup \{X_v = \alpha\}) = g_v(L_v[\overline{X}], \alpha', \alpha)$. That is, to compute the change in the utility function which is caused by changing $X_v$ from $\alpha$ to $\alpha'$, we only need to know the immediate neighborhood of $v$, and the assignment to neighboring node variables. We note that for the cut problems considered in this paper the utility functions are indeed local utility functions. This is proven in the following lemma:

The utility functions for Max $k$-Cut, Max-DiCut and max-agree correlation clustering with 2 clusters are local utility functions.

###### Proof.

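The utility function for Max $k$-Cut is given by $f(\overline{X}) = \sum_{e=(v,u) \in E} w(e) \cdot (X_v \oplus X_u)$, where $X_v \oplus X_u = 0$ if $X_v = X_u$ and 1 otherwise. Thus, if we fix some $v$ it holds that

$$
\begin{aligned}
f(\overline{X} \cup \{X_v = \alpha'\}) - f(\overline{X} \cup \{X_v = \alpha\})
&= \sum_{e=(v,u) \in E} w(e) \cdot (\alpha' \oplus X_u) - \sum_{e=(v,u) \in E} w(e) \cdot (\alpha \oplus X_u) \\
&= \sum_{e=(v,u) \in E} w(e) \cdot (\alpha' \oplus X_u - \alpha \oplus X_u) \triangleq g_v(L_v[\overline{X}], \alpha', \alpha)
\end{aligned}
$$

Because the final sum only depends on vertices $u \in N(v)$, the last equality defines the local function equivalent to the difference, and we are done.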

For the problem of Max-DiCut the utility function is given by , and for max-agree correlation clustering with 2 clusters the utility function is given by ($E^+, E^-$ are the positive and negative edges, respectively), and the proof is exactly the same. ∎
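The locality argument in the proof can be checked numerically on a small instance. The sketch below is illustrative only: the names `cut_value` and `local_gain` and the edge-dictionary representation are assumptions made here, not the paper's notation. It computes the change in the cut value once globally and once from the edges incident to the changed vertex, and the two agree.

```python
from typing import Dict, Hashable, Tuple

Node = Hashable
Edges = Dict[Tuple[Node, Node], float]  # undirected weighted edges


def cut_value(edges: Edges, x: Dict[Node, int]) -> float:
    """Global utility f: total weight of edges whose endpoints differ."""
    return sum(w for (u, v), w in edges.items() if x[u] != x[v])


def local_gain(edges: Edges, x: Dict[Node, int], v: Node, new: int, old: int) -> float:
    """g_v: change in f when X_v goes from `old` to `new`.

    Only edges incident to v are inspected; all other edges contribute
    the same amount to both terms of the difference and cancel out.
    """
    gain = 0.0
    for (a, b), w in edges.items():
        if v not in (a, b):
            continue
        u = b if a == v else a
        gain += w * (int(new != x[u]) - int(old != x[u]))
    return gain


if __name__ == "__main__":
    edges = {("a", "b"): 2.0, ("b", "c"): 1.0, ("a", "c"): 4.0, ("c", "d"): 3.0}
    x = {"a": 0, "b": 1, "c": 0, "d": 0}
    moved = dict(x, c=1)  # same assignment with X_c flipped from 0 to 1
    print(cut_value(edges, moved) - cut_value(edges, x))
    print(local_gain(edges, x, "c", new=1, old=0))
```

The same check works for more than two cut sides, since the code only tests whether two values differ.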

Submodular functions: A family of functions that will be of interest in this paper is the family of submodular functions. A function $f: 2^{\Omega} \to \mathbb{R}$ is called a set function, with ground set $\Omega$. It is said to be submodular if for every $S, T \subseteq \Omega$ it holds that $f(S) + f(T) \geq f(S \cup T) + f(S \cap T)$. The functions we are interested in have $V$ as their ground set, thus we remain with our original notation, setting $A = \{0, 1\}$ and having $f$ take in a set of binary assignments as a parameter.
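For small ground sets the submodularity condition can be verified by brute force. The following sketch assumes the standard inequality $f(S) + f(T) \geq f(S \cup T) + f(S \cap T)$ and uses hypothetical helper names (`dicut_value`, `is_submodular`); it confirms that a directed cut function with non-negative weights passes the check, while a simple supermodular function does not.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Hashable, Iterable, Tuple

Node = Hashable
DiEdges = Dict[Tuple[Node, Node], float]  # (tail, head) -> non-negative weight


def dicut_value(edges: DiEdges, s: FrozenSet[Node]) -> float:
    """Weight of edges directed from inside `s` to outside `s`."""
    return sum(w for (u, v), w in edges.items() if u in s and v not in s)


def is_submodular(ground: Iterable[Node], f: Callable[[FrozenSet[Node]], float]) -> bool:
    """Exhaustively test f(S) + f(T) >= f(S | T) + f(S & T) on all subset pairs."""
    items = list(ground)
    subsets = [
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    ]
    tol = 1e-9  # small slack for floating-point sums
    return all(
        f(s) + f(t) >= f(s | t) + f(s & t) - tol for s in subsets for t in subsets
    )


if __name__ == "__main__":
    ground = ["a", "b", "c", "d"]
    edges = {("a", "b"): 2.0, ("b", "c"): 1.0, ("c", "a"): 4.0, ("c", "d"): 3.0}
    print(is_submodular(ground, lambda s: dicut_value(edges, s)))
    print(is_submodular(ground, lambda s: float(len(s) ** 2)))
```

The exhaustive check is exponential in the size of the ground set, so it is only meant as a sanity test on toy instances.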

The method of conditional expectations: Next, we consider the method of conditional expectations. Let $A$ be some set and $f: A^n \to \mathbb{R}$, and let $X = (X_1, \dots, X_n)$ be a vector of random variables taking values in $A$. We wish to be consistent with the previous notation, thus we treat $X$ as a set of assignments. If $E[f(X)] \geq \beta$, then there is an assignment of values $Z = (\alpha_1, \dots, \alpha_n)$ such that $f(Z) \geq \beta$. We describe how to find the vector $Z$. We first note that from the law of total expectation it holds that $E[f(X)] = \sum_{\alpha \in A} E[f(X) \mid X_1 = \alpha] \cdot \Pr[X_1 = \alpha]$, and therefore for at least some $\alpha$ it holds that $E[f(X) \mid X_1 = \alpha] \geq \beta$. We set this value to be $\alpha_1$. We then repeat this process for the rest of the values in $X$, which results in the set $Z$. In order for this method to work we need it to be possible to compute the conditional expectation of $f$. (Footnote: This point is critical, and this computation is not simple in many cases. In our case we also need this computation to be done locally at every node. We apply this technique to Max-Cut, which meets all of these demands.)
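For Max-Cut with two sides the conditional expectation has a simple closed form, which makes the method easy to sketch. The code below is an illustration under assumptions, not the paper's Algorithm 3: the function names and data layout are hypothetical, and edge weights are assumed positive. An edge with an unassigned endpoint is cut with probability 1/2 regardless of the current choice, so maximizing the conditional expectation reduces to placing each vertex opposite the heavier side among its already assigned neighbors. The resulting cut has at least half of the total edge weight for any traversal order.

```python
from itertools import permutations
from typing import Dict, Hashable, Sequence, Tuple

Node = Hashable
Edges = Dict[Tuple[Node, Node], float]  # undirected, positive weights


def cut_value(edges: Edges, side: Dict[Node, int]) -> float:
    """Total weight of edges whose endpoints are on different sides."""
    return sum(w for (u, v), w in edges.items() if side[u] != side[v])


def derandomized_max_cut(edges: Edges, order: Sequence[Node]) -> Dict[Node, int]:
    """Fix one vertex at a time, never decreasing the conditional expectation."""
    side: Dict[Node, int] = {}
    for v in order:
        # weight_to[s] = weight of edges from v to assigned neighbors on side s
        weight_to = [0.0, 0.0]
        for (a, b), w in edges.items():
            if a == v and b in side:
                weight_to[side[b]] += w
            elif b == v and a in side:
                weight_to[side[a]] += w
        # Joining side 0 cuts the edges to side 1, and vice versa.
        side[v] = 0 if weight_to[1] >= weight_to[0] else 1
    return side


if __name__ == "__main__":
    nodes = ["a", "b", "c", "d"]
    edges = {("a", "b"): 2.0, ("b", "c"): 1.0, ("a", "c"): 4.0, ("c", "d"): 3.0}
    half = sum(edges.values()) / 2
    for order in permutations(nodes):
        assert cut_value(edges, derandomized_max_cut(edges, order)) >= half
    print("every order yields a cut of weight at least", half)
```

Note that each decision only looks at the values of neighboring vertices, which is the property that lets the procedure be treated as an orderless-local algorithm.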

Graph coloring: A $c$-coloring for $G=(V,E)$ is defined as a function $\varphi: V \to C$. For simplicity we treat any set $C$ of size $c$ with some ordering as the set of integers $[c]$. This simplifies things as we can always consider , which is very convenient. We say that a coloring is a legal coloring if for all $v, u \in V$ such that $(v,u) \in E$ it holds that $\varphi(v) \neq \varphi(u)$. An important tool in this paper is defective coloring. Let us fix some $c$-coloring function $\varphi$. We define the defect of a vertex to be the number of monochromatic edges it has. Formally, $\mathrm{def}(v) = |\{u \in N(v) \mid \varphi(v) = \varphi(u)\}|$. We call $\varphi$ a $c$-coloring with defect $d$ if for all $v \in V$ it holds that $\mathrm{def}(v) \leq d$. A classic result by Kuhn [28] states that for all $d \in \{1, \dots, \Delta\}$ an $O(\Delta^2/d^2)$-coloring with defect $d$ can be computed deterministically in $O(\log^* n)$ rounds in the CONGEST model.

In this paper we define a new kind of defective coloring which we call a weighted $\epsilon$-defective coloring. Given a positively edge-weighted graph and any coloring, for every vertex $v$ we denote by $E_m(v)$ its monochromatic edges and by $E(v)$ all of its edges. Define its weighted defect as $\sum_{e \in E_m(v)} w(e) / \sum_{e \in E(v)} w(e)$. We aim to find a coloring such that the weighted defect of every $v$ is below $\epsilon$. We show that the algorithm of Kuhn actually computes a weighted $\epsilon$-defective $O(\epsilon^{-2})$-coloring. We state the following theorem (as the analysis is rather similar to the original analysis of Kuhn, the proof is deferred to the appendix):

For any constant $\epsilon$ a weighted $\epsilon$-defective $O(\epsilon^{-2})$-coloring can be computed deterministically in $O(\log^* n)$ rounds in the CONGEST model.
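To make the weighted defect concrete, the following sketch computes it for a given coloring and checks the fractional condition at every vertex. It does not implement the coloring algorithm of Kuhn; it only evaluates a coloring that is already given. The helper names `weighted_defect` and `is_weighted_eps_defective` are hypothetical, and the defect of an isolated vertex is taken to be zero by convention here.

```python
from typing import Dict, Hashable, Iterable, Tuple

Node = Hashable
Edges = Dict[Tuple[Node, Node], float]  # undirected, positive weights


def weighted_defect(edges: Edges, color: Dict[Node, int], v: Node) -> float:
    """Fraction of the edge weight at v that lies on monochromatic edges."""
    total = 0.0
    mono = 0.0
    for (a, b), w in edges.items():
        if v not in (a, b):
            continue
        u = b if a == v else a
        total += w
        if color[u] == color[v]:
            mono += w
    return mono / total if total > 0 else 0.0  # isolated vertex: no defect


def is_weighted_eps_defective(
    nodes: Iterable[Node], edges: Edges, color: Dict[Node, int], eps: float
) -> bool:
    """True if every vertex has weighted defect at most eps."""
    return all(weighted_defect(edges, color, v) <= eps for v in nodes)


if __name__ == "__main__":
    nodes = ["a", "b", "c"]
    edges = {("a", "b"): 1.0, ("b", "c"): 3.0}
    color = {"a": 0, "b": 0, "c": 1}
    print({v: weighted_defect(edges, color, v) for v in nodes})
    print(is_weighted_eps_defective(nodes, edges, color, eps=0.5))
```

In the example, vertex `b` is within the bound but vertex `a` is not, since its only edge is monochromatic; a single light monochromatic edge can violate the fractional condition at a low-weight vertex.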

## 3 Orderless-local algorithms

Next we turn our attention to a large family of (potentially randomized) greedy algorithms. We limit ourselves to graph algorithms such that every node $v$ has a variable $X_v$ taking values in some set $A$. We aim to maximize some global utility function $f$. We focus on a class of algorithms we call orderless-local algorithms. These are greedy-like algorithms which may traverse the vertices in any order, and at each step decide upon a value for $X_v$. This decision is local, meaning that it only depends on the 1-hop topology of $v$ and the values of neighboring variables. The decision may be random, but each variable has its own random bits, keeping the decision process local.

The code for a generic algorithm of this family is given in Algorithm 1. The algorithm first initiates the vertex variables. Next it traverses the variables in some order $\pi$. Each $X_v$ is assigned a value according to some function Decide, which only depends on $L_v[\overline{X}]$ at the time of the assignment and on some random bits $r_v$ which are only used to set the value of that variable. Finally the assignment to the variables is returned. We are guaranteed that the expected value of $f$ is at least $\beta(G)$ for any, potentially adversarial, ordering $\pi$ of the variables. Formally, $E_{\vec{r}}[f(\mathrm{OL}(G, \vec{r}, \pi))] \geq \beta(G)$.

We show that this family of algorithms can be easily distributed using coloring, such that the running time of the distributed version depends on the number of colors. The distributed version, OLDist, is presented as Algorithm 2. The variables are all initiated as in the sequential version, and then the color classes are executed sequentially, while in each color class the nodes execute simultaneously and send the newly assigned value to all neighbors. Decide does not communicate with the neighbors, so the algorithm finishes in $O(c)$ rounds.

It is easy to see that, given the same randomness, both the sequential and distributed algorithms output the same result. This is because all decisions of the distributed algorithm only depend on the 1-hop environment of a vertex, and we are provided with a legal coloring. Thus, one round of the distributed algorithm is equivalent to many steps of the sequential algorithm. We prove the following lemma:

For any graph $G$ with a legal coloring $\varphi$, there exists an order $\pi$ on the variables such that $\mathrm{OL}(G, \vec{r}, \pi) = \mathrm{OLDist}(G, \vec{r}, \varphi)$ for any $\vec{r}$.

###### Proof.

We prove the claim by induction on the executions of color classes by the distributed algorithm. We note that the execution of the distributed algorithm defines an order on the variables. Let us consider the $i$-th color class, and assign some arbitrary order to the variables within the class. The ordering $\pi$ we analyze for the sequential algorithm lists the color classes one after another, in the order in which the distributed algorithm executes them, with the variables of each class ordered as above. Now both the distributed and sequential algorithms follow the same order of color classes, thus we allow ourselves to talk about the sequential algorithm finishing an execution of a color class.

Let $\overline{Y}_i$ be the assignments to all variables of the distributed algorithm after the $i$-th color class finishes execution, and let $\overline{Z}_i$ be the assignments made by the sequential algorithm following $\pi$ until all variables in the $i$-th color class are assigned. Both algorithms initiate the variables identically, so it holds that $\overline{Y}_0 = \overline{Z}_0$. Assume that it holds that $\overline{Y}_{i-1} = \overline{Z}_{i-1}$. The coloring is legal, so for any $X_v, X_u$ such that $\varphi(v) = \varphi(u) = i$ it holds that $(v,u) \notin E$. Thus, when assigning $X_v$, its neighborhood is not affected by any other assignments done in the color class. The randomness is identical for both algorithms, and by the induction hypothesis all assignments up until this color class were identical. Thus, for all variables in this color class, Decide will be executed with the same parameters for both the distributed and sequential algorithms, and all assignments will be identical. ∎

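Finally we show that for any graph $G$ with a legal coloring $\varphi$, it holds that $E_{\vec{r}}[f(OLDist(G,\vec{r},\varphi))] \ge \beta(G)$. We know from Lemma 3 that for any coloring $\varphi$ there exists an ordering $\pi$ such that $OLDist(G,\vec{r},\varphi) = OL(G,\vec{r},\pi)$ for any $\vec{r}$. The proof is direct from here: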

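$$E_{\vec{r}}\left[f(OLDist(G,\vec{r},\varphi))\right] = \sum_{\vec{r}} \Pr[\vec{r}]\, f(OLDist(G,\vec{r},\varphi)) = \sum_{\vec{r}} \Pr[\vec{r}]\, f(OL(G,\vec{r},\pi)) = E_{\vec{r}}\left[f(OL(G,\vec{r},\pi))\right] \ge \beta(G)$$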

We conclude that any orderless-local algorithm can be distributed, achieving the same performance guarantee on $f$, and requiring $O(c)$ communication rounds to finish, given a legal $c$-coloring. We state the following theorem:

Given some utility function $f$, any sequential orderless-local algorithm for which it holds that $E_{\vec{r}}[f(OL(G,\vec{r},\pi))] \ge \beta(G)$ can be converted into a distributed algorithm for which it holds that $E_{\vec{r}}[f(OLDist(G,\vec{r},\varphi))] \ge \beta(G)$, where $\varphi$ is a legal $c$-coloring of the graph. The running time of the distributed algorithm is $O(c)$ communication rounds.

### 3.1 Distributed derandomization

We consider the method of conditional expectations in the distributed case for some local utility function , as defined in the preliminaries. Assume that the value of every is set independently at random according to some distribution on which depends only on the 1-hop neighborhood of . We are guaranteed that . Thus in the sequential setting we may use the method of conditional expectations to compute a deterministic assignment to the variables with the same guarantee. We show that because is a local utility function, the method of conditional expectations applied on is an orderless-local algorithm, and thus can be distributed.

Initially all variables are initialized to a special value meaning that the variable is unassigned. Let $Y$ be some partial assignment to the variables. The method of conditional expectations goes over the variables in any order, and in each iteration sets $X_v = \arg\max_{\alpha} E[f(\bar{X}) \mid Y, X_v = \alpha]$. This is equivalent to $\arg\max_{\alpha} \left(E[f(\bar{X}) \mid Y, X_v = \alpha] - E[f(\bar{X}) \mid Y]\right)$, as the subtracted term is just a constant. With this in mind, we present the pseudo code for the method of conditional expectations in Algorithm 3.
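As a concrete illustration, consider weighted Max-Cut where every node picks one of two sides uniformly at random. An edge with an unassigned endpoint contributes half its weight to the conditional expectation whatever the current node chooses, so only edges to already assigned neighbors matter. The following is a minimal sketch under these assumptions, not the paper's Algorithm 3; the names `derandomized_max_cut` and `cut_weight` are hypothetical and the graph is assumed to have no self-loops.

```python
from typing import Dict, Hashable, List, Optional, Tuple

Node = Hashable
WeightedEdge = Tuple[Node, Node, float]


def derandomized_max_cut(nodes: List[Node],
                         edges: List[WeightedEdge]) -> Dict[Node, int]:
    """Method of conditional expectations for uniform random 2-sided cuts."""
    side: Dict[Node, Optional[int]] = {v: None for v in nodes}
    incident: Dict[Node, List[Tuple[Node, float]]] = {v: [] for v in nodes}
    for u, v, w in edges:
        incident[u].append((v, w))
        incident[v].append((u, w))
    for v in nodes:  # any order gives the same guarantee
        # gain[s] = weight of edges to assigned neighbors that are cut if v takes side s
        gain = {0: 0.0, 1: 0.0}
        for u, w in incident[v]:
            if side[u] is not None:
                gain[1 - side[u]] += w
        # Choosing the larger gain maximizes the conditional expectation.
        side[v] = 0 if gain[0] >= gain[1] else 1
    return side


def cut_weight(edges: List[WeightedEdge], side: Dict[Node, int]) -> float:
    """Total weight of edges whose endpoints lie on different sides."""
    return sum(w for u, v, w in edges if side[u] != side[v])
```

Each decision reads only the values of the node's neighbors, and the resulting cut has weight at least half of the total edge weight, which is the expected value of the random assignment.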

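To show that Algorithm 3 is an orderless-local algorithm we only need to show that $E[f(\bar{X}) \mid Y, X_v = \alpha_v] - E[f(\bar{X}) \mid Y]$ can be computed locally for any $v$. We state the following lemma, followed by the main theorem for this section.

The value $E[f(\bar{X}) \mid Y, X_v = \alpha_v] - E[f(\bar{X}) \mid Y]$ can be computed locally.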

###### Proof.

It holds that:

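$$\begin{aligned}
E[f(\bar{X}) \mid Y, X_v = \alpha_v] - E[f(\bar{X}) \mid Y]
&= \sum_{\alpha \in A} E[f(\bar{X}) \mid Y, X_v = \alpha_v] \Pr[X_v = \alpha] - \sum_{\alpha \in A} E[f(\bar{X}) \mid Y, X_v = \alpha] \Pr[X_v = \alpha] \\
&= \sum_{\alpha \in A} \Pr[X_v = \alpha] \left( E[f(\bar{X}) \mid Y, X_v = \alpha_v] - E[f(\bar{X}) \mid Y, X_v = \alpha] \right)
\end{aligned}$$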

Here the first equality is due to the law of total expectation and the fact that $\sum_{\alpha \in A} \Pr[X_v = \alpha] = 1$. The probability of assigning $X_v$ to some value can be computed locally, so we are only left with the difference between the expectations. To show that this is indeed a local quantity we use the definition of expectation as a weighted summation over all possible assignments to unassigned variables. Let $U_v$ be the set of all possible assignments to unassigned variables in the 1-hop neighborhood of $v$ and let $U$ be the set of all possible assignments to the rest of the unassigned variables. It holds that:

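$$\begin{aligned}
E[f(\bar{X}) \mid Y, X_v = \alpha_v] - E[f(\bar{X}) \mid Y, X_v = \alpha]
&= \sum_{Z_v \in U_v} \sum_{Z \in U} \Pr[Z_v] \Pr[Z] \left( f(\bar{X} \cup Z_v \cup Z \cup \{X_v = \alpha_v\}) - f(\bar{X} \cup Z_v \cup Z \cup \{X_v = \alpha\}) \right) \\
&= \sum_{Z_v \in U_v} \sum_{Z \in U} \Pr[Z_v] \Pr[Z]\, g_v(L_v[\bar{X} \cup Z_v \cup Z], \alpha, \alpha_v) \\
&= \sum_{Z_v \in U_v} \sum_{Z \in U} \Pr[Z_v] \Pr[Z]\, g_v(L_v[\bar{X} \cup Z_v], \alpha, \alpha_v) \\
&= \sum_{Z_v \in U_v} \Pr[Z_v]\, g_v(L_v[\bar{X} \cup Z_v], \alpha, \alpha_v),
\end{aligned}$$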

where in the first equality we use the definition of expectation and the fact that the variables are set independently of each other. Then we use the definition of a local utility function, and finally the dependence on $Z$ disappears due to the law of total probability. The final sum can be computed locally, as the probabilities for assigning variables in the 1-hop neighborhood of $v$ are known and $g_v$ is local. ∎

Let $G$ be any graph and $f$ a local utility function for which it holds that , where the random assignments to the variables are independent of each other, and depend only on the immediate neighborhood of the node. There exists a distributed algorithm achieving the same value as the expected value for $f$, running in $O(c)$ communication rounds in the CONGEST model, given a legal $c$-coloring.

### 3.2 Submodular Maximization

In this section we consider the problem of unconstrained submodular function maximization. Given a submodular function (as defined in Section 2), we aim to find an input s.t the function is maximized. There are no constraints on the input set we pass to the function, hence it is 'unconstrained'. We are interested in finding an approximate solution to the problem. To this end, we consider both the deterministic and randomized algorithms of [9], achieving 1/3 and 1/2 approximation ratios for unconstrained submodular maximization. We show that both can be expressed as orderless-local algorithms for any local utility function. As the deterministic and randomized algorithms of [9] are almost identical, we focus on the randomized algorithm achieving a 1/2-approximation in expectation (Algorithm 5), as it is a bit more involved (the deterministic algorithm appears as Algorithm 4). The algorithms of [9] are defined for any submodular function, but as we are interested only in the case where the ground set is , we will present them as such.

The algorithm maintains two variable assignments , initially , . It iterates over the variables in any order, and at each iteration it considers two nonnegative quantities . These quantities represent the gain of either setting in or setting in . Next a coin is flipped with probability , if we set . If we get heads we set in and otherwise we set it to 0 in . When the algorithm ends it holds that , and this is our solution. The deterministic algorithm is almost identical, except that it allows to take negative values, and instead of flipping a coin it makes the decision greedily by comparing .
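The two variants can be sketched in a few lines, written over sets rather than over the tuple-valued node variables used in this paper. This is a minimal sketch in the spirit of the double-greedy algorithms of [9], not a reproduction of Algorithms 4 and 5: the names `double_greedy` and `dicut_value` are hypothetical, and the directed cut function is used only as an example of a nonnegative submodular function.

```python
import random
from typing import Callable, Hashable, Iterable, List, Optional, Tuple

Arc = Tuple[Hashable, Hashable, float]


def double_greedy(ground: Iterable[Hashable],
                  f: Callable[[frozenset], float],
                  rng: Optional[random.Random] = None) -> frozenset:
    """Deterministic variant if rng is None, randomized variant otherwise."""
    order = list(ground)
    x = frozenset()        # grows from the empty set
    y = frozenset(order)   # shrinks from the full ground set
    for v in order:
        a = f(x | {v}) - f(x)  # gain of adding v to x
        b = f(y - {v}) - f(y)  # gain of removing v from y
        if rng is None:
            take = a >= b      # greedy comparison, gains may be negative
        else:
            a, b = max(a, 0.0), max(b, 0.0)
            p = 1.0 if a + b == 0 else a / (a + b)
            take = rng.random() < p
        if take:
            x = x | {v}
        else:
            y = y - {v}
    return x  # at this point x == y


def dicut_value(arcs: List[Arc]) -> Callable[[frozenset], float]:
    """Weight of arcs leaving the set s; nonnegative and submodular."""
    def f(s: frozenset) -> float:
        return sum(w for u, v, w in arcs if u in s and v not in s)
    return f
```

For the directed cut function, both gains for a node depend only on the arcs incident to it, which is the locality that the orderless-local formulation relies on.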

We first note that the algorithm does not directly fit into our mold, as each vertex has two variables. We can overcome this by taking to be a binary tuple: the first coordinate stores its value for , and the other for . Initially it holds that , and our final goal function will only take the first coordinate of the variable. We note that because is a local utility function the values can be computed locally. This follows directly from the definition of a local utility function, as we are interested in the change in caused by flipping a single variable. Now we may rewrite the algorithm as an orderless-local algorithm; the pseudocode appears as Algorithm 6.

Using Theorem 3 we state our main result:

For any graph and a local unconstrained submodular function with as its ground set, there exist a randomized distributed 1/2-approximation algorithm and a deterministic 1/3-approximation algorithm running in $O(c)$ communication rounds in the CONGEST model, given a legal $c$-coloring.

### 3.3 Fast approximations for cut functions

Using the results of the previous sections we can provide fast and simple approximation algorithms for Max-DiCut and Max $k$-Cut. Lemma 2 guarantees that the utility functions for these problems are indeed local utility functions. For Max-DiCut we use the algorithms of Buchbinder et al., as this is an unconstrained submodular function. For Max $k$-Cut, each node choosing a side uniformly at random achieves a $(1-1/k)$-approximation, thus we use the results of Section 3.1. Theorem 3.2 and Theorem 3.1 immediately guarantee distributed algorithms running in $O(c)$ communication rounds given a legal $c$-coloring.

Denote by one of the cut algorithms guaranteed by Theorem 3.2 or Theorem 3.1. We present two algorithms: approxCutDet, a deterministic algorithm to be used when is deterministic (Algorithm 8), and approxCutRand, a randomized algorithm (Algorithm 9) for the case when is randomized. approxCutDet works by coloring the graph using a weighted $\epsilon$-defective coloring and then defining a new graph by dropping all of the monochromatic edges. This means that the coloring is a legal coloring for . Finally we call one of the deterministic cut functions. approxCutRand is identical, apart from the fact that nodes choose a color uniformly at random from .
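The preprocessing step of the randomized variant is simple enough to sketch directly. The snippet below is an illustration only, not the paper's Algorithm 9: the function name `random_color_and_prune` and the parameter `num_colors` are hypothetical, and the number of colors that the paper uses is not assumed here.

```python
import random
from typing import Dict, Hashable, List, Tuple

Node = Hashable
WeightedEdge = Tuple[Node, Node, float]


def random_color_and_prune(nodes: List[Node], edges: List[WeightedEdge],
                           num_colors: int, rng: random.Random
                           ) -> Tuple[Dict[Node, int], List[WeightedEdge]]:
    """Color nodes uniformly at random and drop the monochromatic edges."""
    # Every node picks its color independently, with no communication.
    color = {v: rng.randrange(num_colors) for v in nodes}
    # Keep only edges whose endpoints received different colors.
    kept = [(u, v, w) for u, v, w in edges if color[u] != color[v]]
    return color, kept
```

Each edge is monochromatic with probability `1 / num_colors`, so in expectation a `1 / num_colors` fraction of the total weight is dropped, and the random coloring is legal for the graph that remains.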

For approxCutDet, the running time of the coloring is $O(\log^* n)$ rounds, returning a weighted $\epsilon$-defective -coloring. The running time of the cut algorithms is the number of colors, thus the total running time of the algorithm is $O(\log^* n)$ rounds. Using the same reasoning, the running time of approxCutRand is $O(\epsilon^{-1})$. It is only left to prove the approximation ratio. We prove the following lemma:

Let be any graph, and let be a graph resulting from removing any subset of edges from of total weight at most . Then for any constant , any -approximation for Max-DiCut or Max $k$-Cut for is a -approximation for .

###### Proof.

Let be the size of optimal solutions for . It holds that , as any solution for is also a solution for whose value differs by at most (the weight of discarded edges). When every node is assigned a cut side uniformly at random, the expected cut weight is at least for Max-DiCut and Max $k$-Cut. Using the probabilistic method this implies that . Using all of the above we can say that given a -approximate solution for it holds that:

Lemma 9 immediately guarantees the approximation ratio for the deterministic algorithm. As for the randomized algorithm, let the random variable be the fraction of edges removed, let be the approximation ratio guaranteed by one of the cut algorithms and let be the approximation ratio achieved by approxCutRand. We know that . Applying the law of total expectation we get that . We state our main theorems for this section.

There exists a deterministic $(1-1/k-\epsilon)$-approximation algorithm for Weighted Max $k$-Cut running in $O(\log^* n)$ communication rounds in the CONGEST model.

There exists a deterministic $(1/3-\epsilon)$-approximation algorithm for Weighted Max-DiCut running in $O(\log^* n)$ communication rounds in the CONGEST model.

There exists a randomized distributed expected $(1/2-\epsilon)$-approximation for Weighted Max-DiCut running in $O(\epsilon^{-1})$ communication rounds in the CONGEST model.

#### Correlation clustering

We note that the same techniques used for Max-Cut work directly for max-agree correlation clustering on general graphs. Specifically, if we divide the nodes into two clusters, s.t each node selects a cluster uniformly at random, each edge agrees with the clustering with probability exactly 1/2, thus the expected value of the clustering is , which is a 1/2-approximation. The above can be derandomized in exactly the same manner as Max-Cut, meaning this is an orderless-local algorithm. Finally, we apply the weighted $\epsilon$-defective coloring algorithm twice (note that we ignore the sign of the edge), discard all monochromatic edges and execute the deterministic algorithm guaranteed by Theorem 3.1 with a legal coloring. Because there must exist a clustering which has a value of at least , a lemma identical to Lemma 9 can be proved and hence we are done. We state the following theorem:

There exists a deterministic $(1/2-\epsilon)$-approximation algorithm for weighted max-agree correlation clustering on general graphs, running in $O(\log^* n)$ communication rounds in the CONGEST model.

## References

• [1] Kook Jin Ahn, Graham Cormode, Sudipto Guha, Andrew McGregor, and Anthony Wirth. Correlation clustering in data streams. In ICML, volume 37 of JMLR Workshop and Conference Proceedings, pages 2237–2246. JMLR.org, 2015.
• [2] Nir Ailon, Moses Charikar, and Alantha Newman. Aggregating inconsistent information: Ranking and clustering. J. ACM, 55(5):23:1–23:27, 2008.
• [3] Per Austrin. Balanced max 2-sat might not be the hardest. In STOC, pages 189–197. ACM, 2007.
• [4] Nikhil Bansal, Avrim Blum, and Shuchi Chawla. Correlation clustering. In FOCS, page 238. IEEE Computer Society, 2002.
• [5] Reuven Bar-Yehuda, Keren Censor-Hillel, Mohsen Ghaffari, and Gregory Schwartzman. Distributed approximation of maximum independent set and maximum matching. In PODC, pages 165–174. ACM, 2017.
• [6] Reuven Bar-Yehuda, Keren Censor-Hillel, and Gregory Schwartzman. A distributed (2 + ε)-approximation for vertex cover in O(log Δ / ε log log Δ) rounds. J. ACM, 64(3):23:1–23:11, 2017.
• [7] Surender Baswana and Sandeep Sen. A simple and linear time randomized algorithm for computing sparse spanners in weighted graphs. Random Struct. Algorithms, 30(4):532–563, 2007.
• [8] Allan Borodin, Morten N. Nielsen, and Charles Rackoff. (incremental) priority algorithms. In SODA, pages 752–761. ACM/SIAM, 2002.
• [9] Niv Buchbinder, Moran Feldman, Joseph Naor, and Roy Schwartz. A tight linear time (1/2)-approximation for unconstrained submodular maximization. SIAM J. Comput., 44(5):1384–1402, 2015.
• [10] Keren Censor-Hillel, Elad Haramaty, and Zohar S. Karnin. Optimal dynamic distributed MIS. In PODC, pages 217–226. ACM, 2016.
• [11] Keren Censor-Hillel, Rina Levy, and Hadas Shachnai. Fast distributed approximation for max-cut. 10718:41–56, 2017.
• [12] Moses Charikar, Venkatesan Guruswami, and Anthony Wirth. Clustering with qualitative information. J. Comput. Syst. Sci., 71(3):360–383, 2005.
• [13] Shuchi Chawla, Konstantin Makarychev, Tselil Schramm, and Grigory Yaroslavtsev. Near optimal LP rounding algorithm for correlation clustering on complete and complete k-partite graphs. In STOC, pages 219–228. ACM, 2015.
• [14] Erik D. Demaine, Dotan Emanuel, Amos Fiat, and Nicole Immorlica. Correlation clustering in general weighted graphs. Theor. Comput. Sci., 361(2-3):172–187, 2006.
• [15] Uriel Feige and Michel X. Goemans. Approximating the value of two prover proof systems, with applications to MAX 2SAT and MAX DICUT. In ISTCS, pages 182–189. IEEE Computer Society, 1995.
• [16] Manuela Fischer, Mohsen Ghaffari, and Fabian Kuhn. Deterministic distributed edge-coloring via hypergraph maximal matching. In FOCS, pages 180–191. IEEE Computer Society, 2017.
• [17] Robert G. Gallager, Pierre A. Humblet, and Philip M. Spira. A distributed algorithm for minimum-weight spanning trees. ACM Trans. Program. Lang. Syst., 5(1):66–77, 1983.
• [18] M. R. Garey, David S. Johnson, and Larry J. Stockmeyer. Some simplified np-complete graph problems. Theor. Comput. Sci., 1(3):237–267, 1976.
• [19] Mohsen Ghaffari, David G. Harris, and Fabian Kuhn. On derandomizing local distributed algorithms. CoRR, abs/1711.02194, 2017.
• [20] Mohsen Ghaffari, Fabian Kuhn, and Yannic Maus. On the complexity of local distributed graph problems. In STOC, pages 784–797. ACM, 2017.
• [21] Ioannis Giotis and Venkatesan Guruswami. Correlation clustering with a fixed number of clusters. Theory of Computing, 2(13):249–266, 2006.
• [22] Michel X. Goemans and David P. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM, 42(6):1115–1145, 1995.
• [23] Johan Håstad. Some optimal inapproximability results. J. ACM, 48(4):798–859, 2001.
• [24] Juho Hirvonen, Joel Rybicki, Stefan Schmid, and Jukka Suomela. Large cuts with local algorithms on triangle-free graphs. CoRR, abs/1402.2543, 2014.
• [25] Satyen Kale and C. Seshadhri. Combinatorial approximation algorithms for maxcut using random walks. In ICS, pages 367–388. Tsinghua University Press, 2011.
• [26] Richard M. Karp. Reducibility among combinatorial problems. In Complexity of Computer Computations, The IBM Research Symposia Series, pages 85–103. Plenum Press, New York, 1972.
• [27] Subhash Khot, Guy Kindler, Elchanan Mossel, and Ryan O’Donnell. Optimal inapproximability results for MAX-CUT and other 2-variable csps? SIAM J. Comput., 37(1):319–357, 2007.
• [28] Fabian Kuhn. Weak graph colorings: distributed algorithms and applications. In SPAA, pages 138–144. ACM, 2009.
• [29] Reut Levi and Moti Medina. A (centralized) local guide. Bulletin of EATCS, 2(122), 2017.
• [30] Michael Lewin, Dror Livnat, and Uri Zwick. Improved rounding techniques for the MAX 2-sat and MAX DI-CUT problems. In IPCO, volume 2337 of Lecture Notes in Computer Science, pages 67–82. Springer, 2002.
• [31] Nathan Linial. Locality in distributed graph algorithms. SIAM J. Comput., 21(1):193–201, 1992.
• [32] Shiro Matuura and Tomomi Matsui. 0.863-approximation algorithm for MAX DICUT. In RANDOM-APPROX, volume 2129 of Lecture Notes in Computer Science, pages 138–146. Springer, 2001.
• [33] Matthias Poloczek, Georg Schnitger, David P. Williamson, and Anke van Zuylen. Greedy algorithms for the maximum satisfiability problem: Simple algorithms and inapproximability bounds. SIAM J. Comput., 46(3):1029–1061, 2017.
• [34] Ronitt Rubinfeld, Gil Tamir, Shai Vardi, and Ning Xie. Fast local computation algorithms. In ICS, pages 223–238. Tsinghua University Press, 2011.
• [35] Chaitanya Swamy. Correlation clustering: maximizing agreements via semidefinite programming. In SODA, pages 526–527. SIAM, 2004.
• [36] Luca Trevisan. Max cut and the smallest eigenvalue. SIAM J. Comput., 41(6):1769–1786, 2012.
• [37] Luca Trevisan, Gregory B. Sorkin, Madhu Sudan, and David P. Williamson. Gadgets, approximation, and linear programming. SIAM J. Comput., 29(6):2074–2097, 2000.

## Appendix A Extending Kuhn’s algorithm for ϵ-defective coloring

We aim to extend the algorithm of Kuhn [28] to the case of a weighted $\epsilon$-defective coloring. Given a positively edge weighted graph and any coloring, for every vertex we denote by its monochromatic and bi-chromatic edges respectively. Define its weighted defect as . We aim to find a coloring s.t the defect for every is below .

We show that Kuhn’s algorithm can be adapted to this problem.

#### The algorithm (Algorithm 1)

Given some coloring of the graph, each iteration of Kuhn's algorithm consists of assigning some unique function to every color (we abuse notation and also denote by the function assigned to ). Then every node iterates over all and picks s.t is minimized. We note that the family of functions used is the family of polynomials of degree at most over some field. Finally the vertex is assigned the color .
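One iteration of this recoloring step can be sketched as follows. This is a minimal sketch under explicit assumptions and not the paper's Algorithm 1: the field is taken to be the integers modulo a prime `q`, a color is mapped to a polynomial through its base-`q` digits, the new color is encoded as the pair (chosen point, polynomial value), and the names `poly_for_color`, `evaluate` and `recolor_once` are hypothetical.

```python
from typing import Dict, Hashable, List, Tuple

Node = Hashable
Adjacency = Dict[Node, List[Tuple[Node, int]]]  # v -> [(neighbor, weight), ...]


def poly_for_color(color: int, q: int, degree: int) -> List[int]:
    """Base-q digits of the color, read as polynomial coefficients mod q."""
    coeffs = []
    for _ in range(degree + 1):
        coeffs.append(color % q)
        color //= q
    return coeffs


def evaluate(coeffs: List[int], x: int, q: int) -> int:
    """Horner evaluation of the polynomial at x, modulo q."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % q
    return result


def recolor_once(adj: Adjacency, color: Dict[Node, int],
                 q: int, degree: int) -> Dict[Node, int]:
    """One recoloring round; assumes q is prime and colors < q ** (degree + 1)."""
    poly = {v: poly_for_color(color[v], q, degree) for v in adj}
    new_color = {}
    for v in adj:
        def conflict_weight(alpha: int) -> int:
            # Weight of bi-chromatic edges whose endpoints' polynomials agree at alpha.
            return sum(w for u, w in adj[v]
                       if color[u] != color[v]
                       and evaluate(poly[u], alpha, q) == evaluate(poly[v], alpha, q))
        alpha = min(range(q), key=conflict_weight)
        new_color[v] = alpha * q + evaluate(poly[v], alpha, q)
    return new_color
```

Two distinct polynomials of degree at most `degree` agree on at most `degree` points, so averaging over the `q` candidate points shows that the chosen point turns at most a `degree / q` fraction of a node's bi-chromatic weight monochromatic, while the number of colors drops to at most `q * q`.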

#### Analysis:

We state a lemma analogous to Lemma 4.1 in [28]. We show that it is possible to assign a color to s.t the defect of the vertex increases by at most .

Assume we are given an -coloring of . For a value let the functions assigned to the colors be such that for any two colors , the functions intersect for at most values and s.t that then the increase in the weighted defect of every vertex of the new coloring computed by the algorithm is at most .

###### Proof.

We note that in the proof of Lemma 4.1 in [28], it is shown that if there are two neighboring nodes with different colors that choose values s.t then after applying Algorithm 1, the edge will be bi-chromatic.

We show that for every node there exists some value s.t the increase in the weighted defect is at most . Fix some node and assume by contradiction that this is not the case, thus for every it holds that the increase in weight of monochromatic edges is greater than . Let us count the total weight of monochromatic edges for all values of in two different ways:

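$$k \sum_{e \in E_b(v)} w(e) \ge \sum_{e=(u,v) \in E_b(v)} \sum_{\alpha \in A} w(e) \cdot \chi_{\phi_v(\alpha) = \phi_u(\alpha)} = \sum_{\alpha \in A} \sum_{e=(u,v) \in E_b(v)} w(e) \cdot \chi_{\phi_v(\alpha) = \phi_u(\alpha)} > |A| \epsilon \sum_{e \in E_b(v)} w(e)$$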

Thus, we get that which is a contradiction.

Next the construction of such a family of functions remains the same using polynomials. Specifically, restating Theorem 4.6 in [28] using our parameters we get that:

Assume we are given an coloring of the graph. There exist an explicit function for and a constant s.t Algorithm 1 computes a coloring s.t the weighted defect of every vertex increases by at most .

Finally we prove a theorem equivalent to Theorem 4.9 in [28] using our parameters. The proof is almost identical to the original one, and we present it here for completeness.

For any constant a weighted $\epsilon$-defective -coloring can be computed deterministically in rounds in the CONGEST model.

###### Proof.

Let be the coloring induced by the unique identifiers of the vertices. Define , where is the smallest positive integer s.t . We run Algorithm 1 times with parameters in the -th iteration.

For all values of this implies that:

$$8\sqrt{C_D}/\epsilon_i = 8\sqrt{C_D}\,\epsilon_0^{-1} 2^{T-i} \le 2^{T-i} \ln^{(T-1)} M \le \ln^{(i-1)} M, \qquad (1)$$

where the first inequality is due to the definition of (that is, for the inequality holds in the other direction), and the second is due to the fact that for all it holds that .

Using induction we show that for all we have that

$$M_i \le 16 C_D \left( \epsilon_i^{-1} \ln^{(i)} M \right)^2.$$

For the claim holds by Theorem A (and because ). For the remaining values of , due to Theorem A it holds that . Thus, it is enough to show that . It holds that

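$$\log_{1/\epsilon_i} M_{i-1} \overset{(1)}{\le} \ln M_{i-1} \le \ln 16 C_D \left( \epsilon_{i-1}^{-1} \ln^{(i-1)} M \right)^2 = \ln 16 C_D \left( 2 \epsilon_i^{-1} \ln^{(i-1)} M \right)^2 \overset{(2)}{\le} \ln \left( \ln^{(i-1)} M \right)^4 \le 4 \ln^{(i)} M,$$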

where the first inequality is due to the fact that and the third is due to inequality (1). Finally plugging in we get that: . As the weighted -defect for every vertex is bounded by , this completes the proof. ∎

## Appendix B Max 2-SAT

In this section we consider the problem of Max 2-SAT. We are given a set of weighted clauses over the set of node variables, such that each clause contains at most two literals. We wish to find an assignment maximizing the weight of satisfied clauses. As before, we are interested in adapting a sequential algorithm to the distributed setting. The algorithm we shall adapt is the sequential algorithm presented in [33] which achieves a 3/4-approximation in expectation for the problem of weighted Max-SAT. It is based on the results of [9], thus it is almost identical to Algorithm 5. Before presenting the algorithm we need some preliminary definitions.

We allow the node variables to take on values in where means that the value to has not yet been assigned. We define two utility functions, the first, counts the weight of clauses satisfied given the assignment, and the second