# Coresets Meet EDCS: Algorithms for Matching and Vertex Cover on Massive Graphs

Randomized composable coresets were introduced recently as an effective technique for solving matching and vertex cover problems in various models of computation. In this technique, one partitions the edges of an input graph randomly into multiple pieces, compresses each piece into a smaller subgraph, namely a coreset, and solves the problem on the union of these coresets to find the solution. By designing small-size randomized composable coresets, one can obtain efficient algorithms, in a black-box way, in multiple computational models including streaming, distributed communication, and the massively parallel computation (MPC) model. We develop randomized composable coresets of size O(n) that for any constant ε > 0, give a (3/2+ε)-approximation to matching and a (3+ε)-approximation to vertex cover. Our coresets improve upon the previously best approximation ratio of O(1) for matching and O(log n) for vertex cover. Most notably, our result for matching goes beyond a 2-approximation, which is a natural barrier for maximum matching in many models of computation. Furthermore, inspired by the recent work of Czumaj et al. (arXiv 2017), we study algorithms for matching and vertex cover in the MPC model with only O(n) memory per machine. Building on our coreset constructions, we develop parallel algorithms that give a (1+ε)-approximation to matching and an O(1)-approximation to vertex cover in only O_ε(log log n) MPC rounds and O(n) memory per machine. A key technical ingredient of our paper is a novel application of edge degree constrained subgraphs (EDCS). At the heart of our proofs are new structural properties of EDCS that identify these subgraphs as sparse certificates for large matchings and small vertex covers which are quite robust to sampling and composition.


## 1 Introduction

As massive graphs become more prevalent, there is a rapidly growing need for scalable algorithms that solve classical graph problems on large datasets. The salient challenge is that the entire input graph is orders of magnitude larger than the amount of storage on a single processor. For massive inputs, several different computational models have been introduced, each focusing on certain additional resources needed to solve large-scale problems. One example is the streaming model, in which algorithms are allowed to make a single (or a few) passes over the input graph and the target resource is the space being used. Another example is the distributed communication model in which the input is partitioned across multiple parties that can communicate with each other and the resource of interest is the amount of communication between the parties. Yet another example is the massively parallel computation (MPC) model that is a common abstraction of MapReduce-style computation (see Section 2.1 for a definition). The target resources here are the number of rounds of computation and the local storage on each machine.

The three models above, along with their seemingly different target resources, turn out to be closely related. As a result, researchers have focused on designing general algorithmic techniques that are applicable across a wide range of settings. Popular examples of such techniques are linear sketches (see, e.g., [5, 6, 53, 13, 29, 28, 54, 26, 62]) and composable coresets (see, e.g., [14, 15, 17, 49, 65, 64]; see also [61], Section 2.2, for natural composable coresets for connectivity and cut sparsifiers). The main idea behind both these approaches is to partition the data into smaller parts, compute a small-size representative summary of each part separately, combine the summaries, and then recover the final solution from their combination. These techniques have been successfully applied to design efficient algorithms for a wide range of problems across these models. Nevertheless, for the two prominent problems of maximum matching and minimum vertex cover, strong impossibility results are known for these techniques [13]: even a weak approximation ratio requires summaries essentially as large as the original graph itself.

Very recently, Assadi and Khanna [11] turned to the notion of a randomized composable coreset in order to bypass these strong impossibility results. In this technique, originally introduced by [64] (see also [33]) in the context of submodular maximization, one partitions the input graph randomly and computes a suitable subgraph of each piece, i.e., a coreset, as its representative summary. These subgraphs are said to be composable in that their union yields a subgraph in which the optimal solution to the problem at hand is a good approximation to the original optimal solution. The authors in [11] demonstrated the effectiveness of this technique by designing randomized composable coresets of size $\tilde{O}(n)$ that give an $O(1)$- and an $O(\log n)$-approximation to matching and vertex cover, respectively. These results directly translate to streaming, distributed, and MPC algorithms (under random partitioning of the input in the first two cases). However, while the results in [11] showed a strong gap between the power of randomized composable coresets and that of previous approaches on adversarial partitions, the implications of these results were rather weak and could not compete with the state-of-the-art algorithms that were designed specifically for each model.

In this paper, we continue the study of randomized composable coresets for matching and vertex cover and present coresets with significantly improved approximation ratios for both problems. Our results imply a unified approach for solving these two problems across different settings and improve the state-of-the-art in the aforementioned computational models in some or all parameters involved. The generality of randomized composable coresets can, in principle, make the problem of designing them harder or even impossible compared to solving the problem on each specific computational model. It is therefore perhaps surprising that using this unified approach, we can design essentially a single algorithm that can improve the state-of-the-art algorithms in all these models simultaneously.

### 1.1 Randomized Composable Coresets

Let $E$ be the edge-set of a graph $G(V,E)$; we say that a collection of edge sets $E^{(1)},\ldots,E^{(k)}$ is a random $k$-partition of $E$ if the sets are constructed by assigning each edge $e \in E$ to some $E^{(i)}$, with $i$ chosen uniformly at random. A random $k$-partition of $E$ naturally results in partitioning the graph $G$ into $k$ subgraphs $G^{(1)},\ldots,G^{(k)}$, where $G^{(i)} := G(V, E^{(i)})$ for all $i \in [k]$ (we use random partitions for both the edge-set and the input graph interchangeably).
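To make the definition concrete, the random $k$-partition can be sketched in a few lines of Python; the function name and the edge-list encoding below are our own illustrative choices, not part of the paper.

```python
import random

def random_k_partition(edges, k, seed=None):
    """Assign each edge independently and uniformly to one of k pieces.

    Returns a list of k edge lists E^(1), ..., E^(k); the subgraphs
    G^(i) = (V, E^(i)) then form the random k-partition of G.
    """
    rng = random.Random(seed)
    pieces = [[] for _ in range(k)]
    for e in edges:
        pieces[rng.randrange(k)].append(e)
    return pieces

# Example: partition a 4-cycle into k = 2 pieces.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
p1, p2 = random_k_partition(edges, 2, seed=7)
assert sorted(p1 + p2) == sorted(edges)  # the pieces partition the edge set
```

Each edge lands in exactly one piece, so the pieces are disjoint and their union recovers the original edge set.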

###### Definition 1 (Randomized Composable Coreset [64, 11]).

Consider an algorithm ALG that, given a graph $G(V,E)$, outputs a subgraph $\mathrm{ALG}(G) \subseteq G$ with at most $s$ edges. Let $G^{(1)},\ldots,G^{(k)}$ be a random $k$-partition of a graph $G$. We say that ALG outputs an $\alpha$-approximation randomized composable coreset of size $s$ for a graph optimization problem $P$ iff the optimal solution of $P$ on $\mathrm{ALG}(G^{(1)}) \cup \cdots \cup \mathrm{ALG}(G^{(k)})$ is an $\alpha$-approximation to the optimal solution of $P$ on $G$, with high probability (over the randomness of the random $k$-partition).

For brevity, we use randomized coresets to refer to randomized composable coresets. Following [11], we augment the definition of randomized coresets by allowing the coresets to also contain a fixed solution (which is counted in the size of the coreset) to be directly added to the final solution of the composed coresets (this is only needed for our vertex cover coreset; see [11] on the necessity of this definition for this problem). We restate some of the well-known applications of randomized composable coresets (see Appendix A for definitions of these models and the proofs, and [64, 49] for further applications).

###### Proposition 1.1.

Suppose ALG outputs an $\alpha$-approximation randomized coreset of size $s$ for a problem $P$. Let $G(V,E)$ be a graph with $m$ edges. Then, ALG implies:

1. A parallel algorithm in the MPC model that with high probability outputs an $\alpha$-approximation to $P(G)$ in two rounds, with $O(\sqrt{m/s})$ machines, each with $O(\sqrt{m \cdot s})$ memory.

2. A streaming algorithm that on random arrival streams outputs an $\alpha$-approximation to $P(G)$ with high probability using $O(\sqrt{m \cdot s})$ space.

3. A simultaneous communication protocol that on randomly partitioned inputs computes an $\alpha$-approximation to $P(G)$ with high probability using $O(s)$ communication per machine/player.

### 1.2 Our Results

We start by studying the previous randomized coreset of [11] for matching, which was simply to pick a maximum matching of each machine's subgraph as its coreset. This is arguably the most natural approach to the problem and results in truly sparse subgraphs (maximum degree one). As a warm-up to our main results, we present a simpler and improved analysis (compared to that in [11]), which shows that this coreset achieves a $3$-approximation (vs. the $O(1)$-approximation proven in [11]). We also show that there exist graphs on which the approximation ratio of this coreset is at least $3-o(1)$. This suggests that to achieve a better-than-$3$ approximation, fundamentally different ideas are needed, which brings us to our first main result.

Our first main result gives new randomized composable coresets for matching and vertex cover.


###### Result 1.

There exist randomized composable coresets of size $O(n)$ that for any constant $\varepsilon > 0$, give a $(3/2+\varepsilon)$-approximation for maximum matching and a $(3+\varepsilon)$-approximation for minimum vertex cover with high probability.

Result 1 improves upon the randomized coresets of [11] that obtained an $O(1)$- and an $O(\log n)$-approximation for matching and vertex cover, respectively. Additionally, the size of our coresets is optimal up to polylogarithmic factors by a lower bound of [11]. Result 1 yields several algorithms for matching and vertex cover across different computational models. Let us exhibit the most interesting ones.

##### The MPC Model.

Maximum matching and minimum vertex cover are among the most studied graph optimization problems in the MPC and similar MapReduce-style computation models  [5, 58, 4, 11, 32, 19]. As an application of Result 1 (by Proposition 1.1), we obtain efficient MPC algorithms for matching and vertex cover in only two rounds of computation.

###### Corollary 1.

There exist MPC algorithms that with high probability achieve an (almost) $3/2$-approximation to matching and an (almost) $3$-approximation to vertex cover in two MPC rounds and $O(n\sqrt{n})$ memory per machine (in general $O(\sqrt{mn})$ for graphs with $m$ edges). The approximation factor for vertex cover degrades if one requires the local computation on each machine to be polynomial time; see Remarks A.1 and 5.2.

The number of rounds of our algorithms in Corollary 1 is optimal among all MPC algorithms with this much memory per machine: the results in [13] imply that no single-round MPC algorithm with substantially less than quadratic memory per machine can achieve a constant-factor approximation to either problem. Furthermore, if the input is distributed randomly in the first place, our algorithms can be implemented in only one MPC round (see [64] for details on when this assumption applies).

Our algorithms outperform the previous algorithms of [11] for matching and vertex cover in terms of approximation ratio ($3/2+\varepsilon$ vs. $O(1)$ and $3+\varepsilon$ vs. $O(\log n)$), while the memory and round complexity are the same. Our matching algorithm outperforms the $2$-approximate maximum matching algorithm of Lattanzi et al. [58] in terms of both the approximation ratio ($3/2+\varepsilon$ vs. $2$) and round complexity (two rounds vs. a larger constant number of rounds) within the same memory. Our result for the matching problem is particularly interesting as all other MPC algorithms [5, 4, 19] that can achieve a better than two approximation (which is a natural barrier for matching algorithms across different models) require a large (unspecified) constant number of rounds. The improvement in the number of rounds is significant in this context; the round complexity of MPC algorithms determines the dominant cost of the computation (see, e.g., [58, 18]), and hence minimizing the number of rounds is the primary goal in this model.

##### Streaming.

Obtaining a $2$-approximation streaming algorithm for matching (and vertex cover) is trivial within $O(n)$ space, as one can simply maintain a maximal matching in the stream. Beating the factor of two in the approximation ratio of this naive algorithm for matching, however, has remained one of the central open questions in the graph streaming literature since the introduction of the field in [41]. Currently, no algorithm with space significantly smaller than the trivial $O(n^2)$ is known for this task on adversarially ordered streams, and the best lower bound, by Kapralov [51], proves that beating an $\frac{e}{e-1}$-approximation requires $n^{1+\Omega(1/\log\log n)}$ space for single-pass streaming algorithms. To make progress on this fascinating open problem, Konrad et al. [57] suggested the study of matching in random arrival streams. They presented an algorithm with approximation ratio strictly better than two, namely $2-\delta$ for a small constant $\delta > 0$, in $O(n)$ space over random streams. Our Result 1 combined with Proposition 1.1 improves the approximation ratio of this algorithm significantly, albeit at the cost of a larger space requirement.

###### Corollary 2.

There exists a single-pass streaming algorithm on random arrival streams that uses $O(n\sqrt{n})$ space and with high probability (over the randomness of the stream) achieves an (almost) $3/2$-approximation to the maximum matching problem.

This is the first streaming algorithm for matching that beats the ratio of two, which is known to be "hard" on adversarial streams. In particular, while the lower bound of [51] does not preclude the existence of streaming algorithms with small space that achieve a better-than-two approximation on adversarial streams, the proof in [51] (see also [43]) suggests that achieving such an algorithm is ultimately connected to a further understanding of Ruzsa-Szemerédi graphs, a notoriously hard problem in additive combinatorics (see, e.g., the survey by Gowers [45]). We refer the interested reader to [42, 8] for details on Ruzsa-Szemerédi graphs and to [43, 51, 12] for their connection to the streaming matching problem.

##### Simultaneous Communication Model.

Maximum matching (and to a lesser degree vertex cover) has been studied previously in the simultaneous communication model owing to the many applications of this model, ranging from achieving round-optimal distributed algorithms [11] to proving lower bounds in dynamic graph streams [7, 56, 13, 12] and applications to mechanism design [36, 9, 35]. As another application of Result 1, we obtain the following corollary in this model.

###### Corollary 3.

There exist simultaneous communication protocols on randomly partitioned inputs that achieve an (almost) $3/2$-approximation to matching and an (almost) $3$-approximation to vertex cover with high probability (over the randomness of the input partitioning) with only $O(n)$ communication per machine/player.

These results improve upon the $O(1)$- and $O(\log n)$-approximation simultaneous protocols of [11] (on randomly partitioned inputs) for matching and vertex cover that were also designed using randomized coresets. Our protocols achieve optimal communication complexity (up to polylogarithmic factors) [11]. Interestingly, when the input is adversarially partitioned, the best approximation ratio achievable by any simultaneous protocol for either matching or vertex cover with near-linear communication is polynomially large in $n$ [13] (see also [11]). It is thus remarkable that a simple data-oblivious partitioning scheme, namely random partitioning, can make these problems so much more tractable.

Our second main result concerns the MPC model specifically. We build on our coresets in Result 1 to design a memory efficient MPC algorithm for matching and vertex cover in a small number of rounds.

###### Result 2.

There exists an MPC algorithm that with high probability gives an $O(1)$-approximation to both maximum matching and minimum vertex cover in $O(\log\log n)$ MPC rounds using only $O(n)$ memory per machine.

The approximation ratio of the matching algorithm in Result 2 can be reduced to (almost) $2$ by standard techniques (see Corollary 9). Additionally, as we show in Section 6.6, using the reduction of [60], this approximation ratio can be further improved to $1+\varepsilon$ for any constant $\varepsilon > 0$, resulting in the following corollary.

###### Corollary 4.

There exists an MPC algorithm that, given a graph $G$ and $\varepsilon \in (0,1)$, with high probability computes a $(1+\varepsilon)$-approximation to the maximum matching of $G$ in $O_\varepsilon(\log\log n)$ MPC rounds using only $O(n)$ memory per machine.

Prior to [32], all MPC algorithms for matching and vertex cover [58, 5, 4] required $\Omega(\log n)$ rounds to achieve an $O(1)$-approximation when the memory per machine was restricted to $\tilde{O}(n)$ (which is arguably the most natural choice of parameter, similar in spirit to the semi-streaming restriction [41, 61]). Classical PRAM algorithms [59, 50] already achieved an $O(1)$-approximation for these problems in $O(\log n)$ rounds, implying that the previous MPC algorithms could not benefit from the additional power of the MPC model (more storage and local computational power) compared to classical parallel settings such as the PRAM, unless the memory per machine became as large as $n^{1+\Omega(1)}$.

In a recent breakthrough, Czumaj et al. [32] presented an (almost) $2$-approximation algorithm for maximum matching that requires only $\tilde{O}(n)$ memory per machine and $O\big((\log\log n)^2\big)$ MPC rounds. Result 2 improves upon this result on several fronts: we improve the round complexity of the matching algorithm to $O(\log\log n)$, resolving a conjecture of [32] in the affirmative; we obtain an $O(1)$-approximation to vertex cover, answering another open question of [32]; and we achieve all this using a considerably simpler algorithm and analysis than [32].

### 1.3 Our Techniques

We obtain our coresets in Result 1 using a novel application of edge degree constrained subgraphs (EDCS) that were previously introduced by Bernstein and Stein [20] for maintaining large matchings in dynamic graphs. While previous work on EDCS [20, 21] focused on how large a matching an EDCS contains and how it can be maintained efficiently in a dynamic graph, in this paper we study several new structural properties of the EDCS itself. Our results identify these subgraphs as sparse certificates for large matchings and small vertex covers which are quite robust to sampling and composition, an ideal combination for a randomized coreset.

To prove Result 2, we borrow one simple high-level technique from [32], namely the vertex sampling approach. In this technique, instead of having each machine work on a subgraph obtained by randomly sampling edges in the original graph (as is the case in randomized coresets), each machine samples some fraction of the vertices, and then works with the induced subgraph defined by those vertices. We show that with proper modifications, the EDCS used in our coresets in Result 1 are robust enough even under this vertex sampling approach. We use this property to design a recursive procedure in which we repeatedly compute an EDCS of the underlying graph in a distributed fashion, redistribute it again via the vertex sampling approach, and recursively solve the problem on this EDCS to compute an $O(1)$-approximation to matching and vertex cover. We therefore limit the memory on each machine to only $O(n)$ at the cost of increasing the number of rounds from two to $O(\log\log n)$. Additional ideas are needed to ensure that the approximation ratio of the algorithm does not increase beyond a fixed constant as a result of repeatedly computing an EDCS of the current graph over these iterations.

##### Comparison with [32].

As pointed out earlier, in proving our Result 2, we borrowed one simple high-level technique from [32], namely the vertex sampling approach. Other than this starting point, our approach proceeds along entirely different lines from [32], in terms of both the local algorithm computed on each subgraph and in the analysis. The main approach in [32] is round compression, which corresponds to compressing multiple rounds of a particular distributed algorithm into smaller number of MPC rounds by maintaining a consistent state across the local algorithms computed on each subgraph (using a highly non-trivial local algorithm and analysis). Our results on the other hand do not correspond to a round compression approach at all and we do not require any consistency in the local algorithm on each machine. Instead, we rely on structural properties of the EDCS that we prove in this paper, independent of the algorithms that compute these subgraphs. This allows us to bypass many of the technical difficulties arising in maintaining a consistent state across different machines which in turn results in a considerably simpler algorithm and analysis.

### 1.4 Further Related Work

Maximum matching and minimum vertex cover are among the most studied problems in the context of massive graphs, including in the MPC model and MapReduce-style computation [5, 58, 4, 11, 32, 19], streaming algorithms [60, 41, 37, 38, 5, 43, 57, 6, 3, 46, 51, 52, 31, 30, 61, 4, 40, 56, 13, 29, 63, 39, 12, 72], the simultaneous communication model and similar distributed models [46, 36, 48, 9, 13, 12, 11], dynamic graphs [67, 74, 16, 70, 23, 20, 21, 24, 25, 22], and sub-linear time algorithms [71, 47, 68, 69, 76]. Besides the results mentioned already, most relevant to our work are the $\mathrm{polylog}(n)$-space $O(\mathrm{polylog}(n))$-approximation algorithm of [52] for estimating the size of a maximum matching in random streams, and the one-way communication protocol of [43], which achieves a better-than-two approximation when the input is (adversarially) partitioned between two parties and the communication is from one party to the other (as opposed to the simultaneous setting which we studied). However, the techniques in these results and ours are completely disjoint.

Coresets, composable coresets, and randomized composable coresets were introduced in [2], [49], and [64], respectively. Composable coresets have been studied previously in nearest neighbor search [1], diversity maximization [49, 77], clustering [15, 17], and submodular maximization [49, 64, 14, 33, 34]. Moreover, while not particularly termed a composable coreset, the "merge and reduce" technique in the graph streaming literature (see [61], Section 2.2) is identical to composable coresets.

## 2 Preliminaries

##### Notation.

For a graph $G(V,E)$, we use $\mathrm{MM}(G)$ to denote the maximum matching size in $G$ and $\mathrm{VC}(G)$ to denote the minimum vertex cover size. For any subset of vertices $U \subseteq V$ and any subset of edges $F \subseteq E$, we use $V(F)$ to denote the set of vertices in $G$ that are incident on edges of $F$, and $E(U)$ to denote the set of edges in $G$ that are incident on vertices of $U$. For any vertex $v \in V$, we use $\deg_G(v)$ to denote the degree of $v$ in the graph $G$.

We use capital letters to denote random variables. Let $X_0,\ldots,X_n$ and $Y_1,\ldots,Y_n$ be sequences of random variables on a common probability space such that $\mathbb{E}[X_i \mid Y_1,\ldots,Y_{i-1}] = X_{i-1}$ for all $i$. The sequence $X_0,\ldots,X_n$ is referred to as a martingale with respect to $Y_1,\ldots,Y_n$. A summary of the concentration bounds we use in this paper appears in Appendix B.1.

##### Sampled Subgraphs.

Throughout the paper, we work with two different notions of sampling a graph $G(V,E)$. For a parameter $p \in (0,1)$,

• A graph $G_{E_p}(V, E_p)$ is an edge sampled subgraph of $G$ iff the vertex sets of $G_{E_p}$ and $G$ are the same, and every edge in $G$ is picked independently with probability $p$ to be in $G_{E_p}$.

• A graph $G_{V_p}$ is a vertex sampled (induced) subgraph of $G$ iff every vertex in $G$ is sampled independently with probability $p$ into a set $V_p$, and $G_{V_p}$ is the induced subgraph of $G$ on $V_p$.
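Both notions can be illustrated with a minimal sketch; the function names and the edge-list encoding are our own assumptions for this illustration, not part of the paper.

```python
import random

def edge_sample(vertices, edges, p, rng):
    """Edge sampled subgraph: same vertex set, each edge kept
    independently with probability p."""
    return vertices, [e for e in edges if rng.random() < p]

def vertex_sample(vertices, edges, p, rng):
    """Vertex sampled subgraph: each vertex kept independently with
    probability p; return the induced subgraph on the kept vertices."""
    kept = {v for v in vertices if rng.random() < p}
    return kept, [(u, v) for (u, v) in edges if u in kept and v in kept]

rng = random.Random(0)
V = list(range(6))
E = [(i, j) for i in V for j in V if i < j]  # complete graph K6
_, Ee = edge_sample(V, E, 0.5, rng)
Vv, Ev = vertex_sample(V, E, 0.5, rng)
assert set(Ee) <= set(E) and set(Ev) <= set(E)
```

The difference matters later: Result 1 works with edge sampling, while Result 2 relies on the vertex sampling (induced subgraph) variant.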

### 2.1 The Massively Parallel Computation (MPC) Model

We adopt the most stringent model of modern parallel computation among [55, 44, 10, 18], the so-called Massively Parallel Computation (MPC) model of [18]. Let $G(V,E)$ with $n := |V|$ and $m := |E|$ be the input graph. In this model, there are $p$ machines, each with a memory of size $s$, and one typically requires that both $p$ and $s$ are polynomially smaller than the input size, i.e., at most $m^{1-\Omega(1)}$ [55, 10]. Computation proceeds in synchronous rounds: in each round, each machine performs some local computation, and at the end of the round the machines exchange messages to guide the computation for the next round. All messages sent and received by each machine in each round have to fit into the local memory of the machine. This in particular means that the total length of the messages on each machine is bounded by $s$ in each round. At the end, the machines collectively output the solution.

### 2.2 Basic Graph Theory Facts

###### Fact 2.1.

For any graph $G$, $\mathrm{MM}(G) \le \mathrm{VC}(G) \le 2\cdot\mathrm{MM}(G)$.

The following propositions are well-known.

###### Proposition 2.2.

Suppose $M$ and $W$ are, respectively, a matching and a vertex cover of a graph $G$ such that $|W| \le \alpha\cdot|M|$; then, both $M$ and $W$ are $\alpha$-approximations to their respective problems. (Indeed, $|M| \le \mathrm{MM}(G) \le \mathrm{VC}(G) \le |W| \le \alpha\cdot|M|$.)

###### Proposition 2.3.

Suppose $H$ is a graph with maximum degree $\Delta$ and $S$ is the set of all vertices with degree at least $(1-\delta)\cdot\Delta$ in $H$, for a parameter $\delta \in (0,1)$. Then, $\mathrm{MM}(H) \ge \frac{1-\delta}{2}\cdot\frac{\Delta}{\Delta+1}\cdot|S|$.

###### Proof.

By Vizing’s theorem [75], $H$ can be edge colored by at most $\Delta+1$ colors. As each color class forms a matching, this means that there exists a matching $M$ in $H$ with $|M| \ge |E(H)|/(\Delta+1)$. Moreover, we have $|E(H)| \ge (1-\delta)\cdot\Delta\cdot|S|/2$, as every vertex in $S$ is incident on at least $(1-\delta)\cdot\Delta$ edges and each edge is counted at most twice, finalizing the proof.

### 2.3 Edge Degree Constrained Subgraph (EDCS)

We introduce edge degree constrained subgraphs (EDCS) in this section and present several of their properties which are proven in previous work. We emphasize that all other properties of EDCS proven in the subsequent sections are new to this paper.

An EDCS is defined formally as follows.

###### Definition 2 ([20]).

For any graph and integers , an edge degree constraint subgraph (EDCS) is a subgraph of with the following two properties:

1. For any edge : .

2. For any edge : .

We sometimes abuse the notation and use and interchangeably.

In the remainder of the paper, we use the terms “Property (P1)” and “Property (P2)” of EDCS to refer to the first and second items in Definition 2 above.

One can prove the existence of an EDCS for any graph $G$ and any parameters $\beta > \beta^-$ using the results in [21] (Theorem 3.2), which in fact shows how to maintain an EDCS efficiently in the dynamic graph setting. As we are only interested in the existence of an EDCS in this paper, we provide a simpler and self-contained proof of this fact in Appendix B.2, which also implies a simple polynomial time algorithm for computing an EDCS of a given graph $G$.
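As a minimal illustration of such an algorithm, the sketch below repeatedly fixes a violated edge until both EDCS properties hold; that this local fixing terminates (for $\beta^- < \beta$) is exactly what the potential-function argument in Appendix B.2 establishes. The function names and edge-list encoding are our own choices for this sketch.

```python
from collections import defaultdict

def h_degrees(H):
    """Degree of every vertex in the subgraph H (a set of edges)."""
    deg = defaultdict(int)
    for u, v in H:
        deg[u] += 1
        deg[v] += 1
    return deg

def compute_edcs(edges, beta, beta_minus):
    """Local-fixing sketch of EDCS computation: while some edge
    violates Property (P1) or (P2), fix it and retry."""
    H = set()
    while True:
        deg = h_degrees(H)
        # (P1): every edge inside H must have degree sum at most beta.
        viol = next(((u, v) for (u, v) in H if deg[u] + deg[v] > beta), None)
        if viol is not None:
            H.remove(viol)
            continue
        # (P2): every edge outside H must have degree sum at least beta_minus.
        viol = next(((u, v) for (u, v) in edges
                     if (u, v) not in H and deg[u] + deg[v] < beta_minus), None)
        if viol is not None:
            H.add(viol)
            continue
        return H
```

On a 5-edge example with $\beta = 3$ and $\beta^- = 2$, the loop settles quickly on a sparse subgraph satisfying both properties.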

###### Lemma 2.4.

Any graph $G$ contains an $\mathrm{EDCS}(G, \beta, \beta^-)$ for any parameters $\beta > \beta^- \ge 1$.

It was shown in [20] (for bipartite graphs) and [21] (for general graphs) that for appropriate parameters $\beta$ and $\beta^-$, an EDCS always contains an (almost) $3/2$-approximate maximum matching of $G$. Formally:

###### Lemma 2.5 ([20, 21]).

Let $G(V,E)$ be any graph and $\varepsilon < 1/2$ be a parameter. For parameters $\lambda = \Theta(\varepsilon)$, $\beta$ sufficiently large as a function of $\lambda$, and $\beta^- \ge (1-\lambda)\cdot\beta$, in any subgraph $H := \mathrm{EDCS}(G,\beta,\beta^-)$, we have $\mathrm{MM}(G) \le \left(\tfrac{3}{2}+\varepsilon\right)\cdot\mathrm{MM}(H)$.

Lemma 2.5 implies that an EDCS of a graph $G$ preserves the maximum matching of $G$ approximately. We also show a similar result for vertex cover. The basic idea is that in addition to computing a vertex cover for the subgraph $H$ (to cover all the edges in $H$), we also add to the vertex cover all vertices that have degree at least $\beta^-/2$ in $H$, which by Property (P2) of an EDCS covers all edges in $G \setminus H$.

###### Lemma 2.6.

Let $G(V,E)$ be any graph, $\varepsilon < 1$ be a parameter, and $H := \mathrm{EDCS}(G,\beta,\beta^-)$ for parameters $\beta$ and $\beta^-$ chosen suitably as functions of $\varepsilon$ (with $\beta^-$ sufficiently close to $\beta$). Suppose $V_{\mathrm{high}}$ is the set of vertices with $\deg_H(v) \ge \beta^-/2$ and $W$ is a minimum vertex cover of $H$; then $V_{\mathrm{high}} \cup W$ is a vertex cover of $G$ with size at most $(3+\varepsilon)\cdot\mathrm{VC}(G)$ (note that $\mathrm{VC}(H) \le \mathrm{VC}(G)$).

###### Proof.

We first argue that $V_{\mathrm{high}} \cup W$ is indeed a feasible vertex cover of $G$. To see this, notice that any edge in $H$ is covered by $W$, and moreover, by Property (P2) of EDCS, any edge in $G \setminus H$ has at least one endpoint with degree at least $\beta^-/2$ in $H$ and hence is covered by $V_{\mathrm{high}}$. In the following, we bound the size of $V_{\mathrm{high}}$ by $(2+\varepsilon)\cdot\mathrm{VC}(H)$, which finalizes the proof as clearly $|W| = \mathrm{VC}(H) \le \mathrm{VC}(G)$.

By Property (P1) of EDCS, the maximum degree of each vertex in $H$ is bounded in terms of $\beta$. Moreover, for any vertex $v \in V_{\mathrm{high}}$, we have $\deg_H(v) \ge \beta^-/2$. Hence, for the choice of parameters in the lemma statement, we can apply Proposition 2.3 on the graph $H$ and obtain,

$$\mathrm{VC}(H) \;\overset{\text{Fact 2.1}}{\ge}\; \mathrm{MM}(H) \;\overset{\text{Prop. 2.3}}{\ge}\; \frac{1}{2+\varepsilon}\cdot\big|V_{\mathrm{high}}\big|,$$

finalizing the proof.
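The construction in Lemma 2.6 can be sketched as follows. For illustration we use a greedy, maximal-matching based $2$-approximate vertex cover of $H$ as a stand-in for the minimum vertex cover $W$ used in the lemma, so this sketch only certifies feasibility of the cover, not the exact $(3+\varepsilon)$ bound; names and encoding are ours.

```python
from collections import defaultdict

def cover_from_edcs(g_edges, h_edges, beta_minus):
    """Cover G using an EDCS H: take all vertices of H-degree at least
    beta_minus / 2 (these cover every edge of G - H by Property (P2)),
    plus a cover of the edges of H itself."""
    deg = defaultdict(int)
    for u, v in h_edges:
        deg[u] += 1
        deg[v] += 1
    high = {v for v in deg if deg[v] >= beta_minus / 2}
    cover = set(high)
    # Greedy maximal-matching based cover of H (2-approximate stand-in
    # for the minimum vertex cover W of Lemma 2.6).
    for u, v in h_edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover
```

Feasibility follows exactly as in the proof: edges of $H$ are covered by the greedy pass, and edges of $G \setminus H$ are covered by the high-degree vertices.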

## 3 Warmup: A 3-Approximation Coreset for Matching

A natural randomized coreset for the matching problem was previously proposed by [11]: simply compute a maximum matching of each graph $G^{(i)}$. We refer to this randomized coreset as the MaxMatching coreset. It was shown in [11] that MaxMatching is an $O(1)$-approximation randomized coreset for the matching problem (the hidden constant in the O-notation was explicitly bounded in [11]). As a warm-up, we propose a better analysis of this randomized coreset in this section.
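The MaxMatching coreset is easy to simulate end-to-end on toy inputs. The brute-force matching routine below is a hypothetical stand-in usable only on tiny graphs; the function names and encoding are ours.

```python
import random

def max_matching(edges):
    """Exact maximum matching by brute-force branching on the first
    edge; fine only for the tiny graphs of this illustration."""
    if not edges:
        return []
    (u, v), rest = edges[0], edges[1:]
    skip = max_matching(rest)               # branch 1: leave (u, v) out
    take = [(u, v)] + max_matching(         # branch 2: put (u, v) in
        [e for e in rest if u not in e and v not in e])
    return take if len(take) > len(skip) else skip

def maxmatching_coreset(edges, k, seed=0):
    """Random k-partition; each 'machine' keeps a maximum matching of
    its piece; the final answer is a matching of the union of coresets."""
    rng = random.Random(seed)
    pieces = [[] for _ in range(k)]
    for e in edges:
        pieces[rng.randrange(k)].append(e)
    union = [e for piece in pieces for e in max_matching(piece)]
    return max_matching(union)

# K6 has a perfect matching of size 3.
E = [(i, j) for i in range(6) for j in range(6) if i < j]
M = maxmatching_coreset(E, k=3)
```

Each coreset here has maximum degree one, matching the observation that this is the sparsest possible summary.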

###### Theorem 5.

Let $G(V,E)$ be a graph and $G^{(1)},\ldots,G^{(k)}$ be a random $k$-partition of $G$. Under the assumption on $\mathrm{MM}(G)$ stated below, any maximum matching of the graph $G^{(i)}$ is a $3$-approximation randomized composable coreset of size at most $n/2$ for the maximum matching problem.

##### Assumption on MM(G).

In this section, we follow [11] in assuming that $\mathrm{MM}(G)$ is sufficiently large, since otherwise we can immediately obtain a (non-randomized) composable coreset with approximation ratio one (an exact maximum matching) and size $O(\mathrm{MM}(G)^2)$ for the matching problem using the results in [29]. We emphasize that this assumption is only needed for the results in this section.

A crucial building block in our proof of Theorem 5 is a new concentration result for the size of maximum matching in edge sampled subgraphs that we prove in the next section. This result is quite general and can be of independent interest.

### 3.1 Concentration of Maximum Matching Size under Edge Sampling

Let $G(V,E)$ be any arbitrary graph and $p \in (0,1)$ be a parameter (possibly depending on the size of the graph $G$). Define $G_{E_p}(V, E_p)$ as a subgraph of $G$ obtained by sampling each edge in $G$ independently with probability $p$, i.e., an edge sampled subgraph of $G$. We show that $\mathrm{MM}(G_{E_p})$ is concentrated around its expected value.
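Before the martingale-based proof, the flavor of this concentration can be sanity-checked on the degenerate case where $G$ is itself a perfect matching on $2n$ vertices, so that $\mathrm{MM}(G_{E_p})$ is exactly a $\mathrm{Binomial}(n,p)$ random variable. This simulation is our own illustration, not part of the paper.

```python
import random
import statistics

# If G is a perfect matching on 2n vertices, each of its n edges
# survives in G_Ep independently with probability p, so MM(G_Ep)
# is exactly Binomial(n, p) with mean n * p.
n, p, trials = 400, 0.25, 2000
rng = random.Random(1)
sizes = [sum(rng.random() < p for _ in range(n)) for _ in range(trials)]

mean = statistics.mean(sizes)
sigma = (n * p * (1 - p)) ** 0.5  # Binomial standard deviation
# Fraction of samples within 4 standard deviations of n * p.
within = sum(abs(s - n * p) <= 4 * sigma for s in sizes) / trials
```

The empirical mean stays near $np$ and essentially all samples fall within a few standard deviations, consistent with the exponential tail of Lemma 3.1 in this special case.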

###### Lemma 3.1.

Let $G(V,E)$ be any arbitrary graph, $p \in (0,1)$ be a parameter, and $\mu := \mathbb{E}[\mathrm{MM}(G_{E_p})]$. For any $\lambda > 0$,

$$\Pr\Big(\big|\mathrm{MM}(G_{E_p}) - \mathbb{E}[\mathrm{MM}(G_{E_p})]\big| \ge \lambda\Big) \le 2\cdot\exp\Big(-\frac{\lambda^2 \cdot p}{2\mu}\Big).$$
###### Proof.

For simplicity, define $G_p := G_{E_p}$. Let $C$ be any minimum vertex cover in the graph $G$. We use vertex exposure martingales over the vertices in $C$ to prove this result. Fix an arbitrary ordering $v_1,\ldots,v_{|C|}$ of the vertices in $C$, and for any $i \in [|C|]$, let $C_{<i}$ be the set of vertices in $C$ that appear before $v_i$ in this ordering. For each $i$, we define a random variable $Y_i$ as the vector of indicators of whether each possible edge (i.e., an edge already in $G$) between the vertex $v_i$ and the vertices in $V \setminus C_{<i}$ appears in $G_p$ or not. Since $C$ is a vertex cover of $G$, every edge in $G$ is incident on some vertex of $C$. As a result, the graph $G_p$ is uniquely determined by the vectors $Y_1,\ldots,Y_{|C|}$. Define a sequence of random variables $X_0,\ldots,X_{|C|}$, where $X_i := \mathbb{E}[\mathrm{MM}(G_p) \mid Y_1,\ldots,Y_i]$. The following claim is standard.

###### Claim 3.2.

The sequence $X_0,\ldots,X_{|C|}$ is a martingale with respect to the sequence $Y_1,\ldots,Y_{|C|}$.

###### Proof.

For any $i \in [|C|]$,

$$\mathbb{E}[X_i \mid Y_1,\ldots,Y_{i-1}] = \mathbb{E}_{Y_i}\Big[\,\mathbb{E}\big[\mathrm{MM}(G_p) \mid Y_1,\ldots,Y_i\big] \;\Big|\; Y_1,\ldots,Y_{i-1}\Big] = \mathbb{E}\big[\mathrm{MM}(G_p) \mid Y_1,\ldots,Y_{i-1}\big] = X_{i-1},$$

as we are "averaging out" $Y_i$ in the outer expectation.

Notice that $X_0 = \mathbb{E}[\mathrm{MM}(G_p)]$ and $X_{|C|} = \mathrm{MM}(G_p)$, as fixing $Y_1,\ldots,Y_{|C|}$ uniquely determines the graph $G_p$. Hence, we can use Azuma's inequality to show that the value of $\mathrm{MM}(G_p)$ is close to its expectation with high probability. To do this, we need a bound on the length $|C|$ of the martingale, as well as on each term $|X_i - X_{i-1}|$. Bounding each term is quite easy; the set of edges incident on the vertex $v_i$ can only change the maximum matching of $G_p$ by one (as $v_i$ can only be matched once), and hence $|X_i - X_{i-1}| \le 1$. In the following, we also bound the value of $|C|$.

###### Claim 3.3.

|C| ≤ 2μ/p.

###### Proof.

Since the size of a minimum vertex cover of a graph is at most twice the size of its maximum matching (Fact 2.1), we have that |C| ≤ 2·MM(G). It is also straightforward to verify that MM(G) ≤ μ/p, since a p fraction of the edges of any maximum matching of G appears in G_p in expectation; hence |C| ≤ 2·MM(G) ≤ 2μ/p.

We are now ready to finalize the proof. By setting c_i = 1 for all i ∈ [|C|], we can use Azuma’s inequality (Proposition B.1) with parameters |C| and c_1, …, c_{|C|} for the martingale X_0, …, X_{|C|}, and obtain that,

Pr(|MM(G_p) − E[MM(G_p)]| ≥ λ) = Pr(|X_{|C|} − X_0| ≥ λ) ≤ 2·exp(−λ² / Σ_{i∈[|C|]} c_i²)   (by Proposition B.1)
= 2·exp(−λ² / |C|) ≤ 2·exp(−λ²·p / (2μ)),   (by Claim 3.3)

finalizing the proof.
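Lemma 3.1 can be sanity-checked numerically in the special case where G is itself a perfect matching on m edges: there, MM(G_E^p) is just the number of surviving edges, i.e., a Binomial(m, p) random variable, and μ = p·m. The following sketch (hypothetical code, not from the paper) compares a Monte Carlo tail estimate against the bound:

```python
import math
import random

rng = random.Random(0)
m, p, trials, lam = 1000, 0.5, 2000, 100

mu = p * m  # mu = E[MM(G_E^p)] for this special graph

def sampled_mm():
    # When G is itself a matching with m edges, MM(G_E^p) is the
    # number of edges that survive the sampling.
    return sum(1 for _ in range(m) if rng.random() < p)

deviating = sum(1 for _ in range(trials) if abs(sampled_mm() - mu) >= lam)
lemma_bound = 2 * math.exp(-lam ** 2 * p / (2 * mu))
assert deviating / trials <= lemma_bound  # empirical tail within the bound
```

For these parameters the bound evaluates to roughly 0.013, while the empirical deviation frequency is essentially zero, as the standard deviation of the binomial is far smaller than λ.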

### 3.2 Proof of Theorem 5

Let G(V, E) be any arbitrary graph and G^(1), …, G^(k) be a random k-partition of G. Recall that the MaxMatching coreset simply computes a maximum matching M_i on each graph G^(i) for i ∈ [k]; hence, we only need to show that the graph M_1 ∪ … ∪ M_k has a large matching compared to the graph G.
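The construction just described can be sketched as follows (hypothetical code; for brevity, a greedy maximal matching stands in for the exact maximum matching that MaxMatching computes on each piece):

```python
import random

def random_k_partition(edges, k, seed=0):
    """Assign each edge of G independently and uniformly to one of k pieces."""
    rng = random.Random(seed)
    pieces = [[] for _ in range(k)]
    for e in edges:
        pieces[rng.randrange(k)].append(e)
    return pieces

def greedy_matching(edges):
    """Maximal matching; a stand-in for the maximum matching of the coreset."""
    matched, matching = set(), []
    for u, v in edges:
        if u not in matched and v not in matched:
            matched.update((u, v))
            matching.append((u, v))
    return matching

# On a graph that is itself a matching, every edge survives into the coreset.
edges = [(i, i + 100) for i in range(50)]
pieces = random_k_partition(edges, k=5)
assert sum(len(piece) for piece in pieces) == len(edges)
union = [e for piece in pieces for e in greedy_matching(piece)]
assert len(union) == len(edges)
```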

Let M* be any fixed maximum matching in G, and let μ := |M*|. Our approach is to show that either each graph G^(i) already has a large matching, i.e., one of size roughly μ/3, or many edges of M* are picked in M_1 ∪ … ∪ M_k as well. In the latter case, the union of the edges in M_i for i ∈ [k] has a large intersection with M* and hence contains a large matching.

Define G⁻ := G ∖ M*, where G ∖ M* denotes the graph G after removing the edges of M*. Let M*_i be the intersection of the graph G^(i) and M*. Finally, define μ⁻_i as the maximum matching size in G^(i) ∖ M*. Using our concentration result from the previous section, we can show that,

###### Claim 3.4.

Let ε ∈ (0,1) be a parameter. Suppose μ ≥ 4k·log n/ε²; then, there exists an integer μ⁻ such that with probability 1 − 1/n (over the random k-partition), |μ⁻_i − μ⁻| ≤ ε·μ simultaneously for all i ∈ [k].

###### Proof.

Let p := 1/k; the graph G^(i) ∖ M* is a subgraph of G ∖ M* obtained by picking each edge in G ∖ M* independently and with probability p. Let μ⁻ := E[MM(G^(i) ∖ M*)] (notice that the marginal distributions of the graphs G^(i) ∖ M* for all i ∈ [k] are identical, so μ⁻ does not depend on i). By setting λ := ε·μ in Lemma 3.1 (applied to the graph G ∖ M*), we have that,

Pr(|μ⁻_i − μ⁻| ≥ ε·μ) ≤ 2·exp(−ε²·μ²·p / (2μ⁻))   (by Lemma 3.1)
≤ 2·exp(−2·log n) ≤ 1/n²,

where the second inequality is by the assumption on the value of μ (together with μ⁻ ≤ μ). Taking a union bound over all k ≤ n subgraphs G^(i) for i ∈ [k] finalizes the proof.

In the following, we condition on the event in Claim 3.4. We now have,

###### Lemma 3.5.

Let ε and μ⁻ be as in Claim 3.4. If μ⁻ ≤ μ/3, then w.p. 1 − 1/n, |⋃_{i=1}^k M_i ∩ M*| ≥ μ/3 − 3ε·μ.

###### Proof.

Fix an index i ∈ [k] and notice that conditioning on the event in Claim 3.4 only fixes the set of edges in G^(i) ∖ M*. Let M′_i be any maximum matching in G^(i) ∖ M*; by definition, |M′_i| = μ⁻_i. By conditioning on the event in Claim 3.4, we have |M′_i| ≤ μ⁻ + ε·μ ≤ μ/3 + ε·μ. It is straightforward to verify that there are at least μ − 2·(μ/3 + ε·μ) = μ/3 − 2ε·μ edges in M* such that neither of the endpoints of these edges is matched by M′_i. We refer to these edges as free edges and use F_i to denote them.

Note that even after conditioning on G^(i) ∖ M*, the edges in M*, and consequently in F_i, appear in the graph G^(i) independently and with probability p = 1/k. As such, using a Chernoff bound (by the assumption on the value of μ), w.p. 1 − 1/n², at least (μ/3 − 3ε·μ)/k edges of F_i appear in G^(i). Since these edges can be directly added to the matching M′_i (as neither of their endpoints is matched in M′_i), this implies that there exists a matching of size |M′_i| + (μ/3 − 3ε·μ)/k in G^(i) w.p. 1 − 1/n².

Now let M_i be the maximum matching computed by MaxMatching on G^(i); the above argument implies that |M_i| ≥ |M′_i| + (μ/3 − 3ε·μ)/k. On the other hand, notice that |M_i ∖ M*| ≤ |M′_i|, as M_i ∖ M* forms a matching in the graph G^(i) ∖ M* and |M′_i| denotes the maximum matching size in this graph. This means that |M_i ∩ M*| ≥ (μ/3 − 3ε·μ)/k. To finalize the proof, notice that by a union bound over all matchings M_1, …, M_k, we have that with probability 1 − k/n² ≥ 1 − 1/n,

|⋃_{i=1}^k (M_i ∩ M*)| = Σ_{i=1}^k |M_i ∩ M*| ≥ k · (1/k) · (μ/3 − 3ε·μ) = μ/3 − 3ε·μ,

where the first equality holds because the graphs G^(i) partition the edges of G, and hence the sets M_i ∩ M* are disjoint.

We can now easily prove Theorem 5.

###### Proof of Theorem 5.

By our assumption that μ = Ω(k·log n), we can take ε in Claim 3.4 and Lemma 3.5 to be an arbitrarily small constant. Define μ⁻ as in Lemma 3.5. If μ⁻ > μ/3, we are already done, as by Claim 3.4, μ⁻_i ≥ μ/3 − ε·μ for any i ∈ [k] (and |M_i| ≥ μ⁻_i), and hence the union of the matchings surely contains a (3 + O(ε))-approximate matching. On the other hand, if μ⁻ ≤ μ/3, we can apply Lemma 3.5 and argue that μ/3 − 3ε·μ edges of the matching M* appear in the union of the matchings M_1, …, M_k, which finalizes the proof.

We also show that there exists a graph G for which the approximation ratio of the MaxMatching coreset is arbitrarily close to 3. This implies that one cannot improve the analysis of MaxMatching much further, and in particular cannot beat the approximation ratio of 3 using this coreset.

###### Lemma 3.6.

There exists a graph G(V, E) such that for any random k-partition of G (for any k larger than some constant), the MaxMatching coreset can only find a matching of size at most (1/3 + o(1))·MM(G) with high probability.

We defer the proof of this lemma to Appendix C.

## 4 New Properties of Edge Degree Constrained Subgraphs

We study further properties of the EDCS in this section. Although the EDCS was used prior to our work, all the properties proven in this section are entirely new to this paper and look at the EDCS from a different vantage point.

Previous work in [20, 21] studied the EDCS from the perspective of how large a matching it contains and how it can be maintained efficiently in a dynamically changing graph. In this paper, we prove several new interesting structural properties of the EDCS itself. In particular, while it is easy to see that in terms of edge-sets there can be many different EDCS of some fixed graph G (consider G being a complete graph), we show that the degree distributions of every EDCS (for the same parameters β and β⁻) are almost the same. In other words, the degree of any vertex v is almost the same in every EDCS of G. This is in sharp contrast with similar objects such as maximum matchings or b-matchings, which can vary a lot within the same graph.

This semi-uniqueness renders the EDCS extremely robust under sampling and composition, as we prove next in this section. These new structural results on the EDCS are the main properties that allow their use in our coresets and parallel algorithms in the rest of the paper. In fact, our parallel algorithms in Section 6 are entirely based on these results and do not rely at all on the fact that an EDCS contains a large matching (i.e., they do not depend on Lemma 2.5 at all).
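For concreteness, recall the two defining properties of an EDCS(G, β, β⁻) from Definition 2: (P1) d_H(u) + d_H(v) ≤ β for every edge (u,v) of H, and (P2) d_H(u) + d_H(v) ≥ β⁻ for every edge (u,v) of G ∖ H. Both can be checked mechanically; the sketch below (hypothetical code, assuming undirected edges are listed with consistently ordered endpoints) does exactly that:

```python
def edcs_degrees(H):
    """Degree of every vertex in the subgraph H (given as an edge list)."""
    deg = {}
    for u, v in H:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def is_edcs(G, H, beta, beta_minus):
    """Check (P1): d_H(u) + d_H(v) <= beta for every (u, v) in H, and
    (P2): d_H(u) + d_H(v) >= beta_minus for every (u, v) in G \\ H."""
    H_set, deg = set(H), edcs_degrees(H)
    p1 = all(deg[u] + deg[v] <= beta for u, v in H)
    p2 = all(deg.get(u, 0) + deg.get(v, 0) >= beta_minus
             for u, v in G if (u, v) not in H_set)
    return p1 and p2

# Path a-b-c-d with the middle edge dropped: every kept edge has degree sum 2,
# and the missing edge (b, c) has d_H(b) + d_H(c) = 2.
G = [("a", "b"), ("b", "c"), ("c", "d")]
H = [("a", "b"), ("c", "d")]
assert is_edcs(G, H, beta=2, beta_minus=2)
assert not is_edcs(G, H, beta=2, beta_minus=3)
```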

### 4.1 Degree Distribution Lemma

In the following lemma, we argue that any two EDCS of a graph G (for the same parameters β, β⁻) are “somewhat identical”, in that their degree distributions are essentially the same.

###### Lemma 4.1 (Degree Distribution Lemma).

Fix a graph G(V, E) and parameters β and β⁻ := (1−λ)·β (for some λ < 1). For any two subgraphs A and B that are EDCS(G, β, β⁻), and any vertex v ∈ V,

|d_A(v) − d_B(v)| = O(log n) · λ^{1/2} · β.

In the rest of this section, we fix the parameters β, β⁻ = (1−λ)·β and the two EDCS A and B in Lemma 4.1. The general strategy of the proof is as follows. We start with a set S of vertices that have the largest difference in degree between A and B. For simplicity, assume the degrees of these vertices are all larger by an additive factor D in A compared to B. We look at all neighbors of S in A ∖ B, denoted by T, and then at all neighbors of T in B ∖ A plus the original set S, which we denote by S′. We then use the two properties of EDCS in Definition 2 to prove that the vertices in S′ still have a larger degree in A compared to B by an additive factor which is only slightly smaller than D, while the size of S′ is a constant factor larger than that of S. The main observation behind this claim is that since vertices in S have a “large” degree in A, their neighbors T in A ∖ B should have a “small” degree in A to satisfy Property (P1) of the EDCS A. Similarly, since vertices in T have a “small” degree in A, and since the edges in A ∖ B are missing from the EDCS B, by Property (P2) of the EDCS B the vertices in T should have a “large” degree in B. Applying this idea one more time to the vertices in T in order to obtain the set S′, we can see that the vertices in S′ have a “large” degree in A and a “small” degree in B, and that the decrease in the original gap D between the degrees in A and in B of the vertices in S′ (compared to S) is only O(λ·β).

We use this argument repeatedly to construct the next set S′′ and so on. With each iteration, the new set is larger than the previous one by a constant multiplicative factor, while the A-vs.-B degree gap only decreases by a small additive factor, from D to D − O(λ·β). Now, on one hand, we can keep iterating this process as long as the A-vs.-B degree gap remains at least D/2. On the other hand, a multiplicative increase in set size can only happen for O(λ^{−1/2}·log n) steps before we run out of vertices. Thus, the gap must decrease from D to D/2 after only O(λ^{−1/2}·log n) steps of additive decrease O(λ·β) each, which gives us the upper bound D = O(log n)·λ^{1/2}·β on the original gap.
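The step-counting above can be verified numerically. The following sketch (hypothetical code) iterates the multiplicative set growth |S_t| ≥ (1 + 2λ^{1/2})·|S_{t−1}| and confirms that it exhausts the n vertices within roughly ln(n)/λ^{1/2} iterations:

```python
import math

def steps_until_exhausted(n, lam):
    """Iterate |S_t| >= (1 + 2*sqrt(lam)) * |S_{t-1}| starting from |S_0| = 1
    until the set size would exceed the n vertices of the graph."""
    size, steps, growth = 1.0, 0, 1 + 2 * math.sqrt(lam)
    while size <= n:
        size *= growth
        steps += 1
    return steps

# Since (1 + 2x)^t >= exp(x * t) for 0 <= x <= 1, the process must stop
# within ceil(ln(n) / sqrt(lam)) + 1 steps.
n, lam = 10 ** 6, 0.01
assert steps_until_exhausted(n, lam) <= math.ceil(math.log(n) / math.sqrt(lam)) + 1
```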

We now formalize this argument, starting with a technical lemma which allows us to obtain each set S_t from the set S_{t−1} in the above argument.

###### Lemma 4.2.

Fix an integer D ≥ 2·λ^{1/2}·β and suppose S ⊆ V is such that for all v ∈ S, we have d_A(v) − d_B(v) ≥ D. Then, there exists a set of vertices S′ ⊇ S such that |S′| ≥ (1 + 2·λ^{1/2})·|S| and for all v ∈ S′, d_A(v) − d_B(v) ≥ D − 2λ·β.

###### Proof.

We define the following two sets T and T′:

• T is the set of all neighbors of the vertices in S using only the edges in A ∖ B. In other words, T := {u ∈ V : (u,v) ∈ A ∖ B for some v ∈ S}.

• T′ is the set of all neighbors of the vertices in T using only the edges in B ∖ A. In other words, T′ := {w ∈ V : (u,w) ∈ B ∖ A for some u ∈ T}.

We start by proving the following property of the degrees of the vertices in the sets T and T′.

###### Claim 4.3.

We have,

• for all u ∈ T: d_B(u) − d_A(u) ≥ D − λ·β;

• for all w ∈ T′: d_A(w) − d_B(w) ≥ D − 2λ·β.

###### Proof.

For the first part, since u ∈ T, there exists an edge (u,v) ∈ A ∖ B such that v ∈ S. Since (u,v) belongs to A, by Property (P1) of the EDCS A we have d_A(u) + d_A(v) ≤ β. On the other hand, since (u,v) does not belong to B, by Property (P2) of the EDCS B we have d_B(u) + d_B(v) ≥ (1−λ)·β. Combining these two inequalities with d_A(v) − d_B(v) ≥ D yields d_B(u) − d_A(u) ≥ D − λ·β, completing the proof for vertices in T.

For the second part, since w ∈ T′, there exists an edge (u,w) ∈ B ∖ A such that u ∈ T. Since (u,w) does not belong to A, by Property (P2) of the EDCS A we have d_A(u) + d_A(w) ≥ (1−λ)·β. Moreover, since (u,w) belongs to B, by Property (P1) of the EDCS B, we have d_B(u) + d_B(w) ≤ β. This means that d_A(w) − d_B(w) ≥ (d_B(u) − d_A(u)) − λ·β, which is at least D − 2λ·β by the first part.

Notice that since D ≥ 2·λ^{1/2}·β > λ·β, by Claim 4.3, for any vertex u ∈ T we have d_B(u) > d_A(u) and hence T ∩ S = ∅ (similarly, T ∩ T′ = ∅, but S and T′ may intersect). We define the set S′ in the lemma statement to be S ∪ T′. The bound on the degrees of the vertices in S′ follows immediately from Claim 4.3 (recall that the vertices in S already satisfy the degree requirement for the set S′). In the following, we show that |T′ ∖ S| ≥ 2·λ^{1/2}·|S|, which finalizes the proof.

Recall that E_H(U) and E_H(U, W) denote the set of edges of a subgraph H incident on the vertices in U, and between the vertices in U and W, respectively. We have,

|E_{B∖A}(T, T′∖S)| = |E_{B∖A}(T)| − |E_{B∖A}(T, S)|   (as all the edges in B∖A that are incident on T go to T′)
≥ |E_{A∖B}(T)| − |E_{B∖A}(T, S)|   (as by Claim 4.3, each vertex of T has larger degree in B∖A than in A∖B)
≥ |E_{A∖B}(S)| − |E_{B∖A}(S)|   (as all edges in A∖B incident on S are also incident on T)
≥ |S| · D   (by the assumption on the degrees of the vertices in S in the subgraphs A and B)

Finally, since B is an EDCS, the maximum degree of any vertex in B is at most β, and hence there should be at least |S|·D/β ≥ 2·λ^{1/2}·|S| vertices in T′ ∖ S (as D ≥ 2·λ^{1/2}·β).

###### Proof of Lemma 4.1.

Suppose towards a contradiction that there exists a vertex v with d_A(v) − d_B(v) ≥ c·log n·λ^{1/2}·β for a sufficiently large constant c (the other case is symmetric). Let S_0 := {v} and D_0 := c·log n·λ^{1/2}·β, and for t = 1 to T := ⌈ln n/λ^{1/2}⌉ + 1: define the set S_t and the integer D_t by applying Lemma 4.2 to S_{t−1} and D_{t−1} (i.e., S_t := S′ and D_t := D_{t−1} − 2λ·β). By the lower bound on the value of D_0, for any t ≤ T, we have that D_t ≥ D_0 − 2λ·β·T ≥ D_0/2 ≥ 2·λ^{1/2}·β, and hence we can indeed apply Lemma 4.2. As a result, we have,

|S_T| ≥ (1 + 2·λ^{1/2})·|S_{T−1}| ≥ (1 + 2·λ^{1/2})^T·|S_0| ≥ exp(λ^{1/2}·T) > exp(ln n) = n,

which is a contradiction, as there are only n vertices in the graph G. Consequently, we obtain that for any vertex v ∈ V, |d_A(v) − d_B(v)| = O(log n)·λ^{1/2}·β, finalizing the proof.

### 4.2 EDCS in Sampled Subgraphs

In this section, we prove two lemmas regarding the structure of different EDCS across sampled subgraphs. The first lemma concerns edge sampled subgraphs. We show that the degree distributions of any two EDCS for two different edge sampled subgraphs of G are almost the same, no matter how the two EDCS are selected, and even if the choices of the two subgraphs are not independent.

###### Lemma 4.4 (EDCS in Edge Sampled Subgraphs).

Fix any graph G(V, E) and p ∈ (0,1). Let G_1 and G_2 be two edge sampled subgraphs of G with probability p (chosen not necessarily independently). Let H_1 and H_2 be arbitrary EDCSs of G_1 and G_2, respectively, with parameters (β, (1−λ)·β). Suppose β is sufficiently large (as a function of λ and log n); then, with high probability, simultaneously for all v ∈ V:

|d_{H_1}(v) − d_{H_2}(v)| ≤ O(log n) · λ^{1/2} · β.

We also prove a qualitatively similar lemma for vertex sampled subgraphs. The main difference here is that there will be a huge gap between the degrees of a vertex in the two EDCS if the vertex is sampled in one subgraph but not in the other. However, we show that the degrees of the vertices that are sampled in both subgraphs are almost the same across the two different (and arbitrarily chosen) EDCS for the subgraphs.

###### Lemma 4.5 (EDCS in Vertex Sampled Subgraphs).

Fix any graph G(V, E) and p ∈ (0,1). Let G_1 and G_2 be two vertex sampled subgraphs of G with probability p (chosen not necessarily independently). Let H_1 and H_2 be arbitrary EDCSs of G_1 and G_2 with parameters