# A 2/3-Approximation Algorithm for Vertex-weighted Matching

We consider the maximum vertex-weighted matching problem (MVM) for non-bipartite graphs. In earlier work we described a 2/3-approximation algorithm for MVM on bipartite graphs (Dobrian, Halappanavar, Pothen and Al-Herz, SIAM J. Scientific Computing, 2019). Here we show that a 2/3-approximation algorithm for MVM on non-bipartite graphs can be obtained by restricting the length of augmenting paths to at most three. The algorithm has time complexity $O(m \log \Delta + n \log n)$, where $n$ is the number of vertices, $m$ is the number of edges, and $\Delta$ is the maximum degree of a vertex. The approximation ratio of the algorithm is obtained by considering failed vertices, i.e., vertices that the approximation algorithm fails to match but the exact algorithm does. We show that there are two distinct heavier matched vertices that we can charge each failed vertex to. Our proof techniques characterize the structure of augmenting paths in a novel way. We have implemented the 2/3-approximation algorithm and show that it runs in under a minute on graphs with tens of millions of vertices and hundreds of millions of edges. We compare its performance with five other algorithms: an exact algorithm for MVM, an exact algorithm for the maximum edge-weighted matching (MEM) problem, and three approximation algorithms. In our test set of nineteen problems, there are graphs on which the exact algorithms fail to terminate in 100 hours. The new 2/3-approximation algorithm for MVM outperforms the other approximation algorithms by either being faster (often by orders of magnitude) or obtaining better weights.

## 1 Introduction

We consider a variant of the matching problem in non-bipartite graphs in which weights are assigned to the vertices of a graph, the weight of a matching is the sum of the weights of the matched vertices, and we find a matching of maximum weight. We call this the maximum vertex-weighted matching problem (MVM). In this paper we describe a 2/3-approximation algorithm for the MVM that has time complexity $O(m \log \Delta + n \log n)$, where $n$ is the number of vertices, $m$ is the number of edges, and $\Delta$ is the maximum degree of a vertex. We implement this algorithm as well as a few other approximation algorithms for this problem, and show that the 2/3-approximation algorithm runs fast on large graphs, obtains weights that are close to optimal, and is faster or obtains greater weights than the other algorithms on our test set.

Consider an undirected vertex-weighted graph $G$ with $n$ vertices, $m$ edges, and a non-negative weight function on the vertices. The MVM problem can be solved in polynomial time by an exact algorithm [spencer1]. We have designed and implemented a different exact algorithm [Dobrian+:VWM], since it is easier to implement, and it is well known that the practical performance of a matching algorithm does not necessarily correlate with its worst-case time complexity. We show that this exact algorithm can be slow for many large graphs with millions of vertices and edges, and can even fail to terminate in 100 hours. Thus, there is a need for faster approximation algorithms that can return a matching with a guaranteed fraction of the maximum weight.

Many linear-time approximation algorithms have been designed for the maximum edge-weighted matching problem (MEM), but we are not aware of earlier approximation algorithms for the MVM problem on non-bipartite graphs. We have designed and implemented a 2/3-approximation algorithm for MVM in bipartite graphs [Dobrian+:VWM]. The MVM problem arises in applications such as the design of network switches [tabatabaee1], schedules for training of astronauts [bell1], and computation of sparse bases for the null space or the column space of a rectangular matrix [coleman3, pinar1, pothen3].

The MVM problem can be transformed to a maximum edge-weighted matching problem (MEM) by assigning each edge a weight obtained by summing the weights at its endpoints. Hence algorithms for the MEM can be used to solve MVM problems. However, we have shown that this transformation can lead to an increase in run times for an exact algorithm by three orders of magnitude or more [Dobrian+:VWM]. A simpler and more efficient exact algorithm is obtained by solving the MVM problem directly by processing vertices in non-increasing order of weights and then matching an unmatched vertex to a heaviest unmatched vertex it can reach by augmenting paths. In this sense the MVM problem is more similar to maximum cardinality matching than MEM. When we consider approximation algorithms, restricting augmenting paths to length three does not lead to a 2/3-approximation algorithm for the MEM; however, $(2/3 - \epsilon)$-approximation algorithms are available [pettie2004simpler]. The first 2/3-approximation algorithm for MVM on bipartite graphs was proposed by us and our coauthors [Dobrian+:VWM]. The idea is to decompose the problem into two ‘one-side-weighted’ problems, solve them individually by restricting the length of augmenting paths to at most three, and then combine the two matchings into a final matching by invoking the Mendelsohn-Dulmage theorem [Mendelsohn+:theorem]. However, the proof technique used for bipartite graphs cannot be extended to non-bipartite graphs because the Mendelsohn-Dulmage theorem applies only to the former. While the algorithm for non-bipartite graphs is simple to state, the proof that its approximation ratio is 2/3 requires several new concepts and involves a careful study of the structure of augmenting paths.
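The transformation from MVM to MEM described at the start of this paragraph can be illustrated with a minimal Python sketch. The function names and the small path graph are hypothetical and chosen only for illustration; they are not taken from the implementation evaluated in this paper.

```python
def vertex_to_edge_weights(edges, vertex_weight):
    """Give each edge (u, v) the weight vertex_weight[u] + vertex_weight[v]."""
    return {(u, v): vertex_weight[u] + vertex_weight[v] for u, v in edges}


def matching_vertex_weight(matching, vertex_weight):
    """Vertex weight of a matching: sum of the weights of its matched vertices."""
    return sum(vertex_weight[u] + vertex_weight[v] for u, v in matching)


# Hypothetical path graph a - b - c - d with vertex weights.
vertex_weight = {"a": 5, "b": 3, "c": 2, "d": 1}
edges = [("a", "b"), ("b", "c"), ("c", "d")]

edge_weight = vertex_to_edge_weights(edges, vertex_weight)
matching = [("a", "b"), ("c", "d")]

# The edge weight of the matching equals its vertex weight.
mem_weight = sum(edge_weight[e] for e in matching)
mvm_weight = matching_vertex_weight(matching, vertex_weight)
```

Because every matched vertex lies on exactly one matched edge, the edge weight of any matching in the transformed graph equals its vertex weight in the original graph.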

## 2 Background and Related Work

### 2.1 Background on Matchings

We define the basic terms we use here; additional background on matchings is available in several books, such as [Schrijver:book]. The endpoints of an edge $(u, v)$ are the vertices $u$ and $v$. A matching $M$ in a graph is a set of edges that do not share a common endpoint; hence at most one edge from $M$ is incident on each vertex in the graph. An edge $e$ is matched if $e \in M$, and otherwise it is unmatched. A vertex is matched if it is an endpoint of a matched edge, and otherwise it is unmatched. We also refer to a heaviest unmatched neighbor of a vertex; note that this neighbor might not be unique, but its weight is.

A path in $G$ is a finite sequence of distinct vertices in which consecutive vertices are joined by an edge. The length of a path is the number of edges in the path. A cycle is a path such that the first and the last vertex are the same. An alternating path with respect to $M$ is a path whose edges alternate between $M$-matched and $M$-unmatched edges. If the first and last vertices on an $M$-alternating path are unmatched, then it is an $M$-augmenting path $P$, which necessarily has an odd number of edges. The matching $M$ can be augmented by matching the edges in the symmetric difference $M \oplus P$.
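Augmentation by symmetric difference can be sketched in a few lines of Python. The representation (a matching as a set of `frozenset` edges, a path as a list of vertices) and the function name `augment` are assumptions made for this illustration only.

```python
def augment(matching, path):
    """Return the symmetric difference of a matching and the edges of a path.

    matching: set of frozenset({u, v}) edges.
    path: list of vertices forming an augmenting path.
    """
    path_edges = {frozenset(edge) for edge in zip(path, path[1:])}
    return matching ^ path_edges  # set symmetric difference


# Hypothetical example: edge (b, c) is matched; a and d are unmatched.
M = {frozenset(("b", "c"))}
P = ["a", "b", "c", "d"]  # augmenting path of length three
M_new = augment(M, P)
```

On an augmenting path the unmatched edges outnumber the matched edges by one, so the augmented matching has exactly one more edge than the original.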

When an augmentation is performed, we distinguish between the origin (the vertex from which an augmenting path search is initiated) and the terminus (the vertex at which the augmenting path search ends). We will denote the origin of the $i$-th augmentation step by $o_i$, and the terminus by $t_i$. When a vertex is not explicitly denoted by $o_i$ or $t_i$, then it could be either an origin or a terminus, or neither, unless mentioned otherwise. Note that if we begin with the empty matching, then for each matched edge we need one augmentation step, so that the number of matched edges is equal to the number of origins or termini. We will say that the $i$-th origin and the $i$-th terminus correspond to each other, so that $t_i$ is the corresponding vertex of the vertex $o_i$. An alternating path with respect to two matchings is a path whose edges alternate between edges matched in the first matching and edges matched in the second.

### 2.2 Related Work

We now describe more fully the work we have done earlier with our colleagues on exact algorithms for MVM on general graphs, and a 2/3-approximation algorithm for this problem on bipartite graphs [Dobrian+:VWM]. When vertex weights are non-negative, we can choose a maximum vertex-weighted matching to be one of maximum cardinality, and from now on we assume that this choice has been made.

For a matching $M$, an $M$-reversing path is an alternating path with an even number of edges consisting of an equal number of matched and unmatched edges. An $M$-increasing path is an $M$-reversing path whose unmatched endpoint has higher weight than its matched endpoint. By switching the matched and unmatched edges on this path, we can increase the weight of the matching, much as we would using an augmenting path. There are two ways of characterizing maximum vertex-weighted matchings in a general graph. The first is that a matching $M$ is an MVM if and only if there is neither an $M$-augmenting path nor an $M$-increasing path in the graph. The second is to list the weights of matched vertices in non-increasing order in a vector (this is the weight vector of the matching). Then a matching is an MVM if and only if its weight vector is lexicographically maximum among all the weight vectors of matchings.
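The weight-vector characterization can be made concrete with a short Python sketch. The function name `weight_vector` and the three-vertex example are hypothetical; the comparison relies on Python's built-in lexicographic ordering of lists.

```python
def weight_vector(matching, weight):
    """Weights of the matched vertices, listed in non-increasing order."""
    matched = [v for edge in matching for v in edge]
    return sorted((weight[v] for v in matched), reverse=True)


# Hypothetical path graph a - b - c: only one of its two edges can be matched.
weight = {"a": 5, "b": 3, "c": 4}
wv_ab = weight_vector([("a", "b")], weight)
wv_bc = weight_vector([("b", "c")], weight)

# Python compares lists lexicographically, so this picks the better matching.
ab_is_better = wv_ab > wv_bc
```

Here the matching containing the edge (a, b) has the lexicographically larger weight vector, and it is also the matching of larger vertex weight.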

These two characterizations lead to two extreme algorithms for computing an MVM. The first begins with the empty matching, and at each step matches a currently heaviest unmatched vertex to a heaviest unmatched vertex it can reach by an augmenting path. In this algorithm once a vertex is matched, it will always remain matched, since augmentation does not change a matched vertex to an unmatched vertex.

An algorithm with an approximation ratio of 2/3 for MVM on bipartite graphs was designed and implemented in [Dobrian+:VWM].

The second exact algorithm for solving the MVM problem begins with a maximum cardinality matching. It then looks for increasing paths or cycles with respect to the current matching and terminates when there are none such. This second, speculative, algorithm has the advantage that it has more concurrency whereas the first algorithm has to process vertices in a specified order. We will discuss the speculative algorithm in future work.

Now we turn to approximation algorithms that have been designed for the maximum edge-weighted matching problem (MEM). The well-known Greedy algorithm [avis1983survey] iteratively adds a heaviest edge to the matching, and deletes all edges incident on the endpoints of the added edge. This algorithm is 1/2-approximate and requires $O(m \log n)$ time. Another 1/2-approximation algorithm, the Locally Dominant edge algorithm [preis1999linear], avoids sorting the edges by choosing locally dominant edges (an edge that is the heaviest edge incident on both of its endpoints) to add to the matching. A more recent 1/2-approximation algorithm is the Suitor algorithm [manne2014new], which employs a proposal-based approach similar to the classical algorithms for stable matching. The Suitor and other algorithms have been extended to find 1/2-approximate $b$-Matchings [Khan1]. Other papers improve the performance ratios: for any fixed $\epsilon > 0$, $(2/3 - \epsilon)$- and $(3/4 - \epsilon)$-approximation algorithms have been proposed [drake2003improved, DuanP10b, Hanke2010, pettie2004simpler, Maue+:matching]. Furthermore, a $(1 - \epsilon)$-approximation algorithm based on a scaling approach has been proposed by Duan and Pettie [DuanP-approxMWM], which has time complexity $O(m \epsilon^{-1} \log \epsilon^{-1})$. We show that the $(2/3 - \epsilon)$-approximation algorithm when applied to the MVM problem is significantly slower than the 2/3- and 1/2-approximation algorithms, and surprisingly, does not compute greater matching weights for relevant values of $\epsilon$. The MEM problem appears in applications such as placing large elements on the diagonal of sparse matrices [duff1, duff2], multilevel graph partitioning [karypis1], scheduling, etc.
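The Greedy algorithm for MEM described at the start of this paragraph can be sketched as follows. This is a minimal illustration, not the code used in the experiments; the function name and the example graph are hypothetical, and skipping edges with a matched endpoint stands in for explicitly deleting them.

```python
def greedy_matching(edge_weight):
    """Greedy matching for edge weights: scan edges from heaviest to lightest.

    edge_weight: dict mapping (u, v) tuples to weights.
    Returns the list of chosen edges.
    """
    matched = set()
    matching = []
    by_weight = sorted(edge_weight.items(), key=lambda item: item[1], reverse=True)
    for (u, v), _ in by_weight:
        # An edge is still available only if both endpoints are unmatched.
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


# Hypothetical path a - b - c - d where the middle edge is the heaviest.
edge_weight = {("a", "b"): 2, ("b", "c"): 3, ("c", "d"): 2}
greedy = greedy_matching(edge_weight)
greedy_weight = sum(edge_weight[e] for e in greedy)
```

In this example Greedy picks only the middle edge (weight 3), while the optimal matching takes the two outer edges (weight 4), which is consistent with the 1/2-approximation guarantee known for Greedy.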

## 3 A Two-third Approximation Algorithm for MVM

In this section we describe a 2/3-approximation algorithm for MVM, discuss its relation to an exact algorithm, and then prove its correctness.

### 3.1 Exact and 2/3-Approximation Algorithms

The approximation algorithm, described in Algorithm 1, sorts the vertices in non-increasing order of weights, and inserts the sorted vertices into a queue $Q$. The algorithm begins with the empty matching, and attempts to match the vertices in $Q$ in the given order. Each unmatched vertex $u$ is removed from $Q$, and beginning at $u$ the algorithm searches for a heaviest unmatched vertex $v$ reachable by an augmenting path of length at most three. If such an augmenting path is found, then the matching is augmented by the path that leads to a heaviest unmatched vertex, and the vertex $v$ is also removed from $Q$. If no augmenting path of length at most three is found, we search from the next heaviest unmatched vertex (even though longer augmenting paths might exist in the graph). The algorithm terminates when all vertices are processed.
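The procedure just described can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Algorithm 1 as implemented by the authors: the function name, the `mate` dictionary, and the example graph are hypothetical, and the sketch scans all candidate paths directly rather than using sorted adjacency lists, so it does not attain the time complexity claimed for the algorithm.

```python
def two_thirds_matching(adj, weight):
    """Match vertices in non-increasing weight order using augmenting paths
    of length at most three.

    adj: dict mapping each vertex to a list of neighbors (undirected, no loops).
    weight: dict mapping each vertex to a non-negative weight.
    Returns mate, where mate[v] is the partner of v or None if v is unmatched.
    """
    mate = {v: None for v in adj}
    for u in sorted(adj, key=lambda v: weight[v], reverse=True):
        if mate[u] is not None:
            continue
        best_weight, best_path = None, None
        for v in adj[u]:
            if mate[v] is None:
                candidates = [[u, v]]  # augmenting path of length one
            else:
                x = mate[v]
                # Augmenting paths of length three: u - v = x - t, t unmatched.
                candidates = [[u, v, x, t] for t in adj[x]
                              if t != u and mate[t] is None]
            for path in candidates:
                terminus_weight = weight[path[-1]]
                if best_weight is None or terminus_weight > best_weight:
                    best_weight, best_path = terminus_weight, path
        if best_path is not None:
            # Augment: match vertex pairs (0, 1) and, if present, (2, 3).
            for i in range(0, len(best_path), 2):
                a, b = best_path[i], best_path[i + 1]
                mate[a], mate[b] = b, a
    return mate


# Hypothetical example: edges a-b, a-c, b-d.
adj = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
weight = {"a": 10, "b": 9, "c": 8, "d": 1}
mate = two_thirds_matching(adj, weight)
```

In this example vertex a is first matched to b; when c is processed, the path c, a, b, d of length three is augmented, leaving a matched to c and b matched to d.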

The 2/3-approximation algorithm may be viewed as one obtained from an exact algorithm for MVM. In the exact algorithm for MVM, at each step we search from a currently heaviest unmatched vertex for a heaviest unmatched vertex reachable by an augmenting path of any length. If an augmenting path is found, we choose the path that leads to a heaviest unmatched vertex, and then augment by this path. If no augmenting path is found, we search from the next heaviest unmatched vertex. This algorithm was proved correct by Dobrian, Halappanavar, Pothen and Al-Herz in [Dobrian+:VWM], where its time complexity is also analyzed.

Consider running the Exact algorithm and the 2/3-approximation algorithm simultaneously using the vertices in the same queue. Both consider vertices in non-increasing order of weights, and break ties among weights consistently. If a vertex is matched by the exact algorithm but not by the approximation algorithm (because the augmenting path is longer than three), then we call it a failure or a failed vertex, because the approximation algorithm failed to match it while the exact algorithm succeeded.

### 3.2 Time Complexity of the Two-Thirds Approximation Algorithm

###### Theorem 3.1

The time complexity of the Two-Thirds approximation algorithm is $O(m \log \Delta + n \log n)$, where $\Delta$ is the maximum degree.

Proof: We sort the adjacency list of each vertex in non-increasing order of weights, and maintain a pointer to a heaviest unmatched neighbor of each vertex. Since the adjacency list is sorted, each list is searched once from highest to lowest weight in the algorithm.

Let $d(u)$ denote the number of neighbors of a vertex $u$. In each iteration of the while loop, we choose an unmatched vertex $u$ and examine all of its neighbors to find a heaviest unmatched neighbor, if one exists. If $u$ has a matched neighbor, then we form an augmenting path of length three by taking the matched edge incident on that neighbor, and finding a heaviest unmatched neighbor of its mate. All neighbors of $u$, unmatched and matched, can be found in $O(d(u))$ time, and finding the mate and a heaviest unmatched neighbor of the mate can be done in constant time, since the adjacency lists are sorted. Thus the search for augmenting paths in the algorithm takes $O(m)$ time. Sorting the adjacency lists takes time proportional to

$$\sum_{u} d(u) \log d(u) \le \sum_{u} d(u) \log \Delta = 2m \log \Delta.$$

Sorting the vertices in non-increasing order of weights takes $O(n \log n)$ time.

### 3.3 Correctness of the Algorithm

In this subsection we will prove (Theorem 3.7) that Algorithm 1 computes a -approximate MVM, . Let denote the sum of the weights of the failures, the weight of the approximate matching, and the weight of an optimal matching. In order to prove the theorem, it suffices to prove that , since .

To prove that , we show that for every failure there are two distinct vertices that are matched in , with weight at least as heavy as the failure. This is achieved in Lemma 3.6 by considering -alternating paths, using a charging technique in which each failure charges two distinct vertices matched in . Each failure is an endpoint of the -alternating path. The two distinct vertices are obtained as the corresponding vertices (the other ends of the augmenting paths) of two of the first three vertices on the -alternating path.

We prove the approximation ratio by means of several lemmas. The key Lemma 3.6 is proved using Lemmas 3.2, 3.3, 3.4, and 3.5. We begin by proving each of the latter lemmas.

###### Lemma 3.2

Let be a matched edge in a matching at some step in the -approximation algorithm, and let be a heaviest unmatched neighbor of . Suppose is changed to a matched edge in a future augmentation step, and let denote a heaviest unmatched neighbor of , then .

Proof: The proof is by induction on , the number of augmentation steps that include on the augmenting path. Let be the matched neighbor of after augmentation steps involving , and let be its heaviest unmatched neighbor . There are two possible augmentation steps that include the matched edge . (1) , and (2) , where () is the origin (terminus) of the augmenting path.

For the base case, ), consider Figure 1. If the augmentation path is , clearly , since the algorithm processes vertices in non-increasing order of weights. If the augmenting path is , then because was matched in preference to .

Assume the claim is true for augmentation steps. By using the same argument as in the base case we have at the -st augmentation step. Now by the inductive hypothesis we have , and by combining the two inequalities, we obtain .

###### Lemma 3.3

Let denote the -approximate matching at the failure , and let be an alternating path that begins with in .
(1) If is an origin of some prior augmentation step, then .
(2) .

Proof: (1) If is an origin , then we have , because was matched in preference to .

(2) In this case, we have to consider three possibilities.
(a) The vertex is an origin, in which case , since was processed before .
(b) The vertex is a terminus that is matched by an augmenting path that includes . An example of this case is shown in Figure 2. In this case we have two possibilities: either is an origin and is the corresponding terminus, or is previously matched in which case we have an augmenting path . In both possibilities was matched in preference to , so .
(c) The vertex is a terminus that is matched by an augmenting path that includes a vertex , where is adjacent to . An example of this case is shown in Figure 3. Let be a heaviest unmatched neighbor of after is matched. In this case, again we have two possibilities: is an origin and is the corresponding terminus, or is previously matched in which case we have an augmenting path . In both possibilities was matched in preference to so we have . By Lemma 3.2 when the matched edge is changed to the matched edge , we have . By combining these two inequalities, we obtain .

Figure 2: Lemma 3.3 Case (b): $v_2$ is a terminus that is matched by an augmenting path that includes $v_1$.

Figure 3: Lemma 3.3 Case (c): $v_2$ is a terminus that is matched by an augmenting path that includes $u \neq v_1$.

###### Lemma 3.4

Let denote the -approximate matching at the failure , and let be an -alternating path that begins with . If the vertex is an origin of some prior augmentation step in the Approximation algorithm, and if , then 1) immediately prior to the step when the Approximation algorithm matches the vertex , the vertex is matched to a vertex , and is a cycle.
2) the -th augmenting path is .

Proof: 1) First we will establish that is matched to some vertex prior to the step when is matched. To obtain a contradiction, assume that is not matched to some vertex prior to the step of matching . Then after is matched, the terminus is either or a vertex that is matched in preference to . In both possibilities we have . We know from Lemma 3.3 that . Combining the two inequalities, we have , which contradicts the assumption in the Lemma.

Now we show that the vertex . Assume for a contradiction that , then at the step of matching there exists an augmenting path from to of length three. After we match , we have , since it was matched in preference to . This again contradicts the assumption in the Lemma.

Now we show that is a cycle by showing that . Assume and let some vertex , as shown in Figure 4. Note that by Lemma 3.2 we have (A), since we know the matching edge is changed to . Also, immediately prior to the step when is matched, there exists an augmenting path of length three from to . So after we match , is either or a vertex that is matched in preference to , so (B). Combining (A) and (B) we get . Thus, . Hence is a cycle since we have established the existence of the edge (the existence of the other two edges of the cycle were established earlier).

2) We establish this result by contradiction as well. Suppose the augmenting path is not . Then we have two cases:
Case 1: The augmenting path is as shown in Figure 5. In this case there must exist an unmatched vertex adjacent to , since after matching the edge it must be changed to by an augmenting path of length three. After matching , assume without loss of generality that becomes . After the augmentation step, we have (A), since there existed an augmenting path from to when was matched. Also, was matched in this step, and it must be changed to the matched edge . By Lemma 3.2 we have (B). Combining (A) and (B), we obtain . Again we have a contradiction of the condition of the Lemma.

Figure 5: Lemma 3.4, (2) Case 1: The augmentation step is $\{o_i, v_2, u, t_i\}$.

Case 2: The augmentation step does not include the edge as shown in Figure 6. In this case there must exist an unmatched vertex adjacent to since the matched edge must be changed to by an augmenting path of length three. After matching , assume without loss of generality that becomes . After the augmentation step, we have (A), since there existed an augmenting path from to . Note that is still matched and must be changed to . By Lemma 3.2 (B). Again, combining (A) and (B), we obtain .

In both cases we obtain , a contradiction to the condition of the Lemma. Therefore, the -th augmentation step must be .

Figure 6: Lemma 3.4 (2) Case 2: the augmentation step does not include the edge $(v_2, u)$.

###### Lemma 3.5

Consider the symmetric difference , corresponding to the -approximate matching at the -th failure. Let be an -alternating path, then the alternating subpath will not change in future augmentation steps of the approximation algorithm.

Proof: Assume for the sake of contradiction that after is determined to be a failure, the edge is changed by a future augmenting path of length three, say , as shown in Figure 7. Then, the augmenting path must exist when was determined as a failure, and in this case could not have been a failure. Hence the matched edge in the approximate matching cannot be changed in future augmentations.

Figure 7: Lemma 3.5: Augmenting the path $\{u, v_1, v_2, q\}$ after $f_x$ is determined to be a failure.

###### Lemma 3.6

Consider the symmetric difference , where is the matching computed by the -Approximation algorithm. For every failure there are two distinct matched vertices in that are at least as heavy as .

Proof: First run the approximation algorithm and at the -th augmentation step label the origin by and the terminus by . Recall that we denote as the corresponding vertex of , and vice versa. Consider the symmetric difference between and which results in alternating paths and cycles. We can ignore alternating cycles since every vertex in a cycle is matched in both and . Since failures are matched by the optimal matching but not the approximate matching, they are at the ends of alternating paths.

By Lemma 3.5 the first four vertices of an alternating path beginning with a failure do not change, which makes it possible to identify the origins and termini which are used to construct the alternating path. We will number each failure in the order that it was discovered in the approximation algorithm. A failure could be an end of an alternating path which has one failure or two failures. We will consider these two types of alternating paths in the following.

First consider an alternating path with one failure, and denote the path as . We charge two distinct vertices for as follows:
If the vertex is a terminus, then charge the corresponding origin, which must be at least as heavy as the failure since it was processed before . If is an origin then charge the corresponding terminus, which by Lemma 3.3 (1) must be at least as heavy as .
If the vertex is a terminus, then charge the corresponding origin which must be at least as heavy as the failure since it was processed before . If is an origin, and the corresponding terminus is at least as heavy as , then charge the corresponding terminus. If the corresponding terminus is strictly lighter than , then by Lemma 3.4 we have immediately prior to the step in which is matched, the vertex is matched to some vertex , such that is a cycle, as shown in Figure 8. In this case we consider instead of to find a vertex to charge. If the vertex is a terminus (in a prior augmentation step), then charge the corresponding origin which must be at least as heavy as , since it was processed before the latter. If is an origin in the prior augmentation step, then charge the corresponding terminus which must be at least as heavy as since it was matched in preference to which is an origin.

Figure 8: Lemma 3.6, i−2: The corresponding terminus is strictly lighter than the failure $f_x$.

Now we consider an alternating path with two failures and as its endpoints. We assume without loss of generality that .
For the failure we charge two distinct vertices as we did in Part of this Lemma. Now we consider charging for the failure . If the length of the alternating path is at least seven edges, then we can label two alternating subpaths and , and these do not overlap. Hence we can charge two distinct vertices for as we did in Part of the Lemma.

If the length of the alternating path is five then and overlap. Thus , and . So, we charge one vertex for as we did in and we will charge the other distinct vertex as follows.
Case 1: If charged the corresponding vertex of then must charge the corresponding vertex of . Referring to , the vertex charged the corresponding vertex of because must be an origin and the corresponding terminus is strictly lighter than . Let the origin be denoted by , and the corresponding terminus be , for some augmentation step . By Lemma  3.4 we have (1) at the step of matching but before it is matched, is matched to some , where , and is a cycle; (2) the augmenting path is .

We will show that , and thus can be charged to . We consider two subcases:
Subcase 1: is adjacent to , as shown in Figure 9. Note that , since at the step of matching there existed an augmenting path from to .

Subcase 2: The failure is not adjacent to as shown in Figure 10. Note there must exist some unmatched vertex that is adjacent to because after augmenting by the path the matched edge must be changed to , which can be done with an augmenting path of length three. After the augmentation step, we have (A), because there existed an augmenting path from to . After is matched, assume without loss of generality that . By Lemma 3.2, after is changed to we have (B). Combining (A) and (B) we obtain .

Figure 10: Lemma 3.6, Case 1, Subcase 2: The failure $f_y$ is not adjacent to $u$.

Case 2: If charged the corresponding vertex of , then must charge the corresponding vertex of . We will show that the corresponding vertex of is at least as heavy as . Suppose that the corresponding vertex is strictly lighter than which is true if it is a terminus, say in the th augmenting step. By Lemma 3.4 we have (1) at the step when the vertex is matched but prior to matching it, the vertex is matched to some , with , such that is a cycle; and (2) the augmenting path is . By symmetry and using the same argument as in Case 1 we get . Since by assumption we have , it follows that .

Note that each matched vertex has a unique corresponding vertex, since once they (the vertex and its corresponding vertex) are matched they will not be unmatched. So, to charge a vertex twice, a vertex must be considered by two failures (and the corresponding vertex of must be charged twice). But two failures cannot consider the same vertex. For two failures in different alternating paths, it is not possible since the alternating paths are vertex disjoint. For two failures in the same alternating path, by our charging method, no two failures consider the same vertex for charging purposes.

###### Theorem 3.7

Algorithm 1 computes a 2/3-approximation for the MVM problem.

Proof: Let be the matching computed by the approximation algorithm, and be a matching of maximum vertex weight. Consider all paths in the symmetric difference between and . Let denote the sum of weights of all the failures, let denote the weight of the maximum-weighted matching, and let denote the weight of the approximate matching. Then, , and we know from Lemma 3.6 that since for every failure we have two distinct vertices that are at least as heavy as the failures. Hence . Thus we have . This completes the proof.
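The chain of inequalities behind this proof can be written out explicitly. The symbols $w(F)$, $w(M_A)$, and $w(M_{\mathrm{opt}})$ are assumed labels, chosen here for illustration, for the weight of the failures, the weight of the approximate matching, and the weight of an optimal matching. The first inequality assumes, following the definition of failures, that every vertex matched by the exact algorithm is either matched by the approximation algorithm or is a failure.

$$
\begin{aligned}
w(M_{\mathrm{opt}}) &\le w(M_A) + w(F), \\
2\, w(F) &\le w(M_A) \quad \text{(Lemma 3.6)}, \\
w(M_{\mathrm{opt}}) &\le w(M_A) + \tfrac{1}{2}\, w(M_A) = \tfrac{3}{2}\, w(M_A), \\
w(M_A) &\ge \tfrac{2}{3}\, w(M_{\mathrm{opt}}).
\end{aligned}
$$

The second line uses the fact that each failure is charged to two distinct matched vertices that are at least as heavy, and that no matched vertex is charged twice.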

## 4 Experiments and Results

### 4.1 Experimental Setup and Algorithms Being Compared

We used an Intel Xeon E5-2660 processor-based system called Rice (part of the Purdue University Community Cluster) for the experiments. The machine consists of two processors, each with ten cores running at 2.6 GHz (20 cores in total), with 25 MB unified L3 cache and 64 GB of memory. The operating system is Red Hat Enterprise Linux release 6.9. All code was developed in C++ and compiled with the g++ compiler (version 4.4.7) using the -O3 flag. Our test set consists of nineteen real-world graphs taken from the University of Florida Matrix collection [FMC11] covering several application areas. Table 1 gives some statistics on our test set. The graphs are listed in increasing order of the number of vertices. The largest number of vertices of any graph is nearly 51 million, and the largest number of edges is nearly 216 million. For each graph we list the maximum and average vertex degrees and the ratio of the standard deviation of the degrees to the mean degree. The average degrees vary across the graphs, and the graphs are diverse with respect to their degree distributions. The three kron_g500 graphs of different sizes have high maximum degrees, and high ratios of the standard deviation of the degrees to the mean degree, but most problems have low values.

We compare the 2/3-approximation algorithm for MVM (we will call this the Two-thirds algorithm) with a number of other algorithms.

The Exact algorithm for MVM is similar to Algorithm 1 except that there is no restriction on the augmenting path length, and it is discussed in Section 3, and in more detail in [Dobrian+:VWM]. The complexity of the Exact algorithm we have implemented is . The Spencer and Mayr algorithm [spencer1] has time complexity, but is more complicated to implement, it is not clear if it would lead to better practical performance, and our focus in this paper is on approximation algorithms for MVM with much lower time complexity. We improved the practical performance of the Exact algorithm for MVM by two modifications: (1) If a search for an augmenting path fails, we mark all visited vertices, and when these vertices are encountered in a future search, the algorithm quits searching along those paths. (2) At the step of matching a vertex we find the heaviest unmatched vertex in the sorted list of 5 vertices such that . If an augmenting path from to a vertex is found such that , then the algorithm stops the search and augments the matching.

We have included an exact algorithm for the maximum edge-weighted matching problem (MEM) implemented in LEDA [LEDA, Mehlhorn+:Ledabook] in our comparisons. This is a primal-dual algorithm implemented with advanced priority queues and efficient dual weight updates, with time complexity  [Mehlhorn+:matching]. Since this is a commercial code, we can only run the object code. We ran it both with no initialization and with a fractional matching initialization that obtains a solution to the linear programming formulation of maximum weighted matching by ignoring the odd-set constraints (computed combinatorially), and then rounding the solution to values [Applegate+:matching]. We call these two variants LEDA1 and LEDA2, respectively.

The Greedy Half approximation algorithm for MVM (Half) matches the vertices in non-increasing order of weights, matching an unmatched vertex to a heaviest unmatched neighbor, and then deletes other edges incident on the endpoints of the matched edge. Its time complexity is  [Dobrian+:VWM].
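The greedy procedure just described can be sketched in a few lines. The following is a minimal illustration, not the implementation used in the experiments; the function name `greedyHalfMVM`, the adjacency-list representation, and the tie-breaking by vertex index are all assumptions made for the example. Deleting the other edges incident on a matched edge is done implicitly by skipping matched vertices.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper (not the authors' code).
// adj[v] lists the neighbors of v; weight[v] is the weight of vertex v.
// Returns mate[v], the partner of v, or -1 if v is left unmatched.
std::vector<int> greedyHalfMVM(const std::vector<std::vector<int>>& adj,
                               const std::vector<double>& weight) {
    assert(adj.size() == weight.size());
    const int n = static_cast<int>(adj.size());

    // Process vertices in non-increasing order of weight
    // (stable sort, so ties are broken by vertex index: an assumption).
    std::vector<int> order(n);
    std::iota(order.begin(), order.end(), 0);
    std::stable_sort(order.begin(), order.end(),
                     [&](int a, int b) { return weight[a] > weight[b]; });

    std::vector<int> mate(n, -1);
    for (int u : order) {
        if (mate[u] != -1) continue;  // u is already matched
        int best = -1;                // heaviest unmatched neighbor so far
        for (int v : adj[u]) {
            if (v == u || mate[v] != -1) continue;
            if (best == -1 || weight[v] > weight[best]) best = v;
        }
        if (best != -1) {
            // Matching u and best implicitly removes their other edges,
            // because matched vertices are skipped from now on.
            mate[u] = best;
            mate[best] = u;
        }
    }
    return mate;
}
```

On the path 0-1-2-3 with vertex weights 1, 4, 3, 2, this sketch matches vertices 1 and 2 (total weight 7) and leaves 0 and 3 unmatched, whereas the optimal matching {0-1, 2-3} has weight 10.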

We used two implementations (Random and Round-Robin) of the approximation algorithm for MEM due to Pettie and Sanders [pettie2004simpler], and Maue and Sanders [Maue+:matching] with . Before describing each implementation we will describe a 2-augmentation centered at a vertex which is an operation that is used in both implementations. We define an arm of to be either or , where is an unmatched edge, and is a matched edge. The gain of an augmentation or exchange of edges is the increase in weight obtained by the transformation. There are two cases:
Case 1) is unmatched: find an arm of with the highest positive gain.
Case 2) is matched to a vertex : find the highest positive gain by checking the gains of the following paths or cycles:
(1) Alternating cycles of length four that include the edge .
(2) Alternating paths of length at most four, which is done as follows: Find two vertex disjoint arms of , with the highest gains and , then find an arm of with highest gain . If and are vertex disjoint then is a highest gain alternating path; otherwise choose as a highest gain alternating path.
There are two implementations of this algorithm. The Random implementation chooses a random vertex and performs a 2-augmentation centered at with the highest-gain. This is repeated times. The Round-Robin implementation randomly permutes the order of vertices, and for each vertex in the permuted order performs 2-augmentation with the highest-gain centered at . This is repeated for phases. If no further improvement can be achieved after finishing a phase then the algorithm quits. The algorithm can be initialized with the -approximation algorithm called the Global Paths algorithm (GPA) [Maue+:matching], which sorts the edges in non-increasing order of their weights. It constructs sets of paths and cycles of even length by considering the edges in non-increasing order of their weights. Then it computes a maximum weight matching for each path and cycle by dynamic programming, and it deletes the matched edges and their adjacent edges. The algorithm repeats until all edges are deleted. The time complexity of the GPA algorithm is , and that of the Round-robin -approximation algorithm is . Maue and Sanders [Maue+:matching] have reported that the Round-robin implementation with GPA initialization computed heavier matchings than the other three variants albeit at the expense of higher running times; we have obtained similar results, and find that the Round-robin implementation with no initialization was the fastest among the four variants. Hence we report results from these two variants, called RR and GPA-RR, respectively.
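The dynamic programming step that GPA applies to each path can be illustrated as follows. This is a sketch under stated assumptions, not the code of Maue and Sanders: the function name `maxWeightMatchingOnPath` and the representation of a path as a list of consecutive edge weights are hypothetical, and only paths are handled (cycles are not covered here).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not the GPA code).
// w[i] is the weight of the i-th edge along a path; consecutive edges
// share a vertex, so a matching may not contain two consecutive edges.
// Returns the indices of the edges in a maximum weight matching.
std::vector<int> maxWeightMatchingOnPath(const std::vector<double>& w) {
    const int k = static_cast<int>(w.size());
    // best[i] = maximum matching weight using only the first i edges.
    std::vector<double> best(k + 1, 0.0);
    for (int i = 1; i <= k; ++i) {
        const double take = w[i - 1] + (i >= 2 ? best[i - 2] : 0.0);
        best[i] = std::max(best[i - 1], take);
    }
    // Backtrack to recover which edges were taken.
    std::vector<int> chosen;
    int i = k;
    while (i >= 1) {
        const double take = w[i - 1] + (i >= 2 ? best[i - 2] : 0.0);
        if (take > best[i - 1]) {
            chosen.push_back(i - 1);  // edge i-1 is in the matching
            i -= 2;                   // its predecessor cannot be
        } else {
            i -= 1;
        }
    }
    std::reverse(chosen.begin(), chosen.end());
    return chosen;
}
```

For edge weights 3, 5, 4 along a path, the sketch selects the first and third edges (weight 7) rather than the single heaviest edge (weight 5).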

The final algorithm we implemented is a (1 - ε)-approximate scaling algorithm for MEM (Scaling) due to Duan and Pettie [DuanP-approxMWM], with the choice of , , and . The algorithm is based on a primal-dual formulation of the problem with relaxed feasibility and complementary slackness conditions imposed at each scale. The time complexity of the Scaling algorithm is .

In total, we have two exact algorithms, one each for MEM and MVM, and four approximation algorithms. The exact MEM algorithm and the (2/3 - ε)-approximation algorithm each have two options for initialization.

Integer weights of vertices were generated uniformly at random in the range , and real-valued weights were chosen randomly in the range . The reported results are averages over ten trials of randomly generated weights. The standard deviations for run time, weight ratio, and cardinality ratio are close to zero, so these metrics vary little across trials for each algorithm.

### 4.2 Performance of the Algorithms

In Table 2 we group the problems into three sets based on our results. In the first set, the time taken by the Exact MEM algorithm from LEDA without initialization (LEDA1), and the relative performance of the other algorithms (the ratio of the time taken by LEDA1 to the time taken by the other algorithm), are reported. Numbers greater than one indicate that the latter algorithms are faster. For the second set of problems, the LEDA algorithm with no initialization did not complete in four hours. Hence we report the time taken by LEDA2, the code with fractional matching initialization, and relative performance for the other algorithms. For the third set consisting of one problem, none of the exact algorithms completed in 100 hours, and we report the run times of the approximation algorithms.

On the first set of problems, in geometric mean, the exact algorithms LEDA2 and MVM are and times, respectively, faster than LEDA1; the Scaling algorithm and the GPA-RR algorithms are about and times faster, respectively; and the RR algorithm is times faster; the Two-thirds MVM algorithm is times faster, and the Half algorithm is times faster.
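For readers reproducing this kind of summary, the geometric mean of relative performance (baseline time divided by the time of the other algorithm, per problem) can be computed as in the sketch below. The function name `geometricMeanSpeedup` and the use of logarithms to avoid overflow are choices made for this illustration and are not taken from the paper's code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: geometric mean of the per-problem ratios
// baselineTimes[i] / otherTimes[i]. Values above one mean the other
// algorithm is faster than the baseline on (geometric) average.
double geometricMeanSpeedup(const std::vector<double>& baselineTimes,
                            const std::vector<double>& otherTimes) {
    assert(baselineTimes.size() == otherTimes.size());
    assert(!baselineTimes.empty());
    double logSum = 0.0;
    for (std::size_t i = 0; i < baselineTimes.size(); ++i) {
        assert(baselineTimes[i] > 0.0 && otherTimes[i] > 0.0);
        // Summing logarithms avoids overflow in the product of ratios.
        logSum += std::log(baselineTimes[i] / otherTimes[i]);
    }
    return std::exp(logSum / static_cast<double>(baselineTimes.size()));
}
```

With made-up baseline times of 10 s and 40 s and other-algorithm times of 5 s and 10 s, the ratios are 2 and 4, and the geometric mean is the square root of 8, about 2.83.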

On the second set of problems, the Exact MVM algorithm is slower than the exact MEM algorithm LEDA2 by a factor of about . The approximation algorithms are all faster than LEDA2, the fastest again being the Half algorithm (by a factor of ), and the Two-thirds algorithm is faster by a factor of . The scaling and the RR algorithms are and times faster than LEDA2.

For the nlpkkt200 problem, the Two-thirds algorithm computed the matching in seconds on the integer weights; the Half approximation algorithm took about seconds, while the Scaling algorithm solved the same problem in seconds. The GPA-RR and RR algorithms took and seconds, respectively. This graph has an interesting structure. It comes from a nonlinear programming problem (it is a symmetric Karush-Kuhn-Tucker matrix), which can be partitioned into two subsets of vertices and ; vertices in the set are connected to each other and to vertices in the set , but the latter is an independent set of vertices, i.e., no edge joins a vertex in to another vertex in . There are vertices in and vertices in . This structure creates a large number of augmenting paths for the exact algorithms, and we conjecture this is why these algorithms do not terminate.

We also report the maximum time taken by an algorithm over all problems on which it terminated. For LEDA1, it is seconds on the europe_osm problem; for LEDA2, s on the rgg problem; the Exact MVM algorithm needed s on the huge_bubbles problem. The Scaling algorithm took s on the europe_osm problem, and the Half algorithm took s on the same problem. The problem nlpkkt200 needed the most time for the other approximation algorithms: s for the GPA-RR, s for the RR algorithm, and s for the Two-thirds algorithm.

The run times of some of these algorithms are plotted on a semi-logarithmic scale in Figure 11.

Figure 11: Time taken by different algorithms (logarithmic scale), with integer weights in .

We compare the weights of the matchings computed by the algorithms with integer weights in range in Table 3. All the exact algorithms compute the same maximum weight, which is reported in the first column; the approximation algorithms compute nearly optimal weights, and in order to differentiate among them, we report the gap to optimality as a percent. Hence we report , where is the weight computed by an algorithm and is the optimal weight computed by the exact algorithms. The Half algorithm computes weights higher than of the optimal, and the Scaling algorithm computes weights higher than of the optimal. The other approximation algorithms obtain weights higher than of the optimal. The best among these is GPA-RR, which achieves this with run times at least times higher than those of the Two-thirds algorithm. Note that the weights obtained in practice are much better than the worst-case approximation guarantees. These results are plotted in Figure 12.

Figure 12: Gaps to optimal weights for different algorithms, with integer weights in .
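As a small illustration of the metric, one common way to express the gap to optimality as a percent is 100 · (1 − w / w_opt), where w is the weight obtained by an algorithm and w_opt is the optimal weight. The exact expression used in the tables is not recoverable from the text here, so this form, and the function name `gapToOptimalPercent`, are assumptions.

```cpp
#include <cassert>

// Hypothetical helper: gap to optimality in percent, assuming the
// definition 100 * (1 - weight / optimalWeight). A value of 0 means the
// algorithm found an optimal weight; larger values are worse.
double gapToOptimalPercent(double weight, double optimalWeight) {
    assert(optimalWeight > 0.0);
    return 100.0 * (1.0 - weight / optimalWeight);
}
```

Under this definition, a matching of weight 98 against an optimum of 100 has a gap of 2 percent.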

In Table 5, we report run times from the Exact MVM algorithm and the relative performance of the approximation algorithms when the vertex weights are real-valued in the range . These weights are favorable to the Scaling approximation algorithm, since the number of scales needed is low. LEDA unfortunately does not work with real-valued weights. In geometric mean, the Half algorithm is faster than the Exact MVM algorithm by a factor of ; the Two-thirds algorithm by a factor of ; and the other approximation algorithms are faster by factor less than . On the nlpkkt200 problem, the Exact MVM algorithm did not terminate; notice that the Scaling algorithm is faster with the smaller range of weights here when compared to the integer weights with a larger range. The run times of the other approximation algorithms are consistent with the rankings discussed earlier.

Table 6 includes results for the real-valued weights in the range . The Half approximation algorithm obtains about 89% of the maximum weight (geometric mean over these problems), and is the worst performer. The other approximation algorithms are all comparable in the weights they compute, two or three percent off the optimal. Again the best performer is the GPA-RR algorithm, at the cost of about nine times the run time of the Two-thirds algorithm.

Now we consider the cardinality of the matchings obtained by the algorithms in Tables 4 and 7. The exact algorithm for MVM computes a maximum cardinality matching when the vertex weights are positive, and since the MEM problems are derived from MVM problems by summing the weights of the endpoints of each edge, the MEM algorithms also compute maximum cardinality matchings. The Half approximation algorithm is about ten percent off the maximum cardinality, and the Scaling algorithm about six percent off. The other approximation algorithms are about two percent off the cardinality, with the GPA-RR algorithm the best performer. For eight of the nineteen problems, the exact algorithms obtained perfect matchings (cardinality equal to or ). Similar results are obtained when real weights with a smaller range are used, except that this time the Scaling algorithm finds higher cardinalities.

To see how the Two-thirds algorithm fares against the Scaling algorithm for a smaller , we compared it with - and -Scaling approximation algorithms. For integer weights in the range the Two-thirds algorithm is times faster than the -approximation, and times faster than the -Scaling approximation algorithm. In geometric mean the Two-thirds algorithm obtained greater weight by and , and higher cardinality by and , relative to the - and -approximate Scaling algorithms. For real-valued weights in the Two-thirds algorithm is and times faster than the and -Scaling approximation algorithms, respectively. In geometric mean the Two-thirds algorithm obtained greater weights by than the -approximation; it was worse by than the -approximation; the cardinality was higher by over the -approximation, and worse by relative to the -approximation.
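The derivation of MEM instances from MVM instances mentioned above can be sketched as follows, reading "summing the weights" as assigning each edge the sum of the weights of its two endpoints. With that assignment, the edge weight of any matching equals the total weight of its matched vertices. The struct `WeightedEdge` and the function `edgeWeightsFromVertexWeights` are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical edge record for the derived MEM instance.
struct WeightedEdge {
    int u;
    int v;
    double weight;
};

// Builds edge weights for MEM from vertex weights for MVM by setting
// weight(u, v) = vertexWeight[u] + vertexWeight[v] (assumed reading of
// "summing the weights").
std::vector<WeightedEdge> edgeWeightsFromVertexWeights(
    const std::vector<std::pair<int, int>>& edges,
    const std::vector<double>& vertexWeight) {
    const int n = static_cast<int>(vertexWeight.size());
    std::vector<WeightedEdge> out;
    out.reserve(edges.size());
    for (const auto& e : edges) {
        assert(e.first >= 0 && e.first < n);
        assert(e.second >= 0 && e.second < n);
        out.push_back({e.first, e.second,
                       vertexWeight[e.first] + vertexWeight[e.second]});
    }
    return out;
}
```

For example, with vertex weights 1, 4, 3 and edges (0, 1) and (1, 2), the derived edge weights are 5 and 7.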

## 5 Conclusions

We have described an augmentation-based 2/3-approximation algorithm for MVM on non-bipartite graphs whose time complexity is , whereas the time complexity of an exact algorithm is . The approximation algorithm is derived in a natural manner from an exact algorithm for computing maximum weighted matchings by restricting the length of augmenting paths to at most three.
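The idea of restricting augmenting paths to length at most three can be illustrated with the simplified sketch below. It is not the authors' implementation and makes several assumptions: vertices are processed in non-increasing order of weight, each unmatched vertex is augmented along a path of length one or three that reaches the heaviest unmatched endpoint found, and no attempt is made to achieve the time complexity stated above. The name `shortAugmentMVM` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical sketch (not the authors' code): vertex-weighted matching
// using only augmenting paths of length one or three.
// Returns mate[v], or -1 if v is unmatched.
std::vector<int> shortAugmentMVM(const std::vector<std::vector<int>>& adj,
                                 const std::vector<double>& weight) {
    assert(adj.size() == weight.size());
    const int n = static_cast<int>(adj.size());
    std::vector<int> order(n);
    std::iota(order.begin(), order.end(), 0);
    std::stable_sort(order.begin(), order.end(),
                     [&](int a, int b) { return weight[a] > weight[b]; });

    std::vector<int> mate(n, -1);
    for (int u : order) {
        if (mate[u] != -1) continue;
        int bestEnd = -1;  // heaviest unmatched endpoint reachable from u
        int via = -1;      // matched neighbor used by a length-3 path
        for (int v : adj[u]) {
            if (v == u) continue;
            if (mate[v] == -1) {
                // Length-1 path: u - v.
                if (bestEnd == -1 || weight[v] > weight[bestEnd]) {
                    bestEnd = v;
                    via = -1;
                }
            } else {
                // Length-3 path: u - v = mate[v] - x, with x unmatched.
                const int w = mate[v];
                for (int x : adj[w]) {
                    if (x == u || mate[x] != -1) continue;
                    if (bestEnd == -1 || weight[x] > weight[bestEnd]) {
                        bestEnd = x;
                        via = v;
                    }
                }
            }
        }
        if (bestEnd == -1) continue;  // no short augmenting path from u
        if (via == -1) {
            mate[u] = bestEnd;
            mate[bestEnd] = u;
        } else {
            const int w = mate[via];
            mate[u] = via;
            mate[via] = u;
            mate[w] = bestEnd;
            mate[bestEnd] = w;
        }
    }
    return mate;
}
```

On the path 0-1-2-3 with vertex weights 1, 4, 3, 2, the sketch first matches 1 with 2, and then augments along the length-three path 3-2-1-0, which yields the matching {0-1, 2-3} of weight 10.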

The 2/3-MVM algorithm has been implemented efficiently in C++, and on a set of nineteen graphs, some with hundreds of millions of edges, it computes the approximate matchings in less than seconds. The weight of the approximate matching is greater than () of the weight of the optimal matching for these problems on integer weights in (real weights in ). A Greedy Half-approximation algorithm is faster than the 2/3-MVM algorithm by about a factor of two, but the weight it computes is lower, and can be as low as on the worst problem. All of these algorithms obtain weights that are much higher than the worst-case approximation guarantees.

In addition, in geometric mean the 2/3-MVM algorithm is faster than a Scaling-based approximation algorithm by a factor of 28 on the integer weights in range , which is expected due to the large overheads needed for the handling of blossoms and dual variable updates. While the Scaling algorithm is faster on real-valued weights in a narrower range since there are fewer scales, the 2/3-MVM algorithm is still faster than it on average by a factor of 7. The 2/3-approximate MVM algorithm obtains better matching weight than the Scaling approximation algorithm for relevant values of in all instances on the integer weights, and all but two graphs for the real-valued weights (on those two problems, the weights are close).

The (2/3 - ε)-approximation algorithm for MEM, with Round-robin selection of augmentations and initialization with the Global Paths algorithm, computes higher weights than the 2/3-approximation algorithm for MVM, but at a cost of an order of magnitude or more time. The weight differences are quite small for integer weights in a range , but are about for real-valued weights in the range .

We have also compared our algorithms with the exact algorithm for the MEM problem from LEDA with a fractional matching initialization, and show that the exact MVM algorithm is quite competitive with it. The 2/3-approximation algorithm for MVM is two to three orders of magnitude faster than these exact algorithms, and there are problems on which the exact algorithms do not terminate in hundreds of hours.

Half-approximation algorithms for MEM (e.g., the Locally Dominant edge and Suitor algorithms) do not require sorting and can be used or adapted to obtain 1/2-approximate matchings for the MVM. The 2/3-approximation algorithm for MVM designed here processes the vertices in non-increasing order of weights, but an algorithm based on the idea of searching for weight-increasing paths and cycles can avoid doing so, leading to a potentially parallel algorithm. This is the subject of our current work.

## Acknowledgements

We thank Jens Maue (Zurich) and Peter Sanders of the Karlsruhe Institute of Technology for sharing the code for the GPA and (2/3 - ε)-approximation algorithms with us.
This work was supported in part by NSF grant CCF-1637534; the U.S. Department of Energy through grant DE-FG02-13ER26135; and the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the DOE Office of Science and the NNSA.