A 2/3-Approximation Algorithm for Vertex-weighted Matching in Bipartite Graphs

We consider the maximum vertex-weighted matching problem (MVM), in which non-negative weights are assigned to the vertices of a graph, the weight of a matching is the sum of the weights of the matched vertices, and we are required to compute a matching of maximum (vertex) weight. Although the MVM problem can be solved as a maximum edge-weighted matching problem (MEM), we show that it has more efficient exact and approximation algorithms than those available for the MEM. First, we describe an exact algorithm for MVM with O(|V| |E|) time complexity. Then we show that a 2/3-approximation algorithm for MVM on bipartite graphs can be obtained by restricting the length of augmenting paths in the exact algorithm to at most three. The algorithm has time complexity O(|E| + |V| log |V|). We have implemented the 2/3-approximation algorithm and compare it with an exact MEM algorithm, the exact MVM algorithm that we have designed, 2/3- and 1/2-approximation algorithms for MVM, and a scaling-based primal-dual (1-ϵ)-approximation algorithm for MEM. On a test set of nineteen problems with several millions of vertices, we show that the 2/3-approximation MVM algorithm is about 60 times faster than the exact MEM algorithm, 5 times faster than the exact MVM algorithm, and 10-15 times faster than the scaling-based 2/3- or 5/6-approximation algorithms (geometric means). It obtains more than 99.5% of the weight and cardinality of an MVM, whereas the scaling-based approximation algorithms yield lower weights and cardinalities. The maximum time taken by the exact MEM algorithm on a graph in the test set is 15 hours, while it is 22 minutes for the exact MVM algorithm, and less than 5 seconds for the 2/3-approximation algorithm.


1 Introduction

We consider a variant of the matching problem in graphs in which weights are assigned to the vertices, the weight of a matching is the sum of the weights on the matched vertices, and we are required to compute a matching of maximum weight. We call this the maximum vertex-weighted matching problem (MVM). In this paper we describe a 2/3-approximation algorithm for MVM in bipartite graphs and implement it efficiently. We compare its performance with several algorithms: an algorithm for computing maximum edge-weighted matchings, an (exact) algorithm for MVM, a 1/2-approximation Greedy algorithm, and a $(1-\epsilon)$-approximation algorithm for maximum edge-weighted matchings.

Matching is a combinatorial problem that has been extensively studied since the 1960s. The problem of computing a matching with the maximum cardinality of edges is the maximum cardinality matching (MCM) problem. When weights are assigned to the edges, the problem of computing a matching with the maximum sum of weights of the matched edges is the maximum edge-weighted matching problem (MEM). There are other variants too: one could ask for a matching that has the maximum or minimum sum of edge weights among all maximum cardinality matchings, or one could ask for a maximum bottleneck matching, where we seek to maximize the minimum weight among all matched edges. While there have been a number of papers on the MCM and MEM problems, there has been little prior work on the MVM problem. Let $G = (V, E)$ denote a bipartite graph with $|V|$ vertices and $|E|$ edges. Spencer and Mayr [34] have described an algorithm that solves the MVM exactly. We do not know of prior work on approximation algorithms for the MVM problem. Background information on matchings and approximation algorithms is provided in the next section.

For the MCM problem on bipartite graphs, it is now known that algorithms with $O(|V|\,|E|)$ worst-case time complexity are among the practically fastest algorithms, even relative to asymptotically faster algorithms such as the Hopcroft-Karp algorithm with $O(\sqrt{|V|}\,|E|)$ complexity. Among these algorithms are a Multiple-Source BFS-based algorithm, a Multiple-Source DFS-based algorithm, and a Push-Relabel algorithm [2, 11, 31]. All of these algorithms employ greedy initialization algorithms such as the Karp-Sipser algorithm, and employ other enhancements to make the implementations run fast. These algorithms have also been implemented in parallel on modern shared-memory multithreaded processors.

An MVM problem can be transformed into an MEM problem by summing the weights of the two endpoints of each edge and assigning the sum as the weight of that edge. Thus an exact or approximation algorithm for MEM becomes an exact or approximation algorithm for MVM as well. However, we show in the experimental section that when edge weights are derived this way, they are highly correlated, adversely affecting the run times of the algorithms.
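The transformation above can be illustrated with a minimal Python sketch. The function names, the edge-list representation, and the small example graph are assumptions made for illustration only; they do not come from the implementation evaluated in this paper.

```python
def mvm_to_mem(edges, weight):
    """Give each edge (u, v) the weight weight[u] + weight[v]."""
    return {(u, v): weight[u] + weight[v] for (u, v) in edges}


def vertex_weight_of_matching(matching, weight):
    """Sum of the weights of all vertices covered by the matching."""
    return sum(weight[u] + weight[v] for (u, v) in matching)


# Hypothetical example graph with four vertices and three edges.
weight = {"a": 5, "b": 3, "c": 2, "d": 1}
edges = [("a", "c"), ("a", "d"), ("b", "c")]
edge_weight = mvm_to_mem(edges, weight)

# For any matching, the derived edge weights add up to the vertex weight.
matching = [("a", "d"), ("b", "c")]
mem_value = sum(edge_weight[e] for e in matching)
mvm_value = vertex_weight_of_matching(matching, weight)
```

Because every matched vertex is an endpoint of exactly one matched edge, the two objective values agree for every matching, which is why an MEM solver can be reused unchanged.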

It is possible to design exact and approximation algorithms that do not make use of linear programming formulations for MVM. These algorithms are conceptually simpler, easier to implement, and provide insights into the “structure” of the MVM problem. The 2/3- and 1/2-approximation algorithms have near-linear time complexity with small constants, and in practice, are quite fast and deliver high quality approximations. A 2/3-approximation algorithm for the MVM problem is obtained by limiting the augmenting path lengths to at most three, but this technique does not lead to a 2/3-approximation for MEM.

MVM problems arise in many contexts, such as the design of network switches [35], schedules for training of astronauts [3], computation of sparse bases for the null space or the column space of a rectangular matrix [5, 30], etc. Our interest in this problem was sparked by our work on sparse bases for the null space and column space of rectangular matrices. The null space basis is useful in successive quadratic programming (SQP) methods to solve nonlinear optimization problems; here the matrix sizes are determined by the number of constraints in the optimization problem. In this context a matroid greedy algorithm can be shown to compute a sparsest such basis. For the null space basis, the problem still remains NP-hard since computing a sparsest null vector is already NP-hard. However, for the column space basis, this leads to a polynomial time algorithm, and it can be efficiently implemented by computing a maximum vertex-weighted matching. As will be seen from our results, approximation algorithms are needed to make this algorithm practical since an optimal algorithm can take several hours on large graphs. We suspect that a number of problems that could be modeled as MVM problems have been modeled in earlier work as MEM problems due to the extensive literature on the latter problem.

The remainder of this paper is organized as follows. Section 2 provides background on matching, and describes the concepts of $M$-reversing and $M$-increasing paths in a graph. Section 3 characterizes MVMs using the concept of weight vectors as well as augmenting paths and increasing paths. An exact algorithm for MVM is briefly described in Section 4, and Section 5 describes a 2/3-approximation algorithm for MVM on bipartite graphs. Next, Section 6 proves the correctness of the 2/3-approximation algorithm. Section 7 briefly discusses the Greedy 1/2-approximation algorithm and its correctness. Computational results on the exact MEM and MVM algorithms, the Greedy 1/2-approximation, 2/3-approximation, and scaling-based $(1-\epsilon)$-approximation algorithms are included in Section 8. We conclude in the final Section 9.

Preliminary versions of these results were included in the PhD thesis of one of the authors [14], and in an unpublished report [7].

2 Background

We define the basic terms we use here, and refer the reader to discussions in the following books for additional background on matching theory [4, 22, 23, 28, 33], and approximation algorithms [36, 37].

A matching $M$ in a graph is a set of edges such that for each vertex $v$, at most one edge in $M$ is incident on $v$. An edge in $M$ is a matched edge, and otherwise, it is an unmatched edge. Similarly a vertex which is an endpoint of an edge in $M$ is matched, and otherwise it is an unmatched vertex.

A path in a graph is a sequence of distinct vertices such that consecutive vertices form an edge in the graph. The length of a path is the number of edges (not the number of vertices) in it. A cycle is a path in which the first and last vertices are the same. Given a matching $M$ in a graph, an $M$-alternating path is a path in which matched and unmatched edges alternate. If the first and the last vertices in an $M$-alternating path $P$ are unmatched, then it is an $M$-augmenting path, since by flipping the matched and unmatched edges along $P$ we obtain a new matching that has one more edge than the original matching. The augmented matching is obtained by the symmetric difference $M \oplus P$. An augmenting path must have an odd number of edges since it has one more unmatched edge than the number of matched edges. Figure 1 shows an augmenting path in the top figure of Subfigures (1) and (2). Here solid edges are matched, and the dashed edges are unmatched. (Some of the vertices drawn in the figure are not involved in this path.) The results of an augmentation are shown in the bottom figures of these Subfigures.

An $M$-reversing path is an alternating path with one endpoint $M$-matched and the other endpoint $M$-unmatched. Such a path has even length (number of edges), and half the edges are matched and the other half are unmatched. The bottom figures of Subfigures (1) and (2) in Figure 1 show examples of reversing paths. We can exchange the matched and unmatched edges on a reversing path without changing the cardinality of the matching. However, it might be possible to increase the weight of a vertex-weighted matching in this way. An $M$-increasing path is an $M$-reversing path such that its $M$-unmatched endpoint $u$ is heavier than its $M$-matched endpoint $v$. Let $\phi(v)$ denote the vertex weight of a vertex $v$. For an increasing path, we have $\phi(u) > \phi(v)$. If we exchange matched and unmatched edges along this path, then the weight of the matching increases by $\phi(u) - \phi(v)$.
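The exchange of matched and unmatched edges along a path can be written as a set symmetric difference. The following Python sketch is only an illustration: the representation of a matching as a set of two-element frozensets, the helper names, and the three-vertex example are assumptions, not part of the paper.

```python
def flip(matching, path):
    """Symmetric difference of a matching with the edges of a path.

    matching: set of frozenset({x, y}) edges; path: list of vertices.
    """
    path_edges = {frozenset(e) for e in zip(path, path[1:])}
    return matching ^ path_edges


def matching_weight(matching, weight):
    """Vertex weight of a matching: sum over all matched vertices."""
    return sum(weight[v] for edge in matching for v in edge)


# Hypothetical reversing path u - x - v of length two:
# u is unmatched and heavy, v is matched and light.
weight = {"u": 9, "x": 4, "v": 2}
M = {frozenset(("x", "v"))}
M_flipped = flip(M, ["u", "x", "v"])

gain = matching_weight(M_flipped, weight) - matching_weight(M, weight)
```

In this example the path is increasing, the cardinality stays at one edge, and the gain equals the weight of the newly matched endpoint minus the weight of the newly unmatched one.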

An exact algorithm for a maximization version of an optimization problem on graphs computes a solution with the maximum value of its objective function. For the matching problems considered here, there exist polynomial time algorithms for computing a maximum weighted matching. However, since the time complexity of these algorithms is high, recent work has focused on developing approximation algorithms for these problems that run in nearly linear time in the number of edges in the graph. An approximation algorithm for a maximization problem computes a solution such that the ratio of the value of the objective function obtained by the approximation algorithm to that of the exact algorithm is bounded by a constant or a function of the input size, for all graphs that could be input to the problem. A lower bound on this ratio that holds over all graphs is the approximation ratio of the algorithm. The approximation algorithms that we design for the MVM problem satisfy approximation ratios of 2/3 or 1/2.

Hopcroft and Karp [16] showed that if $M$ is a matching in a graph such that a shortest augmenting path has length at least $2k+1$ edges, then $M$ is a $k/(k+1)$-approximation matching for a maximum cardinality matching.
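The ratio in this result follows from a short counting argument, sketched here with notation that is assumed for illustration: $M^*$ denotes a maximum cardinality matching, and $q$ denotes the number of $M$-augmenting paths among the components of the symmetric difference of $M$ and $M^*$. Cycles and even-length paths contain equally many edges from both matchings, and no component can contain more edges of $M$ than of $M^*$ because $M^*$ is maximum, so $q$ equals the difference in cardinalities. Each of the $q$ vertex-disjoint augmenting paths has length at least $2k+1$ and therefore contains at least $k$ edges of $M$:

$$
q = |M^*| - |M|, \qquad |M| \;\ge\; k\,q \;=\; k\,\bigl(|M^*| - |M|\bigr) \quad\Longrightarrow\quad |M| \;\ge\; \frac{k}{k+1}\,|M^*| .
$$

For $k = 2$, a matching with no augmenting path of length at most three has at least 2/3 of the maximum cardinality, which is the cardinality analogue of the bound pursued in this paper.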

Recent work has focused on developing several approximation algorithms for MEM that run in time linear in the number of edges in the graph. A number of 1/2-approximation algorithms are known for MEM, including the Greedy algorithm, the Locally Dominant edge algorithm, and a Path-growing algorithm [8, 32]. Currently the practically fastest 1/2-approximation algorithm is the Suitor algorithm of Manne and Halappanavar [24], which employs a proposal-based approach similar to algorithms for the stable matching problem. This algorithm has a higher degree of concurrency: because proposals can be annulled, vertices can be processed in any order to extend proposals to their eligible heaviest neighbors. The parallel Suitor algorithm has parallel depth and work when the edge weights are chosen uniformly at random (here $\Delta$ is the maximum degree of a vertex) [21]. The Locally Dominant edge algorithm and the Suitor algorithm have also been implemented on multi-threaded parallel architectures [15, 24].

In practice, some of the 1/2-approximation algorithms compute matchings with or more of the weight of a maximum edge-weighted matching for many graphs. In addition, -approximation algorithms have also been designed for MEM [9, 29]. These algorithms are slower than the 1/2-approximation algorithms and do not improve the weight of the matching by much in practice [25].

More recently, for any $\epsilon > 0$, a $(1-\epsilon)$-approximation algorithm for MEM with time complexity $O(|E|\,\epsilon^{-1} \log \epsilon^{-1})$ has been designed by Duan and Pettie [10]. This algorithm is based on a scaling-based primal-dual approach, requires the computation and updating of blossoms for non-bipartite graphs, and is more expensive than the simpler approximation algorithms discussed above. We will show in the Results section that this algorithm is slower than the 2/3-approximation algorithm considered in this paper, while surprisingly computing matchings of lower weight. The Duan-Pettie paper surveys earlier work on exact and approximation algorithms for the MEM problem. Hougardy [17] has also provided a recent survey of developments in approximation algorithms for matchings. MEM problems arise in sparse matrix computations (permuting large elements to the diagonal of a sparse matrix) [12], network alignment [18], scheduling problems, etc.

Approximation algorithms have now been designed for several problems related to matching: maximum vertex-weighted matching, maximum edge-weighted matching, maximum edge-weighted $b$-matching [18, 20], the minimum weight edge cover, and the minimum weight $b$-edge cover problem [19]. Approximation is a paradigm for designing parallel algorithms for these problems, and such algorithms have been shown to have good parallel performance.

3 Characterization of Maximum Vertex Weighted Matchings

In this section we characterize an MVM in two different ways: first, in terms of augmenting paths and increasing paths, and second, in terms of the weights in the matching.

If all the vertex weights are positive, then any maximum vertex weighted matching is a maximum cardinality matching as well. If some of the vertex weights are zero, then without loss of generality, we can choose a maximum vertex weighted matching to have maximum cardinality also as shown below.

Lemma 1

Let $G = (V, E)$ be a graph and $\phi : V \to \mathbb{R}_{\ge 0}$ be a non-negative weight function. There is a maximum vertex-weighted matching that is also a maximum cardinality matching in $G$.

Proof

Consider what happens to the vertices when we augment a vertex-weighted matching by an augmenting path. Both endpoints of the augmenting path are now matched (these were previously unmatched), and all interior vertices in the path continue to remain matched. (Figure 1 illustrates this.) Thus in an algorithm that computes vertex-weighted matchings solely by augmentations, once a vertex is matched it is never unmatched, and it will be matched at every future step in the algorithm. (We call this the “once a matched vertex, always a matched vertex” property of augmentations of a vertex weighted matching.) This implies that if the weights are non-negative, each augmentation causes the weight of a matching to increase or stay the same. Thus we can always choose an MVM to have maximum cardinality of edges.

Of course, the set of matched edges and unmatched edges are exchanged along an augmenting path, so there is no corresponding “once a matched edge, always a matched edge” property. Note also that when we use an increasing path between an unmatched vertex $u$ and a matched vertex $v$ to increase the weight of a matching, then the vertex $u$ gets matched, $v$ gets unmatched, and all interior vertices in the path continue to be matched. (Again, Figure 1 provides examples.) Hence the property of “once a matched vertex, always a matched vertex” is not true of an algorithm that uses increasing paths during its execution.

If some of the vertex weights are negative, we can transform the problem so that we need consider only non-negative weights as shown in Spencer and Mayr [34]. For each vertex $v$ with a negative weight, we add a new vertex $v'$ and an edge $(v, v')$, with the weight of $v'$ set to the absolute value of the weight of $v$, and the new weight of $v$ set to zero. An MVM in the transformed graph leads to an MVM in the original graph; however, this transformation might not preserve approximations. From now on, we assume that all weights are non-negative.
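The transformation for negative weights is mechanical, and a small Python sketch may make it concrete. The function name, the edge-list representation, and the way the new vertex is labelled (a tuple tagged with the string "twin") are hypothetical choices made for this illustration.

```python
def remove_negative_weights(vertices, edges, weight):
    """Return (new_edges, new_weight) with all weights non-negative.

    Each negatively weighted vertex v receives a new pendant neighbour
    carrying |weight[v]|, and v itself is reset to weight zero.
    """
    new_edges = list(edges)
    new_weight = dict(weight)
    for v in vertices:
        if weight[v] < 0:
            twin = ("twin", v)  # hypothetical label for the added vertex
            new_edges.append((v, twin))
            new_weight[twin] = -weight[v]
            new_weight[v] = 0
    return new_edges, new_weight


# Hypothetical two-vertex example with one negative weight.
vertices = ["a", "b"]
edges = [("a", "b")]
weight = {"a": -4, "b": 3}
new_edges, new_weight = remove_negative_weights(vertices, edges, weight)
```

The input dictionaries are copied rather than modified, so the original instance remains available for mapping a matching of the transformed graph back to the original graph.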

We turn to the first of our characterizations of an MVM.

Theorem 1

Let $G = (V, E)$ be a graph and $\phi : V \to \mathbb{R}_{\ge 0}$ be a non-negative weight function. A matching $M$ is an MVM that also has maximum cardinality if and only if (1) there is no $M$-augmenting path in $G$, and (2) there is no $M$-increasing path in $G$.

Proof

A matching has maximum cardinality if and only if there is no augmenting path with respect to it [28]. Hence we need to prove only (2).

For the only if part, if there were an $M$-increasing path $P$, then the symmetric difference $M \oplus P$ would yield a vertex-weighted matching of larger weight, contradicting the assumption that $M$ has maximum vertex weight. For the if part, consider a maximum vertex-weighted matching $M^*$ (which by Lemma 1 we may take to have maximum cardinality) and a matching $M$ that has neither an augmenting path nor an increasing path with respect to it. We will show that $M$ has the same weight as $M^*$. The symmetric difference $M \oplus M^*$ consists of cycles and paths. A cycle consists of vertices matched by both matchings, and hence cannot account for any difference between them in weight. Every path must have even length, and an equal number of edges from $M$ and $M^*$, for otherwise we would be able to augment one of the two matchings; hence each path is a reversing path with respect to both matchings. By our assumption, $M$ does not have an increasing path with respect to it. But there cannot be an increasing path with respect to $M^*$ either, for such a path $P$ would enable us to increase its weight by the symmetric difference $M^* \oplus P$. Hence the two endpoints of every path in $M \oplus M^*$ have equal weights, and $M$ has the same weight as $M^*$.

This Theorem was proved by Tabatabaee et al. [35], who seem to restrict the result to bipartite graphs.

Now we characterize an MVM in terms of the weights. Given a matching $M$ we can define a weight vector $\Phi(M)$ that lists the weights of all the vertices matched by $M$ in non-increasing order. A weight is listed multiple times if there is more than one vertex with the same weight. We can compare the weight vectors of two matchings $M_1$ and $M_2$ in a graph lexicographically: we define the weight vector $\Phi(M_1) > \Phi(M_2)$ if $\phi_1 > \phi_2$, where $\phi_1$ is the first value in $\Phi(M_1)$ not equal to the corresponding value $\phi_2$ in $\Phi(M_2)$. This definition can compare matchings of different sizes in the same graph, since the weight vector of the matching with fewer edges can be padded with zeros.
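The lexicographic comparison of weight vectors can be sketched in a few lines of Python. The helper names and the example graph are assumptions for illustration; the comparison relies on the fact that Python compares lists lexicographically.

```python
def weight_vector(matching, weight):
    """Weights of all matched vertices, in non-increasing order."""
    return sorted((weight[v] for edge in matching for v in edge), reverse=True)


def lex_greater(vec_a, vec_b):
    """True if vec_a is lexicographically larger, padding with zeros."""
    n = max(len(vec_a), len(vec_b))
    a = vec_a + [0] * (n - len(vec_a))
    b = vec_b + [0] * (n - len(vec_b))
    return a > b  # list comparison in Python is lexicographic


# Hypothetical example: three matchings in the same graph.
weight = {"a": 5, "b": 3, "c": 2, "d": 1}
M1 = [("a", "c")]
M2 = [("b", "c")]
M3 = [("a", "d"), ("b", "c")]
```

Here the vector of `M3` is `[5, 3, 2, 1]`, which exceeds the zero-padded vector `[5, 2, 0, 0]` of `M1` at the second position.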

Theorem 2

Let $G = (V, E)$ be a graph and $\phi : V \to \mathbb{R}_{\ge 0}$ be a non-negative weight function. A matching $M$ is an MVM if and only if its weight vector is lexicographically maximum among all weight vectors.

Proof

For the only if part, let $M_*$ denote a maximum vertex-weighted matching and $M_L$ denote a matching whose weight vector is lexicographically maximum. By our choice, the matching $M_*$ has maximum cardinality. Similarly, $M_L$ also has maximum cardinality, for otherwise we could augment the matching $M_L$ to a maximum cardinality matching while keeping all of the matched vertices in $M_L$ matched in the augmented matching, due to the once-matched, always-matched property of augmentations. Now suppose, for the sake of contradiction, that the matching $M_L$ has weight less than the weight of $M_*$, and that the weight vector of $M_*$ is not lexicographically maximum.

Let the first lexicographic difference between the weight vectors of $M_L$ and $M_*$ correspond to a vertex $u$ that is matched in $M_L$ and unmatched in $M_*$. Now consider the symmetric difference of the two matchings, $M_L \oplus M_*$. Since both matchings have the same maximum cardinality, the symmetric difference consists of cycles or paths of even length in which edges from the two matchings alternate. As stated earlier, a cycle cannot contribute to the difference in the weights between the matchings. Among the alternating paths, there is one path $P$ of even length whose one endpoint is the vertex $u$ that is matched in $M_L$ but not in $M_*$. Denote the other endpoint of this path by $v$. Since the path has even length, $v$ is matched in $M_*$ but not in $M_L$. Also since the first lexicographic difference between the weight vectors occurs at $u$, and the weight vector of $M_L$ but not that of $M_*$ is lexicographically maximum, the weight of $u$ is greater than or equal to the weight of $v$. The matching $M_* \oplus P$ would increase the weight of the maximum vertex-weight matching if the two weights were unequal. Hence these two weights are equal, and we obtain a contradiction to our assumption that this was the first weight where the weight vectors of the two matchings were different.

The proof of the if part is in [27], and we include it here for completeness. Again let $M_*$ denote a maximum vertex-weighted matching, and let $M_L$ denote a matching whose weight vector is lexicographically maximum. The symmetric difference $M_* \oplus M_L$ consists of cycles and paths. As stated earlier, vertices in alternating cycles cannot contribute to the differences in the weight vectors. Now the matching $M_L$ must have maximum cardinality since otherwise we could augment it and get a lexicographically larger weight vector. Since both matchings $M_*$ and $M_L$ have maximum cardinality, there is no augmenting path with respect to either matching. Hence each path in the symmetric difference must have even length. Let $u$ be one endpoint of one such path $P$ that is matched in $M_L$ and unmatched in $M_*$, and let $v$ denote the other endpoint of $P$ that is matched in $M_*$ and unmatched in $M_L$. Since $M_L$ has the lexicographically maximum weight vector, the weight of $u$ can only be greater than the weight of $v$. Hence this is an increasing path and by replacing the matched edges in $M_*$ on the path by the edges in $M_L$ on $P$, we could increase the weight of the maximum vertex-weighted matching $M_*$. This contradiction proves the result.

The structural properties in these results facilitate the design of two classes of algorithms for MVM. One approach is to compute a maximum weighted matching from an empty matching, augmenting the matching by one edge in each iteration of the algorithm. By choosing to match vertices $u$ in decreasing order of weights, and by choosing a heaviest unmatched vertex reachable from $u$, we can ensure that an increasing path with respect to the current partial matching does not exist in the graph. We call this the direct approach. The second, speculative approach, would begin with any matching of maximum cardinality, and increase the weight by means of increasing paths, until a matching of maximum weight is reached.

There are advantages associated with each of these approaches. The direct approach, together with recursion, has been employed by Spencer and Mayr [34] to design an algorithm for MVM. The speculative approach could be efficient in combination with the Gallai-Edmonds decomposition [23]. This decomposition identifies a subgraph which has a perfect matching in any maximum cardinality matching; in such a subgraph, any maximum cardinality matching is a maximum vertex-weighted matching as well. Thus we need solve an MVM only in the remainder of the graph. If the subgraph with the perfect matching is large, there could be substantial savings in run-time. Since an MCM can be computed practically much faster than an MVM, and the Gallai-Edmonds decomposition can be obtained in linear time from an MCM, this approach might be practically useful.

4 An Exact Algorithm for MVM

In this Section, we describe an algorithm that solves the MVM problem exactly, primarily to show how our 2/3-approximation algorithm can be derived from it in a natural manner. In Algorithm MatchD (see the displayed Algorithm 1), we describe how an MVM is computed by matching vertices in non-increasing order of weights. Here $U$ is the set of unmatched vertices, and in each iteration the algorithm attempts to match a heaviest unmatched vertex $u$. From $u$, the algorithm searches for a heaviest unmatched vertex $v$ it can reach by an augmenting path $P$. If it finds $P$, then the matching is augmented by forming the symmetric difference of the current matching with $P$, and the vertices $u$ and $v$ are removed from the set of unmatched vertices. If it fails to find an augmenting path from $u$, then $u$ is removed from the set of unmatched vertices, since we do not need to search for an augmenting path from $u$ again. When all the unmatched vertices have been processed, the algorithm terminates.

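procedure MatchD($G = (V, E)$, $\phi$)
    $M \leftarrow \emptyset$
    $U \leftarrow V$
    while $U \neq \emptyset$ do
        $u \leftarrow$ a heaviest vertex in $U$
        $U \leftarrow U \setminus \{u\}$
        Find an augmenting path $P$ from $u$ that reaches a heaviest unmatched vertex $v$
        if $P$ is found then
            $M \leftarrow M \oplus P$
            $U \leftarrow U \setminus \{v\}$
        end if
    end while
end procedure
Algorithm 1 Input: A graph $G$ with weights $\phi$ on the vertex set $V$. Output: A vertex-weighted matching $M$ of maximum weight. Effect: Computes a maximum vertex-weight matching in the graph $G$.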
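As a rough illustration of the direct approach, the following Python sketch implements the idea of Algorithm 1 under a simplifying assumption: the graph is bipartite, so that a plain alternating breadth-first search finds every unmatched vertex reachable by an augmenting path (general graphs would need blossom handling, which is omitted). The adjacency-dictionary representation, the function name, and the example graph are hypothetical.

```python
def match_d(adj, weight):
    """Direct approach on a bipartite graph given as adj[v] -> neighbours.

    Returns a dict mate with mate[x] == y and mate[y] == x for matched pairs.
    """
    mate = {}
    # Process vertices in non-increasing order of weight.
    for u in sorted(adj, key=lambda v: weight[v], reverse=True):
        if u in mate:
            continue
        pred = {u: None}   # previous same-side vertex on the alternating path
        visited = set()    # opposite-side vertices already scanned
        frontier = [u]
        best, best_from = None, None
        while frontier:
            next_frontier = []
            for x in frontier:
                for y in adj[x]:
                    if y in visited:
                        continue
                    visited.add(y)
                    if y not in mate:
                        # y ends an augmenting path; keep the heaviest one.
                        if best is None or weight[y] > weight[best]:
                            best, best_from = y, x
                    else:
                        z = mate[y]
                        pred[z] = x
                        next_frontier.append(z)
            frontier = next_frontier
        # Augment by walking back from the chosen endpoint to u.
        x, y = best_from, best
        while x is not None:
            old = mate.get(x)
            mate[x], mate[y] = y, x
            x, y = pred[x], old
    return mate


# Hypothetical example: left vertices a, b and right vertices x, y.
weight = {"a": 5, "b": 4, "x": 3, "y": 1}
adj = {"a": ["x", "y"], "b": ["x"], "x": ["a", "b"], "y": ["a"]}
mate = match_d(adj, weight)
```

In this example the vertex `a` is first matched to `x`, and the later search from `b` re-routes `a` to `y`, so that all four vertices end up matched. Each search scans every edge at most once, which is consistent with the stated bound on the cost of a search from one vertex.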

To prove the correctness of the algorithm, we need the following Lemma.

Figure 1: Construction used in the proof of Lemma 1. In each case, $M$ is a matching, $P$ is an $M$-augmenting path between $u$, a heaviest $M$-unmatched vertex, and $v$, a heaviest unmatched vertex reachable by an $M$-alternating path from $u$, and $M' = M \oplus P$ is the augmented matching. Matched edges are drawn as solid edges, and unmatched edges are drawn as dashed edges. If there is no $M$-increasing path in $G$, then there is no $M'$-increasing path as well.

Lemma 1

Let $u'$ be an unmatched vertex with respect to a matching $M$ in a graph $G = (V, E)$, and let $\phi$ be a weight function on the vertex set $V$. Suppose that there does not exist an $M$-augmenting path from the vertex $u'$, and also that there is no $M$-increasing path (from any vertex) in the graph $G$. Let $P$ be an $M$-augmenting path from a heaviest unmatched vertex $u$, whose other endpoint is a heaviest unmatched vertex $v$ that can be reached from $u$ by an $M$-alternating path. If $M' = M \oplus P$, then there does not exist an $M'$-augmenting path from the vertex $u'$, nor an $M'$-increasing path (from any vertex) in the graph $G$.

Proof

When $P$ is an augmenting path from some $M$-unmatched vertex $u$, clearly $u$ has to be distinct from the vertex $u'$ since from the latter, there is no augmenting path by the condition of the Lemma. A proof that there is no $M'$-augmenting path from $u'$ can be found in [28]. Hence we prove that there is no $M'$-increasing path in $G$. (Similar arguments will be made several times in this paper.)

If there is no $M'$-reversing path in $G$, then there cannot be any $M'$-increasing path, and we are done. Hence choose an arbitrary $M'$-reversing path $Q$ that joins an $M'$-unmatched vertex $w$ and an $M'$-matched vertex $w'$. Since every vertex on the $M$-augmenting path $P$ is matched in $M'$, the vertex $w$ cannot belong to $P$, while the vertex $w'$ can belong to $P$ and does not need to be distinct from the vertices $u$ or $v$. We will prove that $\phi(w') \ge \phi(w)$ and hence that the path $Q$ is not $M'$-increasing.

If an $M$-reversing path also joins the vertices $w$ and $w'$, where $w$ is $M$-unmatched and $w'$ is $M$-matched, then since there is no $M$-increasing path in $G$, we have $\phi(w') \ge \phi(w)$. If no $M$-reversing path joins $w$ and $w'$, then the paths $P$ and $Q$ cannot be vertex-disjoint; for if they were, then $Q$ would also be an $M$-reversing path, which we assumed does not exist in $G$. Thus the paths $P$ and $Q$ share at least one common vertex, and indeed, as we show now, they share a matched edge. For, every vertex on the path $P$ is $M'$-matched, and hence a vertex $x$ on $Q$ but not on $P$ that is adjacent on $Q$ to a vertex $y$ on $P$ must have the edge $(x, y)$ as an $M'$-unmatched edge. Since $Q$ is an $M'$-alternating path, the next edge on the path $Q$ must be a matched edge incident on the vertex $y$, and hence this matched edge is common to both paths $P$ and $Q$. (The paths $P$ and $Q$ could intersect more than once.)

Now we have two cases to consider, which are illustrated in Fig. 1.

In the first case, there is an $M$-augmenting path between $u$ and $w$, and there are two subcases: either $v$ and $w'$ are the same vertex, or there is an $M$-reversing path between $v$ and $w'$. The second subcase corresponds to Subfigure (1). Now the $M$-reversing path between $v$ and $w'$ cannot be an $M$-increasing path by our assumption that no such path exists in $G$. Hence in both subcases, we can write $\phi(w') \ge \phi(v)$. Since we chose the path $P$ to begin at $u$ and end at the $M$-unmatched vertex $v$ and not at the $M$-unmatched vertex $w$, we have $\phi(v) \ge \phi(w)$. Combining the two inequalities, we obtain $\phi(w') \ge \phi(w)$.

In the second case, there is an $M$-augmenting path between $v$ and $w$, and again there are two subcases: either $u$ and $w'$ are the same vertex, or there is an $M$-reversing path between $u$ and $w'$. The second subcase is illustrated in Fig. 1 (2). As before, the $M$-reversing path between $u$ and $w'$ cannot be $M$-increasing by supposition, and therefore $\phi(w') \ge \phi(u)$. Since $u$ is a heaviest $M$-unmatched vertex by choice, and $w$ is $M$-unmatched, we have $\phi(u) \ge \phi(w)$. Combining, we have $\phi(w') \ge \phi(w)$.

Theorem 1

Algorithm MatchD computes an MVM in a graph $G = (V, E)$ with vertex weights given by a function $\phi$.

Proof

Let $M$ be the matching computed by Algorithm MatchD. We show by induction that there does not exist an $M$-augmenting path nor an $M$-increasing path in the graph $G$.

Let $k$ be the number of augmenting operations in the Algorithm MatchD. The matching $M$ is the last in a sequence of matchings $M_i$, for $i = 0, 1, \ldots, k$, computed by the algorithm. For $0 \le i < k$, let $P_i$ denote the $M_i$-augmenting path used to augment $M_i$ to the matching $M_{i+1}$, let $u_i$ denote the source of the augmenting path (the $M_i$-unmatched vertex from which we searched for an augmenting path), and let $v_i$ denote its other endpoint. The induction is on the matching $M_i$, and the inductive claim is that
(1) there is no $M_i$-augmenting path from an unmatched vertex that has already been processed, i.e., a vertex from which we have searched for an augmenting path earlier and have failed to find one, and
(2) there is no $M_i$-increasing path from any vertex in $G$.

The basis of the induction is $i = 0$, when the result is trivially true. The first condition holds because no vertices have been processed yet, and the second condition holds since the matching $M_0$ is empty and hence there is no increasing path. Hence assume that the claim is true for some $i$, with $0 \le i < k$. Now the result holds for the step $i + 1$ by applying Lemma 1.

The time complexity of this algorithm is $O(|V|\,|E| + |V| \log |V|)$. We seek to match each vertex, and the search for augmenting paths from each vertex costs $O(|E|)$ time. The second term is the cost of sorting the vertex weights.

Additionally, we can describe an exact algorithm for MVM that takes the speculative approach. Here one computes first a maximum cardinality matching, and then searches for increasing paths from unmatched vertices, in decreasing order of weights, to obtain an MVM. We need additional results to show that this algorithm computes an MVM. The time complexity of the algorithm is the same as the one using the direct approach described in this Section. Practically, the performance of the two classes of algorithms could be quite different, and hence it is worthwhile to implement these algorithms. However, since our interest in this paper is in a 2/3-approximation algorithm for MVM in bipartite graphs, we do not discuss this further here.

5 A 2/3-Approximation Algorithm for MVM in Bipartite Graphs

In this Section, we restrict ourselves to bipartite graphs. In order to solve the MVM on a bipartite graph $G = (S, T, E)$, we create two ‘one-side weighted’ subproblems from the given problem. In the first subproblem, the weights on the $T$ vertices are set to zero, and in the second subproblem, the weights on the $S$ vertices are set to zero. We compute MVMs on the two subproblems, and then combine them, using the Mendelsohn-Dulmage theorem, to obtain a solution of the original problem. In this section, we describe the algorithm, and compute its time complexity. We defer the proof of correctness of the algorithm to the next section, since it is somewhat lengthy.

Theorem 3 (Mendelsohn-Dulmage)

[26] Let $G = (S, T, E)$ be a bipartite graph, and let $M_S$ and $M_T$ be two matchings in $G$. Then there is a matching $M \subseteq M_S \cup M_T$ such that all $M_S$-matched vertices in $S$ are matched in $M$, and all $M_T$-matched vertices in $T$ are also matched in $M$.

The matching $M$ is obtained by a case analysis that considers the symmetric difference of $M_S$ and $M_T$, and a proof is included in Lawler ([22]).
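The combination step can be illustrated with a short Python sketch. It is not the authors' implementation: the function name `mendelsohn_dulmage` and the dictionary representation (each matching maps a matched vertex of the first part to its partner in the second part) are assumptions made for illustration. The sketch starts from the first matching and, for every second-part vertex that only the second matching covers, switches the alternating path starting there to the edges of the second matching.

```python
def mendelsohn_dulmage(m_s, m_t):
    """Combine two matchings of a bipartite graph (illustrative sketch).

    m_s, m_t: dicts mapping a first-part vertex to its second-part partner
    (a hypothetical representation). The returned matching covers every
    first-part vertex matched by m_s and every second-part vertex matched
    by m_t, using only edges of the two input matchings.
    """
    covered_by_ms = set(m_s.values())
    partner_in_mt = {t: s for s, t in m_t.items()}
    combined = dict(m_s)  # start from m_s, which covers its first-part vertices

    for t in partner_in_mt:
        if t in covered_by_ms:
            continue  # not the start of a path that must be switched
        # t is a path endpoint covered only by m_t: walk the alternating
        # path and replace its m_s edges by its m_t edges.
        while t is not None and t in partner_in_mt:
            s = partner_in_mt[t]
            displaced = m_s.get(s)  # vertex that loses its m_s edge, if any
            combined[s] = t
            t = displaced
    return combined


if __name__ == "__main__":
    print(mendelsohn_dulmage({"s1": "t1"}, {"s1": "t2", "s2": "t1"}))
```

Each vertex is visited at most once across all walks, so the sketch runs in time linear in the size of the two matchings.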

5.1 The Approximation Algorithm

The Approximation Algorithm (displayed in Algorithm 2) calls a Restricted Bipartite Matching algorithm (in turn displayed in Algorithm 3) which solves a one-side weighted MVM in a bipartite graph. The latter algorithm matches unmatched vertices (in the weighted vertex part) in decreasing order of weights. From each unmatched vertex $u$, the algorithm searches for an unmatched vertex $v$ (which is unweighted) by a shortest augmenting path of length at most three. If it finds such a path, then the matching is augmented by the path; if it fails to find such a path, then the vertex $u$ is not considered again in the algorithm.

After solving the two Restricted Bipartite Matching problems, the algorithm invokes the Mendelsohn-Dulmage theorem to compute a final matching in which the matched vertices from the weighted part of each problem are included. We will prove that this algorithm computes a 2/3-approximation to the MVM, and that it can be implemented in $O(|E| + |V| \log |V|)$ time.

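procedure Bipartite-TwoThird-Approx($G = (S, T, E)$, $w$)
     $M_S \leftarrow$ Restricted-Bipartite-Match($S$, $T$, $E$, $w$ restricted to $S$)
     $M_T \leftarrow$ Restricted-Bipartite-Match($T$, $S$, $E$, $w$ restricted to $T$)
     $M \leftarrow$ MendelsohnDulmage($M_S$, $M_T$)
end procedure
Algorithm 2 Input: A bipartite graph $G = (S, T, E)$ with weights $w$ on the vertices. Output: A matching $M$. Effect: Computes a 2/3-approximation to a maximum vertex-weighted matching.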
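procedure Restricted-Bipartite-Match($S$, $T$, $E$, $w$)
     $M \leftarrow \emptyset$
     $Q \leftarrow S$
     while $Q \neq \emptyset$ do
         $u \leftarrow$ a heaviest vertex in $Q$
         $Q \leftarrow Q \setminus \{u\}$
         Find a shortest augmenting path $P$ of length at most $3$ starting at $u$
         if $P$ is found then
              $M \leftarrow M \oplus P$
         end if
     end while
end procedure
Algorithm 3 Input: A bipartite graph $G = (S, T, E)$ with weights only on one vertex part $S$. Output: A matching $M$. Effect: Computes a 2/3-approximation to a maximum vertex-weighted matching in a bipartite graph.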
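To make the restricted matching step concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' C++ code: the name `restricted_bipartite_match`, the adjacency-dictionary input `adj` (weighted vertex to list of unweighted neighbors), and the per-vertex pointer `ptr` are all hypothetical choices. The sketch processes weighted vertices in non-increasing order of weight, tries a path of length one first, and then a path of length three.

```python
def restricted_bipartite_match(adj, weight):
    """Match weighted vertices using augmenting paths of length <= 3.

    adj: dict mapping each weighted vertex to a list of its (unweighted)
    neighbors; weight: dict of non-negative weights for the keys of adj.
    Returns a dict weighted vertex -> matched neighbor.
    """
    mate_s = {}                  # weighted vertex -> unweighted partner
    mate_t = {}                  # unweighted vertex -> weighted partner
    ptr = {u: 0 for u in adj}    # first neighbor of u not yet known to be matched

    def next_free(u):
        # Matched neighbors stay matched, so the pointer only moves forward.
        nbrs = adj[u]
        while ptr[u] < len(nbrs) and nbrs[ptr[u]] in mate_t:
            ptr[u] += 1
        return nbrs[ptr[u]] if ptr[u] < len(nbrs) else None

    for u in sorted(adj, key=lambda x: weight[x], reverse=True):
        v = next_free(u)
        if v is not None:                 # augmenting path of length one
            mate_s[u], mate_t[v] = v, u
            continue
        for v in adj[u]:                  # every neighbor of u is matched
            u2 = mate_t[v]
            v2 = next_free(u2)
            if v2 is not None:            # path u - v - u2 - v2 of length three
                mate_s[u2], mate_t[v2] = v2, u2
                mate_s[u], mate_t[v] = v, u
                break
        # if no short path exists, u is never considered again
    return mate_s


if __name__ == "__main__":
    graph = {"a": ["x", "y"], "b": ["x"]}
    print(restricted_bipartite_match(graph, {"a": 5, "b": 3}))
```

In the usage example, `a` first takes `x`; when `b` finds `x` occupied, the path `b - x - a - y` of length three re-matches `a` to `y`, so both weighted vertices end up matched.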

5.2 Time Complexity of the 2/3-Approximation Algorithm

Theorem 4

The 2/3-approximation algorithm has time complexity $O(m + n \log n)$, where $n$ is the maximum of $|S|$ and $|T|$, and $m$ is the number of edges in the bipartite graph $G$.

Proof

We will establish the time complexity for the restricted bipartite graph with nonzero weights on the $S$ vertices. An identical result holds for the graph with nonzero weights on the $T$ vertices. The cost of computing the final matching via the Mendelsohn-Dulmage theorem is $O(n)$, since it needs to work with only the symmetric difference of the two matchings. The $O(n \log n)$ term in the complexity comes from sorting the weights on the vertices in decreasing order.

In each iteration of the while loop, we choose an unmatched vertex $u$, and examine all neighbors of $u$. If we find an unmatched vertex $v$, then we can match the edge $(u, v)$ and proceed to the next iteration. In this case, when an augmenting path of length one suffices to match $u$, the time complexity is proportional to the degree of $u$, and hence summed over all unmatched vertices this is $O(m)$.

Now we consider alternating paths of length (edges) from ; we search for an augmenting path among these. Denote the number of such paths from by . Let us denote a generic alternating path of length three by , , , , for , , , . Furthermore, suppose one of these paths, , , , is an augmenting path. After augmentation, we have the two matched edges and and the unmatched edge .

The cost of examining the neighbors of the unmatched vertices is clearly . Once we reach a matched neighbor of , then we take the matched edge , and then search the neighbors of for an unmatched neighbor. Consider the neighbors of the vertex . If we find an unmatched neighbor , then we have an augmenting path, we match the edge , and we end the search. Once a neighbor of is matched, since it stays matched for the rest of the algorithm, we need not examine it again. If we find a matched neighbor of , then it cannot lead to an augmenting path of length three, and we can examine the next neighbor in the adjacency list. At each step, we can maintain a pointer to the first unexamined neighbor of in the adjacency list of in the algorithm, and continue the search for an unmatched neighbor of from that vertex. This means that we go through the adjacency list of any matched vertex in at most once, and thus the cost of searching these vertices at a distance two edges from unmatched vertices in is .
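The accounting in this argument can be summarized in one line. The display below is only a sketch under assumed notation: $n$ and $m$ denote the numbers of vertices and edges, $\deg(x)$ is the degree of a vertex $x$, and $S$ is the weighted vertex part. These symbols are shorthand introduced here for illustration.

```latex
\underbrace{O(n \log n)}_{\text{sorting the weights}}
\;+\; \underbrace{\sum_{u \in S} O(\deg(u))}_{\text{neighbors of each source vertex}}
\;+\; \underbrace{\sum_{u' \in S} O(\deg(u'))}_{\text{pointer-guided scans at distance two}}
\;=\; O(m + n \log n)
```

The third term is linear because the pointer into each adjacency list only moves forward: a neighbor that has been found matched is never examined again.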

This completes the proof.

6 Correctness of the 2/3-Approximation Algorithm

Figure 2: The three cases for augmenting paths of length one or three. In each case, is a matching, is an -unmatched vertex belonging to the vertex set , is an -augmenting path that joins (a heaviest -unmatched vertex from ) and (an arbitrary vertex in that can be reached by an -augmenting path of length one or three from ). The augmented matching . Matched edges are drawn as solid edges, and unmatched edges are drawn as dashed edges. In all three cases, the presence of an -augmenting path of length three from the vertex implies the presence of an -augmenting path of length one or three from as well.
Figure 3: The three cases for increasing paths of length two. In each case, is a matching, and is an -augmenting path that joins (a heaviest -unmatched vertex from ) and (an arbitrary vertex in that can be reached by an -augmenting path of length one or three from ). The augmented matching . Matched edges are drawn as solid edges, and unmatched edges are drawn as dashed edges. In all three cases, the absence of an -increasing path of length two joining the vertices and implies the absence of an -increasing path of length two as well.

The following is the result we wish to prove in this Section.

Theorem 5

Let $G = (S, T, E)$ be a bipartite graph and $w$ a non-negative weight function on its vertices. Then Algorithm 2 computes a 2/3-approximation for the MVM problem.

This is technically the most demanding section in this paper, and the reader could skip it in a first reading of the paper without loss of understanding. In order to prove this result, we need two supplementary results.

Lemma 2

Let $G = (S, T, E)$ be a bipartite graph, and $w$ be a weight function such that $w(t) = 0$ for every vertex $t \in T$. Let $M$ be a matching in $G$, and $u$ an $M$-unmatched vertex. Suppose that (i) there is no $M$-augmenting path of length one or three from the vertex $u$, and that (ii) there is no $M$-increasing path of length two in $G$. Let $P$ denote an $M$-augmenting path of length at most three with one endpoint $u'$, a heaviest $M$-unmatched vertex, and let the other endpoint of $P$ be $v'$. If $M' = M \oplus P$ denotes the augmented matching, then (i) there is no $M'$-augmenting path of length one or three from the vertex $u$, and (ii) there is no $M'$-increasing path of length two in $G$.

Proof

We will first consider the case of augmenting paths from the vertex of the specified length, and then consider increasing paths of length two.

Since there is no -augmenting path of length one from the -unmatched vertex , all neighbors of are matched under . Since is an augmenting path, all vertices matched in continue to be matched in the matching , and thus there cannot be an -unmatched edge incident on , and thus no -augmenting path of length one from .

Now suppose that there exists an -augmenting path of length three from the vertex . Since no such path exists with respect to the matching , the augmenting path must have some vertex adjacent to the vertex . There are three possible cases for the -augmenting path of length one or three that joins a heaviest -unmatched vertex to some vertex . The three cases are illustrated in Figure 2.

In the first case, , in the second case , and in the third case, . In all three cases we can see that the existence of an -augmenting path of length one or three from implies the existence of an -augmenting path of length one or three from as well. This contradiction proves the result regarding short augmenting paths.

We turn to increasing paths of length two. Again, we suppose that there is an -increasing path of length two joining two vertices belonging to denoted by and . There are three cases to consider as illustrated in Figure 3.

In the first case, since and are both -unmatched, by choice of as a heaviest unmatched vertex, we have . Hence the path cannot be -increasing.

In the second case, we have since there is no -increasing path of length two. Since both vertices and are -unmatched, we have . Combining the two inequalities, we have , and again the path is not an -increasing path.

In the third case, we have since both vertices and are -unmatched. Then again, the path cannot be -increasing.

The contradictions obtained in all three cases complete the proof for short increasing paths.

To compare the weight of a maximum vertex-weighted matching $M^*$ with another matching $M$, we consider the symmetric difference of these two matchings. The subgraph induced by these two matchings consists of cycles and paths. Each cycle in this subgraph has all of its vertices matched in both matchings, so these do not contribute to the difference in their weights. Consider a path in this subgraph that begins with a vertex $u$ that is matched in the optimal matching $M^*$ but not in the suboptimal matching $M$. Here we choose $u$ to be a vertex that is weighted in the restricted bipartite matching problem. If the path has odd length, then it ends in a vertex $v$ also matched in $M^*$ but not in $M$. The vertex $v$ belongs to the unweighted vertex part. We call the vertex $u$ a failure, for the suboptimal algorithm failed to match it while the optimal algorithm succeeds in matching it, and since $u$ is responsible for the lower weight of the suboptimal matching $M$.

In the subgraph considered above, we cannot have a path of odd length with both of its terminal vertices belonging to $M$ but not $M^*$, for we could use such a path to augment the optimal matching $M^*$. If a path beginning with the vertex $u$ matched in $M^*$ but not $M$ has even length, then it ends in a vertex $u'$ matched in $M$ but not $M^*$. If $w(u) > w(u')$, then this path contributes to a lower weight for the suboptimal matching $M$. The approximation algorithm we have described does not permit the existence of increasing paths, and so we do not need to consider this here. We also cannot have $w(u) < w(u')$, for then we would have an $M^*$-increasing path, contradicting the optimality of $M^*$.

We now focus on the vertices we have called failures. The idea is to show that failures are light and rare relative to other vertices matched in the suboptimal matching, so that we can compensate for the failures through these vertices. For every failure, if we have a sufficiently large set of compensating vertices , and these sets of vertices are disjoint, then we can establish an approximation ratio for the suboptimal matching.

Lemma 3

Let $G$ be a graph, $w$ be a weight function, $M^*$ an MVM in $G$, $M$ any other matching, and $k$ a positive integer. If for every failure $u$, there is a vertex-disjoint set $C(u)$ of $k$ $M$-matched vertices such that $w(c) \ge w(u)$ for all $c \in C(u)$, then the matching $M$ is a $k/(k+1)$-approximate solution for the MVM problem.

Proof

Enumerate the failures as $u_1, u_2, \ldots, u_f$, and the set of compensating vertices for $u_i$ as $c_{i,j}$, for $j = 1, 2, \ldots, k$. We can assume that all the compensating vertices are matched in both $M^*$ and $M$, which corresponds to the worst-case scenario for the approximation ratio.

We consider the inequalities that state that failures are light relative to their compensating vertices, $w(u_i) \le w(c_{i,j})$ for $j = 1, 2, \ldots, k$, and sum them over $j$, to obtain

$$k\, w(u_i) \le \sum_{j=1}^{k} w(c_{i,j}).$$

We add $k \sum_{j=1}^{k} w(c_{i,j})$ to both sides to obtain

$$k \Bigl( w(u_i) + \sum_{j=1}^{k} w(c_{i,j}) \Bigr) \le (k+1) \sum_{j=1}^{k} w(c_{i,j}).$$

Note that the left-hand side of the inequality counts the weight of some of the matched vertices in the optimal matching $M^*$, and the right-hand side counts the weight of some of the matched vertices in the suboptimal matching $M$. We sum this last inequality over all failures:

$$k \sum_{i=1}^{f} \Bigl( w(u_i) + \sum_{j=1}^{k} w(c_{i,j}) \Bigr) \le (k+1) \sum_{i=1}^{f} \sum_{j=1}^{k} w(c_{i,j}).$$

The vertices not included on either side of this inequality are vertices that are matched in both matchings. We add $k$ times the sum of the weights of these latter vertices to the left-hand side and $k+1$ times this sum to the right-hand side of the inequality, and obtain

$$k\, w(M^*) \le (k+1)\, w(M).$$

Rearranging, we find

$$w(M) \ge \frac{k}{k+1}\, w(M^*).$$

We need to make an argument to charge the weight of a failed vertex to the set of compensating vertices, since these vertices are found from an alternating path constructed from the optimal matching and the current matching in the approximation algorithm. As the latter matching changes, the vertices on the alternating path from a failed vertex can change as well. The vertices on the alternating path for the failure at a current step in the approximation algorithm might already have been charged for earlier failures, and hence we need a careful counting argument to find the set of compensating vertices to charge for a failure.

Proof of Theorem 5

Proof

Recall that we solve the problem by solving two separate matching problems, one with weights only on the vertices in $S$ and the second with weights only on the vertices in $T$. Using the Mendelsohn-Dulmage Theorem, we then combine the two matchings to find a matching that matches all the matched vertices in $S$ from the first matching, and all the matched vertices in $T$ from the second matching.

Let us consider the matching problem with weights on the vertex set $S$. Let $M$ be the matching computed by the Approximation algorithm, and $M^*$ be a matching of maximum vertex weight. We consider failures, i.e., vertices in $S$ that are matched in the optimal matching but not in the approximate matching. We will show that every failure is compensated by two vertices in $S$ that are matched by $M$ and are at least as heavy as the failed vertex. These sets of compensating vertices are vertex-disjoint, and this leads to the 2/3-approximation.

The Approximation algorithm considers vertices to match in non-increasing order of weights. If a short augmenting path (of length one or three) is found from an unmatched vertex $u$, then the algorithm augments the matching, and $u$ is matched. If the algorithm fails to find a short augmenting path from $u$, then it does not search for an augmenting path from the vertex $u$ again. At the end of this step, we will say that the vertex $u$ has been processed.

Let be the number of short augmenting operations in the Approximation algorithm, and let the matchings in the sequence of short augmentations be indexed as , for , , . For , let denote the augmenting path used to augment the matching to , and let denote the source of the augmenting path and denote its destination.

First, we induct on the augmentation step to show that:
(1) no -augmenting path of length one or three exists from any vertices that are -unmatched and have been processed prior to this augmentation step.
(2) no -increasing path of length two exists in .

The base case is , and these results hold trivially since no vertex has been processed yet, and no vertices are matched. If the induction hypothesis holds at the beginning of augmentation step , then by Lemma 2 the hypothesis holds at the beginning of step as well, since we match a currently heaviest unmatched and unprocessed vertex at this step.

When a vertex is marked as a failure (i.e., as a vertex that has been processed and is -unmatched), then the length of any augmenting or increasing path is at least four. We will make use of this fact in a second inductive argument.

We enumerate the failures in order of their processing time: the vertex is the -th failure, and denotes the total number of failures. The second inductive argument is on the number of failures , where . Denote the matching at this step by (the matching associated with the -th failure). At step , we consider all failures up to this point, including .
Claim: For every failure with , there are two -matched vertices in labeled and that are heavier than . Hence , and .

We prove the Claim by induction again. The base case of the induction hypothesis is . Consider the situation when the vertex is processed and is marked as a failure. The current matching is , and we consider the symmetric difference . The vertex is an endpoint for an alternating path in the subgraph induced by the edges in the symmetric difference (the edges belong alternately to the matching and ), and its length is at least four. Denote the vertex at distance two from by and the vertex at distance four from by . These vertices are matched in and hence were processed earlier than , and are hence at least as heavy as . Thus the induction hypothesis holds for .
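The way candidate compensating vertices are read off an alternating path can be sketched in a few lines of Python. This is an illustration only: the function name `compensating_candidates` and the dictionary representation of the optimal matching (`opt`) and the current matching (`cur`), each mapping a weighted vertex to its unweighted partner, are assumptions and not part of the paper.

```python
def compensating_candidates(opt, cur, failure):
    """Return the weighted vertices at distance two and four from `failure`
    on the alternating path that uses optimal-matching edges out of the
    weighted side and current-matching edges back to it.

    opt, cur: dicts weighted vertex -> unweighted partner (hypothetical
    representation). Fewer than two vertices are returned if the path is
    shorter than four edges.
    """
    cur_partner = {t: s for s, t in cur.items()}
    found = []
    vertex = failure
    for _ in range(2):
        t = opt.get(vertex)            # optimal-matching edge leaving vertex
        if t is None or t not in cur_partner:
            break                      # the path ends here
        vertex = cur_partner[t]        # current-matching edge back
        found.append(vertex)
    return found


if __name__ == "__main__":
    opt = {"u": "t1", "a": "t2", "b": "t3"}
    cur = {"a": "t1", "b": "t2"}
    print(compensating_candidates(opt, cur, "u"))  # vertices at distance 2 and 4
```

In the example, the path `u - t1 - a - t2 - b` has length four, so `a` and `b` are the two candidates for the failure `u`.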

Assume that the induction hypothesis is true for some , and consider the case for , when a vertex is processed and becomes a failure. The current matching is , and by forming the symmetric difference we see that every failure with , , is an endpoint for an alternating path , whose length is at least four (by the first inductive argument). Denote the vertices at distances two and four from by and , respectively.

When the graph is bipartite (which is the case here), we claim that these vertices form distinct pairs. Consider an alternating path from a failed vertex when edges are chosen from . The vertex is matched in the optimal matching but not in the current matching , and it belongs to . Every vertex from reached by this alternating path is reached by an edge that belongs to the optimal matching, while every vertex in reached by this path (other than ) is reached by an edge that belongs to the current matching. Hence it is clear that this path cannot reach another failed vertex , since such a vertex is not matched in the current matching. Thus these sets of alternating paths from the failed vertices are vertex-disjoint.

Define , and . (The first set consists of vertices we find from each failure using alternating paths from the optimal matching and the current matching. The second set consists of the vertices that have been charged for the failures prior to the current step.) Then and since these elements are distinct. The set is not necessarily contained in the set , and hence . Thus we can choose two distinct vertices from the set (and matched in ) to associate with the failure . Denote them by and . Since these vertices are processed earlier than , they are at least as heavy as . Thus the induction hypothesis holds for the -st failure also.

Now we can apply Lemma 3 with $k = 2$ to obtain the 2/3-approximation bound.

This completes the proof.

7 A 1/2-Approximation Algorithm for MVM in Bipartite Graphs

In this Section we discuss a Greedy 1/2-approximation algorithm for the vertex-weighted matching problem on bipartite graphs. Our intent is to compare the 2/3-approximation algorithm from the previous Sections with this algorithm, and hence our discussion will be brief. Also, the algorithm discussed here could be adapted to non-bipartite graphs in a straightforward manner. The reason we discuss the bipartite version here is that the specialized algorithm for bipartite graphs is more efficient in practice. The algorithm solves two Restricted Bipartite Matching (one-side weighted) problems and then invokes the Mendelsohn-Dulmage theorem, as in the 2/3-approximation algorithm.

The Greedy 1/2-approximation algorithm has only one change from Algorithm 3. In each iteration of the while loop, it finds an unmatched neighbor $v$ of the currently heaviest unmatched vertex $u$ (an augmenting path of length $1$ instead of $3$). Recall that since only one vertex set in the bipartite graph is weighted in the Restricted Bipartite Matching problem, it can choose any unmatched neighbor of $u$, and does not need to look for the heaviest such neighbor.
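A minimal Python sketch of this Greedy variant for one restricted (one-side weighted) problem follows. The function name `greedy_restricted_match` and the adjacency-dictionary input are hypothetical choices for illustration, not the authors' implementation.

```python
def greedy_restricted_match(adj, weight):
    """Greedy matching using only augmenting paths of length one.

    adj: dict mapping each weighted vertex to a list of its unweighted
    neighbors; weight: dict of non-negative weights for the keys of adj.
    Returns a dict weighted vertex -> matched neighbor.
    """
    mate = {}
    taken = set()  # unweighted vertices that are already matched
    for u in sorted(adj, key=lambda x: weight[x], reverse=True):
        for v in adj[u]:
            if v not in taken:      # any unmatched neighbor will do
                mate[u] = v
                taken.add(v)
                break
    return mate


if __name__ == "__main__":
    graph = {"a": ["x", "y"], "b": ["x"]}
    print(greedy_restricted_match(graph, {"a": 5, "b": 3}))
```

On this small instance the Greedy sketch matches only `a` (to `x`) and leaves `b` unmatched, whereas a search that also allows paths of length three could re-match `a` to `y` and cover both vertices.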

The following Lemma and Theorem show that this Greedy algorithm is a 1/2-approximation algorithm for the MVM problem on bipartite graphs.

Lemma 4

Let $G = (S, T, E)$ be a bipartite graph, and $w$ be a weight function such that $w(t) = 0$ for every vertex $t \in T$. Let $M$ be a matching in $G$ and $u$ be an $M$-unmatched vertex. Suppose that (i) there is no $M$-augmenting path of length one from $u$, and (ii) there is no $M$-increasing path of length two in $G$. Let $(u', v')$ denote an unmatched edge, where $u'$ is a heaviest $M$-unmatched vertex and $v'$ is an $M$-unmatched vertex in $T$. If $M' = M \cup \{(u', v')\}$ denotes the augmented matching, then (i) there is no $M'$-augmenting path of length one from the vertex $u$, and (ii) there is no $M'$-increasing path of length two in $G$.

The proof of this Lemma is similar to that of Lemma 2, and hence is omitted.

Theorem 6

Let $G = (S, T, E)$ be a bipartite graph and $w$ a non-negative weight function on its vertices. Then the Greedy algorithm computes a 1/2-approximation for the MVM problem on $G$.

The proof is by induction on the number of augmentations, using Lemma 4 at each augmenting step, and is similar to the proof of Theorem 1. It is again omitted.

8 Experiments and Results

8.1 Experimental Setup

For the experiments, we used an Intel Xeon E5-2660 processor based system (part of the Purdue University Community Cluster), called Rice. The machine consists of two processors, each with ten cores running at 2.6 GHz (20 cores in total) with MB unified L3 cache and 64 GB of memory. The operating system is Red Hat Enterprise Linux release 6.9. All code was developed using C++ and compiled using the g++ compiler (version: ) using the -O3 flag.

| Graph | Vertices in S | Max. degree (S) | Mean degree (S) | Vertices in T | Max. degree (T) | Mean degree (T) | Edges |
|---|---|---|---|---|---|---|---|
| Trec10 | 106 | 134 | 81.25 | 478 | 79 | 18.02 | 8,612 |
| IG5-16 | 18,485 | 990 | 31.83 | 18,846 | 120 | 31.22 | 588,326 |
| fxm3_16 | 41,340 | 57 | 9.49 | 85,575 | 36 | 4.58 | 392,252 |
| JP | 67,320 | 8,980 | 204.02 | 87,616 | 390 | 156.76 | 13,734,559 |
| flower_8_4 | 55,081 | 15 | 6.81 | 125,361 | 3 | 2.99 | 375,266 |
| spal_004 | 10,203 | 6,029 | 4524.96 | 321,696 | 168 | 143.52 | 46,168,124 |
| pds-50 | 83,060 | 96 | 7.11 | 275,814 | 3 | 2.14 | 590,833 |
| image_interp | 120,000 | 6 | 5.93 | 240,000 | 5 | 2.97 | 711,683 |
| kneser_10_4_1 | 330,751 | 3 | 3.00 | 349,651 | 16 | 2.84 | 992,252 |
| 12month1 | 12,471 | 75,355 | 1814.19 | 872,622 | 3,420 | 25.93 | 22,624,727 |
| IMDB | 428,440 | 1,334 | 8.83 | 896,308 | 1,590 | 4.22 | 3,782,463 |
| GL7d16 | 460,261 | 114 | 31.48 | 955,128 | 64 | 15.17 | 14,488,881 |
| wheel_601 | 723,605 | 3 | 3.00 | 902,103 | 602 | 2.41 | 2,170,814 |
| Rucci1 | 109,900 | 108 | 70.89 | 1,977,885 | 4 | 3.94 | 7,791,168 |
| LargeRegFile | 801,374 | 655,876 | 6.17 | 2,111,154 | 4 | 2.34 | 4,944,201 |
| GL7d20 | 1,437,547 | 395 | 20.79 | 1,911,130 | 43 | 15.64 | 29,893,084 |
| GL7d18 | 1,548,650 | 69 | 22.98 | 1,955,309 | 73 | 18.20 | 35,590,540 |
| GL7d19 | 1,911,130 | 121 | 19.53 | 1,955,309 | 54 | 19.09 | 37,322,725 |
| relat9 | 549,336 | 227 | 70.91 | 12,360,060 | 4 | 3.15 | 38,955,420 |
Table 1: The set of bipartite graphs which are our test problems. The problems are listed in increasing order of the total number of vertices.

Our test set consists of nineteen real-world bipartite graphs taken from the University of Florida Matrix collection [6] covering several application areas. We chose the largest rectangular matrices in the collection, and then added a few smaller matrices. Table 1 gives some statistics on our test set. The bipartite graphs are listed in increasing order of the total number of vertices. The largest number of vertices of any graph is nearly 13 million, and the largest number of edges is 46 million. For each vertex set in the bipartition we list the cardinality of the set, and the maximum and average vertex degrees. The average degree varies from two to four thousand, and hence the graphs are diverse with respect to their degree distributions. The weights of the vertices were generated as random integers in the range .

We compare the performance of six different exact and approximate matching algorithms. Two of these are an exact maximum edge-weighted matching algorithm (MEM), and an exact maximum vertex-weighted matching algorithm (MVM). Two are the 2/3- and 1/2-approximate MVM algorithms based on finding short augmenting paths that we have discussed in this paper. The last two algorithms are obtained from the $(1-\epsilon)$-approximation algorithm for MEM based on the scaling approach of Duan and Pettie, where we have chosen $\epsilon$ equal to $1/3$ and $1/6$ to obtain 2/3- and 5/6-approximation algorithms.

The MEM algorithm is a primal-dual algorithm for sparse bipartite graphs with time complexity [13], which has been implemented in the Matchbox software by our research group. We apply the MEM algorithms to the vertex-weighted matching problems by assigning to each edge the sum of the weights of its endpoints. The Exact MVM algorithm we have implemented is Algorithm 1, and not the Spencer and Mayr algorithm, for the following reasons: The former algorithm is easy to implement, and has good performance, while the latter is more complicated to implement. As can be seen from the earlier work on matchings discussed in Section 1, asymptotically fastest algorithms are not necessarily the fastest algorithms in practice. Finally, our focus in this paper is on the approximation algorithms.
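The reduction from vertex weights to edge weights described above is simple enough to sketch directly. The Python below is illustrative only; the function names and the edge-list representation are assumptions, not part of the Matchbox software.

```python
def edge_weights_from_vertex_weights(edges, weight):
    """Assign to each edge the sum of the weights of its endpoints.

    edges: iterable of (u, v) pairs; weight: dict vertex -> weight.
    """
    return {(u, v): weight[u] + weight[v] for (u, v) in edges}


def matching_vertex_weight(matching, weight):
    """Vertex weight of a matching: every matched vertex is counted once,
    which equals the sum of the derived edge weights over the matching."""
    return sum(weight[u] + weight[v] for (u, v) in matching)


if __name__ == "__main__":
    w = {"a": 5, "b": 3, "x": 2, "y": 0}
    ew = edge_weights_from_vertex_weights([("a", "x"), ("b", "x"), ("b", "y")], w)
    print(ew)
    print(matching_vertex_weight([("a", "x"), ("b", "y")], w))
```

Because the edges of a matching share no endpoints, the edge weight of a matching under this assignment equals its vertex weight, so a maximum edge-weighted matching on the derived weights is also a maximum vertex-weighted matching.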

| Graph | Exact algorithms (absolute weight) | Aug. path 2/3-Appr. (relative weight) | Aug. path 1/2-Appr. (relative weight) | Scaling 2/3-Appr. (relative weight) | Scaling 5/6-Appr. (relative weight) |
|---|---|---|---|---|---|
| Trec10 | 1.43E+04 | 0.999 | 0.988 | 0.942 | 0.984 |
| IG5-16 | 1.16E+07 | 0.987 | 0.933 | 0.906 | 0.928 |
| fxm3_16 | 5.06E+07 | 0.995 | 0.963 | 0.962 | 0.971 |
| JP | 3.42E+07 | 0.989 | 0.956 | 0.924 | 0.956 |
| flower_8_4 | 6.99E+07 | 0.990 | 0.963 | 0.958 | 0.969 |
| spal_004 | 1.51E+07 | 1.000 | 0.996 | 0.880 | 0.923 |
| pds-50 | 1.06E+08 | 0.996 | 0.980 | 0.953 | 0.968 |
| image_interp | 1.48E+08 | 0.993 | 0.965 | 0.933 | 0.946 |
| kneser_10_4_1 | 3.36E+08 | 0.996 | 0.960 | 0.962 | 0.964 |
| 12month1 | 1.82E+07 | 0.999 | 0.991 | 0.875 | 0.921 |
| IMDB | 3.04E+08 | 0.987 | 0.927 | 0.927 | 0.940 |
| GL7d16 | 5.76E+08 | 0.995 | 0.942 | 0.988 | 0.994 |
| wheel_601 | 7.84E+08 | 0.990 | 0.903 | 0.931 | 0.947 |
| Rucci1 | 1.62E+08 | 0.999 | 0.997 | 0.918 | 0.954 |
| LargeRegFile | 9.72E+08 | 0.998 | 0.979 | 0.957 | 0.969 |
| GL7d20 | 1.61E+09 | 0.998 | 0.948 | 0.990 | 0.994 |
| GL7d18 | 1.70E+09 | 0.993 | 0.921 | 0.995 | 0.996 |
| GL7d19 | 1.92E+09 | 0.994 | 0.926 | 0.994 | 0.995 |
| relat9 | 4.08E+08 | 1.000 | 0.999 | 0.910 | 0.948 |
| Geom. Mean | 1.00 | 0.995 | 0.960 | 0.942 | 0.961 |
Table 2: Comparing the weight of the matchings computed by six different algorithms. The Exact MVM and MEM algorithms compute the same matching, and for these we report the absolute values of these quantities. The results of the four approximation algorithms are reported as the ratio of the weight to the weight of the exact algorithms.
| Graph | Exact algorithms (absolute cardinality) | Aug. path 2/3-Appr. (relative cardinality) | Aug. path 1/2-Appr. (relative cardinality) | Scaling 2/3-Appr. (relative cardinality) | Scaling 5/6-Appr. (relative cardinality) |
|---|---|---|---|---|---|
| Trec10 | 106 | 1.000 | 1.000 | 1.000 | 1.000 |
| IG5-16 | 9,519 | 1.000 | 1.000 | 0.953 | 0.965 |
| fxm3_16 | 41,340 | 1.000 | 0.999 | 0.994 | 0.996 |
| JP | 26,137 | 0.994 | 0.979 | 0.978 | 0.980 |
| flower_8_4 | 55,081 | 1.000 | 0.999 | 0.997 | 0.998 |
| spal_004 | 10,203 | 1.000 | 1.000 | 1.000 | 1.000 |
| pds-50 | 82,837 | 1.000 | 1.000 | 0.996 | 0.996 |
| image_interp | 120,000 | 1.000 | 1.000 | 0.999 | 0.999 |
| kneser_10_4_1 | 323,401 | 0.999 | 0.969 | 0.970 | 0.974 |
| 12month1 | 12,418 | 1.000 | 1.000 | 0.999 | 0.999 |
| IMDB | 250,516 | 0.992 | 0.958 | 0.955 | 0.962 |
| GL7d16 | 460,091 | 1.000 | 0.999 | 1.000 | 1.000 |
| wheel_601 | 723,005 | 1.000 | 0.930 | 0.940 | 0.960 |
| Rucci1 | 109,900 | 1.000 | 1.000 | 1.000 | 1.000 |
| LargeRegFile | 801,374 | 1.000 | 0.999 | 0.998 | 0.999 |
| GL7d20 | 1,437,546 | 1.000 | 0.992 | 0.999 | 1.000 |
| GL7d18 | 1,548,499 | 1.000 | 0.974 | 0.999 | 0.999 |
| GL7d19 | 1,911,130 | 0.998 | 0.920 | 0.965 | 0.971 |
| relat9 | 274,667 | 1.000 | 1.000 | 1.000 | 1.000 |
| Geom. Mean | 1.00 | 0.999 | 0.985 | 0.986 | 0.989 |
Table 3: Comparing the cardinality of the matchings computed by six different algorithms. The Exact MVM and MEM algorithms compute the same matching, and for these we report the absolute values of these quantities. The results of the four approximation algorithms are reported as the ratio of the cardinality to cardinality of the exact algorithms.

We compare the weights computed by the six algorithms in Table 2. Both exact MEM and MVM algorithms compute the same matching, and hence we report one set of weights and cardinalities for these algorithms. We report the absolute weight obtained by the exact algorithms, and for the approximation algorithms report the fraction of the maximum weight obtained by them. We report the cardinality of the matchings computed by the six algorithms in Table 3. The results are reported in a format similar to that for the weights.

Ten of these graphs have their MVM corresponding to $S$-perfect matchings, i.e., the cardinality of the MVM is equal to the cardinality of the smaller vertex set $S$. There are also four graphs where the cardinality is lower than almost half the value of $|S|$: IG5-16, JP, IMDB, and relat9. All four approximation algorithms compute weights that are higher than 90% of the maximum weight obtained by the exact algorithm (with two exceptions), much higher than the guaranteed approximation ratios (1/2, 2/3, or 5/6). When we consider the geometric means, the 2/3-approximation MVM algorithm obtains 99.5% of the weight, and 99.9% of the cardinality of the maximum weight matching. The 1/2-approximation MVM algorithm obtains values that are lower, 96.0% for the weight and 98.5% for the cardinality. The scaling algorithms perform worse than the augmenting path-based approximation algorithms: even the 5/6-approximation scaling algorithm obtains only 96.1% of the weight and 98.9% of the cardinality, values that are lower than those of the 2/3-approximation MVM. The relative weights of the matchings computed by three of the approximation algorithms are plotted in Figure 4. The problems are listed in order of increasing relative weight of the 2/3-approximation MVM algorithm.

We compare the run-times of the Exact MEM, Exact MVM, and the four approximation algorithms in Table 4. The time (in seconds) taken by the Exact MEM algorithm to compute the maximum weight matching is reported; for the other five algorithms, we report the relative performance, which is the ratio of the run-time of the MEM algorithm to the run-time of each of the other algorithms. Thus the values in the Table are proportional to the reciprocal running time, and the higher the value, the faster the algorithm.
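The relative performance measure and its geometric mean can be reproduced with a few lines of Python. This is a sketch for illustration: the helper names are hypothetical, and the list below simply copies the relative performance values of the augmenting-path 2/3-approximation algorithm from Table 4.

```python
import math


def relative_performance(t_mem, t_other):
    """Ratio of the Exact MEM run-time to another algorithm's run-time."""
    return t_mem / t_other


def geometric_mean(values):
    """Geometric mean computed through logarithms to avoid overflow."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Relative performance of the augmenting-path 2/3-approximation (Table 4).
REL_TWO_THIRD = [
    24.17, 171.82, 2.26, 322.49, 18.91, 150.48, 1.32, 1.25, 4.36, 78.27,
    6.88, 7.39e3, 1.17, 65.23, 1.97, 5.78e3, 1.34e4, 5.47e3, 877.71,
]

if __name__ == "__main__":
    print(round(geometric_mean(REL_TWO_THIRD), 2))
```

The geometric mean is used rather than the arithmetic mean because the ratios span several orders of magnitude, and a handful of very large values would otherwise dominate the summary.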

The Exact MEM algorithm is fast for the smaller problems, but as the number of vertices and edges gets into the tens of millions, it can require more than 13 hours (on graph GL7d18) to compute the matching. The maximum time needed by the Exact MVM algorithm on any graph is 22 minutes, again for GL7d18; the 2/3-approximation MVM algorithm takes its maximum time of less than 5 seconds on the GL7d19 graph; the scaling algorithms can take 1 minute for relat9 (5/6-approximation) and 40 seconds (2/3-approximation) for the same problem. In terms of geometric means, the exact MVM algorithm is about 12 times faster than the exact MEM algorithm, while the 2/3-approximation MVM is about 63 times faster, and the 1/2-approximation algorithm is about 100 times faster, both relative to the Exact MEM algorithm. The scaling algorithms are only about 7 times faster (2/3-approximation) and about 5 times faster (5/6-approximation) than the exact MEM algorithm. Note also that in general the run-times increase with the size of the graph, but they also depend on how the edges are distributed within the graph.

These results are plotted in Figure 5, where the $y$-axis is in logarithmic scale. Note that for the four largest graphs (GL7d20, GL7d18, GL7d19, and relat9), the 2/3-approximation MVM algorithm is more than 800 times faster than the Exact MEM algorithm. The scaling algorithms are slower than it; relative to the Exact MEM algorithm they perform better on the larger graphs, while they are slower than Exact MEM on several of the smaller graphs. The Exact MVM algorithm tracks the 2/3-approximation MVM algorithm for smaller graphs, but it is much slower for the larger problems.

The exact algorithms for both MEM and MVM are our implementations, and we spent a reasonable amount of effort to make them efficient; however, these are more sophisticated than the approximation algorithms, which are simpler to implement. Hence it might be possible to make the exact algorithms faster with optimizations that we have not considered, but the lower asymptotic time complexities of the approximation algorithms will make them faster in practice, as our results show.

Figure 4: The weights of the matchings computed by the approximation algorithms relative to the Exact MVM algorithm. The $x$-axis lists the nineteen problems in increasing order of the relative weight obtained by the 2/3-approximation MVM algorithm, and the $y$-axis shows the percentage of the weight obtained by three approximation algorithms.
| Graph | Exact MEM, time (s) | Exact MVM (rel. perf.) | Aug. path 2/3-Appr. (rel. perf.) | Aug. path 1/2-Appr. (rel. perf.) | Scaling 2/3-Appr. (rel. perf.) | Scaling 5/6-Appr. (rel. perf.) |
|---|---|---|---|---|---|---|
| Trec10 | 2.36E-3 | 16.71 | 24.17 | 32.11 | 1.67 | 0.87 |
| IG5-16 | 1.50 | 17.84 | 171.82 | 274.39 | 14.09 | 9.54 |
| fxm3_16 | 6.95E-2 | 1.58 | 2.26 | 3.18 | 0.42 | 0.29 |
| JP | 46.82 | 15.75 | 322.49 | 604.48 | 16.50 | 10.62 |
| flower_8_4 | 0.83 | 8.16 | 18.91 | 25.92 | 2.67 | 1.86 |
| spal_004 | 53.57 | 71.87 | 150.48 | 218.84 | 6.36 | 4.54 |
| pds-50 | 0.12 | 0.84 | 1.32 | 1.75 | 0.14 | 0.10 |
| image_interp | 0.13 | 0.86 | 1.25 | 1.61 | 0.23 | 0.16 |
| kneser_10_4_1 | 1.01 | 2.51 | 4.36 | 5.09 | 0.77 | 0.54 |
| 12month1 | 28.24 | 35.90 | 78.27 | 117.15 | 4.70 | 3.12 |
| IMDB | 3.09 | 3.74 | 6.88 | 10.93 | 0.57 | 0.37 |
| GL7d16 | 6.63E+3 | 98.86 | 7.39E+3 | 1.39E+4 | 725.45 | 397.21 |
| wheel_601 | 0.72 | 0.62 | 1.17 | 1.51 | 0.18 | 0.12 |
| Rucci1 | 41.03 | 21.42 | 65.23 | 93.67 | 6.93 | 4.82 |
| LargeRegFile | 2.08 | 1.17 | 1.97 | 2.79 | 0.53 | 0.38 |
| GL7d20 | 1.85E+4 | 385.41 | 5.78E+3 | 1.37E+4 | 746.76 | 482.75 |
| GL7d18 | 5.02E+4 | 37.97 | 1.34E+4 | 3.43E+4 | 1.85E+3 | 1.20E+3 |
| GL7d19 | 2.57E+4 | 162.21 | 5.47E+3 | 1.53E+4 | 1.00E+3 | 689.35 |
| relat9 | 3.63E+3 | 85.38 | 877.71 | 1.35E+3 | 87.64 | 60.44 |
| Geom. Mean | 1.00 | 12.02 | 62.64 | 99.66 | 6.99 | 4.64 |
Table 4: The relative performance of the Exact MVM algorithm and the four approximation algorithms relative to the Exact MEM algorithm. We report run-times of the Exact MEM algorithm in seconds; for all others, we report the relative performance, which is the ratio of the runtime of the Exact MEM algorithm to that of the other algorithms. Hence the value for an algorithm shows how fast it is relative to the Exact MEM algorithm.
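The arithmetic behind Table 4 can be sketched in a few lines of Python. This is only an illustration of the definitions in the caption; the function names are hypothetical and are not taken from our implementation. The worked value uses the GL7d18 row of the table.

```python
import math

def relative_performance(t_mem, t_other):
    """Ratio of the Exact MEM run-time to another algorithm's run-time."""
    return t_mem / t_other

def runtime_from_relative(t_mem, rel):
    """Invert the ratio to recover the other algorithm's run-time in seconds."""
    return t_mem / rel

def geometric_mean(values):
    """Geometric mean, computed through logarithms to avoid overflow."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# GL7d18 row of Table 4: Exact MEM takes 5.02E+4 s and the Exact MVM
# algorithm has relative performance 37.97, i.e. roughly 22 minutes.
t_mvm = runtime_from_relative(5.02e4, 37.97)
print(f"Exact MVM on GL7d18: {t_mvm:.0f} s = {t_mvm / 60:.1f} min")
```

The geometric mean is the natural summary here because the relative performance values span several orders of magnitude across the nineteen graphs.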
Figure 5: The relative performance (proportional to the reciprocal run-time) of the Exact MVM algorithm and the four approximation algorithms relative to the Exact MEM algorithm. The x-axis lists the nineteen problems in increasing order of the relative performance of the 2/3-approximation MVM algorithm, and the y-axis shows the relative performance of the algorithms in logarithmic scale.
| Metric | Original | Random edge | Random edge | Random vertex, summed | Random vertex, summed |
|---|---|---|---|---|---|
| Time (s) | 4.61E0 | 1.366E1 | 1.330E1 | 1.831E4 | 1.001E5 |
| Cardinality | 1,437,546 | 1,437,545 | 1,437,546 | 1,437,546 | 1,437,546 |
| No. augmentations | 1,437,546 | 1,437,545 | 1,437,546 | 1,437,546 | 1,437,546 |
| Aug. path lengths: maximum | 9 | 55 | 61 | 181 | 2383 |
| Aug. path lengths: no. distinct | 5 | 28 | 30 | 74 | 938 |
| Aug. path lengths: mean | 1.009 | 1.896 | 1.896 | 7.126 | 72.55 |
| No. dual updates | 1.350E7 | 7.499E6 | 7.485E6 | 2.728E10 | 3.921E10 |
| Time aug. paths (s) | 3.38 | 9.67 | 9.72 | 1.128E4 | 4.314E4 |
| Time dual updates (s) | 0.42 | 0.26 | 0.27 | 2.369E3 | 4.435E3 |
Table 5: The performance of the Exact MEM algorithm for different choices of weights for the GL7d20 graph. The first column corresponds to the original matrix values, the second and third to random edge weights in the two ranges used, and the fourth and fifth to random vertex weights in the same ranges that are summed to create edge weights. The results show that summed vertex weights lead to large runtimes due to the greatly increased length of the augmenting path searches.

The large running times of the Exact MEM algorithm on MVM problems are due to the mismatch between algorithm and problem. Adding vertex weights to create edge weights causes edges incident on a vertex to have highly correlated weights, and this increases the average length of the augmenting paths searched in the course of the algorithm. Table 5 shows various metrics for the GL7d20 matrix, one of the matrices for which the MEM algorithm takes a long time. We show what happens to these metrics when we use the original matrix weights, two sets of random values for the edge weights, and two sets of random values for the vertex weights that are summed to create edge weights. For the random weights we have used two kinds of values, integers and real numbers, each drawn from a fixed range, to see how the range of weights influences the runtimes of the algorithms. Note that the runtime needed for the first three choices of weights is three to four orders of magnitude smaller than the times for the last two sets of weights. We also break down the time taken for the augmenting path searches and the dual weight updates. The runtime is largely accounted for by the time taken for augmenting path searches in every experiment but the last. The large increase in the runtime for random vertex weights is caused by the increase in the average length of the augmenting paths and, for the case of real-valued weights, by the larger time needed to process real-valued dual weights. Note that the range of weights does not influence the runtimes significantly for edge weights. However, when real-valued weights are used for vertex weights, the algorithm requires nearly 28 hours (1.001E5 seconds), whereas integer weights cause it to take about 5 hours (1.831E4 seconds). We leave a more thorough evaluation of all the causes for this for the future. However, these results support our contention that MVM problems should be solved by algorithms designed specifically for them, rather than by converting them to MEM problems and then using MEM algorithms.
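The transformation from vertex weights to edge weights discussed above can be illustrated with a minimal Python sketch. The graph, the weights, and the function name below are hypothetical and chosen only for illustration; they are not the test data used in the experiments.

```python
def edge_weights_from_vertex_weights(edges, w_left, w_right):
    """Weight each edge (u, v) by w_left[u] + w_right[v]."""
    return {(u, v): w_left[u] + w_right[v] for (u, v) in edges}

# Hypothetical bipartite graph with 2 left vertices and 3 right vertices.
edges = [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2)]
w_left = [5, 3]
w_right = [4, 1, 2]
w_edge = edge_weights_from_vertex_weights(edges, w_left, w_right)

# The edge weight of a matching equals the vertex weight of its endpoints.
matching = [(0, 0), (1, 1)]
edge_total = sum(w_edge[e] for e in matching)
vertex_total = sum(w_left[u] + w_right[v] for (u, v) in matching)
print(edge_total, vertex_total)

# Correlation: edges into the same right vertex from left vertices 0 and 1
# always differ by the same amount, w_left[0] - w_left[1].
print(w_edge[(0, 1)] - w_edge[(1, 1)], w_edge[(0, 2)] - w_edge[(1, 2)])
```

Because every edge incident on a vertex carries that vertex's weight as a common term, the edge weights are far from independent, which is the correlation referred to in the text.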

9 Conclusions

We have described a 2/3-approximation algorithm for MVM in bipartite graphs whose time complexity is O(|E| + |V| log |V|), whereas the time complexity of an Exact algorithm is O(|V| |E|). The algorithm exploits the bipartiteness of the graph by decomposing the problem into two ‘one-side-weighted’ problems, solving them individually, and then combining the two matchings into the final matching. The algorithm also sorts the weights, processing the unmatched vertices in non-increasing order of weights. The approximation algorithm is derived in a natural manner from an algorithm for computing maximum weighted matchings by restricting the length of augmenting paths to at most three.
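The combining step can be sketched as follows, assuming it follows the standard constructive argument for the Mendelsohn-Dulmage theorem [26]: in each connected component of the union of the two matchings, the edges of one matching are kept. This is a minimal Python sketch, not our C++ implementation; the function name and the dictionary representation (S-vertex mapped to its T-mate) are assumptions made for illustration.

```python
def combine_matchings(m_s, m_t):
    """Combine two bipartite matchings, each given as a dict {s: t}.

    Returns a matching drawn from the union of their edges that covers every
    S-vertex matched by m_s and every T-vertex matched by m_t.
    """
    inv_s = {t: s for s, t in m_s.items()}   # T-vertex -> mate under m_s
    inv_t = {t: s for s, t in m_t.items()}   # T-vertex -> mate under m_t
    combined, seen = {}, set()
    for start in list(m_s) + list(m_t):
        if start in seen:
            continue
        # Collect the S-vertices of the component of the union containing start.
        comp, stack = [], [start]
        seen.add(start)
        only_t = False   # is some T-vertex here matched by m_t but not by m_s?
        while stack:
            s = stack.pop()
            comp.append(s)
            for t in {m_s.get(s), m_t.get(s)} - {None}:
                if t in inv_t and t not in inv_s:
                    only_t = True
                for s2 in (inv_s.get(t), inv_t.get(t)):
                    if s2 is not None and s2 not in seen:
                        seen.add(s2)
                        stack.append(s2)
        # Keep the m_t edges only when m_s would leave a required T-vertex free.
        chosen = m_t if only_t else m_s
        for s in comp:
            if s in chosen:
                combined[s] = chosen[s]
    return combined

m_s = {"s1": "t1"}                 # covers the S-vertex s1
m_t = {"s1": "t2", "s2": "t1"}     # covers the T-vertices t1 and t2
print(combine_matchings(m_s, m_t))
```

In the example the union forms a single path t2, s1, t1, s2; since t2 is matched only by m_t, the m_t edges are kept, which still covers s1.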

The 2/3-approximation algorithm has been implemented in C++, and on a set of nineteen graphs, some with millions of vertices and edges, it computes the approximate matchings in less than 5 seconds on a desktop processor. The weight of the approximate matching is greater than 99.5% of the weight of the optimal matching for these problems. A Greedy 1/2-approximation algorithm is faster than the 2/3-approximation algorithm by about a factor of 1.6 (geometric means in Table 4), but the weight it computes is lower (Figure 4). A path on four vertices (v1, v2, v3, v4), where the sum of the weights of the middle vertices v2 and v3 is the heaviest over all consecutive pairs of vertices, is a contrived worst-case example for the Greedy algorithm, whereas the 2/3-approximation algorithm computes the maximum weight matching on it. Several copies of the path can be joined together in a suitable manner to construct larger graphs where the same property holds. Whether the Greedy 1/2- or the 2/3-approximation algorithm is to be preferred, trading off run time for increased weight, would depend on the context in which it is being used. For example, it is known from experience that the 1/2-approximation MEM algorithms do not lead to good orderings for sparse Gaussian elimination. Recent work suggests that implementations of the 2/3-approximation algorithm lead to better matrix orderings for this problem [1].
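The four-vertex path example can be checked with a small Python sketch. Both routines below are simplified, single-pass illustrations written for this example: the Greedy routine assumes vertices are visited heaviest first and matched to their heaviest free neighbor, and the second routine only restricts augmenting paths to length at most three, without the two-sided decomposition of the full algorithm. The vertex names and weights are hypothetical.

```python
def greedy(adj, w):
    """Visit vertices heaviest first; match each to its heaviest free neighbor."""
    mate = {}
    for u in sorted(adj, key=w.get, reverse=True):
        if u in mate:
            continue
        free = [v for v in adj[u] if v not in mate]
        if free:
            v = max(free, key=w.get)
            mate[u], mate[v] = v, u
    return mate

def short_augment(adj, w):
    """Visit vertices heaviest first; augment along a path of length 1 or 3."""
    mate = {}
    for u in sorted(adj, key=w.get, reverse=True):
        if u in mate:
            continue
        candidates = []   # (weight of the free far endpoint, path)
        for v in adj[u]:
            if v not in mate:
                candidates.append((w[v], [u, v]))            # length 1
            else:
                x = mate[v]
                for y in adj[x]:
                    if y not in mate and y != u:
                        candidates.append((w[y], [u, v, x, y]))  # length 3
        if candidates:
            _, path = max(candidates, key=lambda c: c[0])
            # Re-pair consecutive vertices along the path: (u, v) and (x, y).
            for a, b in zip(path[0::2], path[1::2]):
                mate[a], mate[b] = b, a
    return mate

def matching_weight(mate, w):
    """Sum of the weights of all matched vertices."""
    return sum(w[v] for v in mate)

# Path v1 - v2 - v3 - v4 with the middle pair heaviest.
adj = {"v1": ["v2"], "v2": ["v1", "v3"], "v3": ["v2", "v4"], "v4": ["v3"]}
w = {"v1": 10, "v2": 11, "v3": 11, "v4": 10}
print(matching_weight(greedy(adj, w), w), matching_weight(short_augment(adj, w), w))
```

On this path Greedy matches only the middle edge, while the length-three augmenting path from v1 to v4 matches all four vertices; as the middle weights approach the end weights, the ratio approaches 1/2.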

The Exact MVM algorithm shows that the “structure” of the vertex-weighted matching problem is closer to that of the maximum cardinality matching problem than to that of the maximum edge-weighted matching problem (MEM), in that we do not need to invoke linear programming duality or to compute and update dual weights.

We have also implemented the (1-ε)-approximation algorithm for maximum edge-weighted matching, based on scaling the weights, designed by Duan and Pettie [10]. This algorithm is quite sophisticated, and can be applied to the MVM problem by transforming it into an MEM problem. However, our results show that it is an order of magnitude or more slower than the 2/3-approximation algorithm for MVM; it also obtains lower weights for the approximate matching, even when we seek a 5/6-approximation. No approximation algorithm for MEM is known that restricts augmenting path lengths in a manner analogous to the 2/3-approximation algorithm for MVM.

Recent developments in half-approximation algorithms for MEM (e.g., the Locally Dominant edge and Suitor algorithms [24]) show that we should be able to use these algorithms, which avoid sorting, to obtain 1/2-approximations for the MVM. Could similar algorithms be developed for the MVM problem to obtain a 2/3-approximation without sorting? The speculative approach to solving the MVM problem employs a different strategy by first computing a maximum cardinality matching (MCM) and then using increasing paths to improve the weight. This is the scope of our current work.

The proof technique used in this paper cannot be extended to obtain approximation ratios higher than 2/3, since Lemma 2 does not hold for higher (augmenting or increasing) path lengths. Consideration of the MEM problem again suggests that there are other approaches that would lead to better approximation ratios, although they might not necessarily lead to practical improvements, given the high matching weights obtained from the approximation algorithms discussed here.

Finally, we believe that the idea of restricting the augmenting path length to at most three could lead to a 2/3-approximation algorithm for non-bipartite graphs, although we will no longer be able to invoke the Mendelsohn-Dulmage theorem, and a different proof technique will be required.

References

  • [1] A. Azad, A. Buluc, X. S. Li, X. Wang, and J. Langguth, A distributed memory approximation algorithm for maximum weight perfect bipartite matching. Arxiv:1801.09809v1, 2018.
  • [2] A. Azad, A. Buluc, and A. Pothen, Computing maximum cardinality matchings in parallel on bipartite graphs via tree grafting, IEEE Transactions on Parallel and Distributed Systems, 28 (2017), pp. 44–59.
  • [3] C. E. Bell, Weighted matching with vertex weights: An application to scheduling training sessions in NASA space shuttle cockpit simulators, European Journal of Operational Research, 73 (1994), pp. 443–449.
  • [4] R. Burkard, M. Dell’Amico, and S. Martello, Assignment Problems, SIAM, Philadelphia, PA, 2009.
  • [5] T. F. Coleman and A. Pothen, The null space problem II. Algorithms, SIAM J. Algebraic Discrete Methods, 8 (1987), pp. 544–563.
  • [6] T. A. Davis and Y. Hu, The University of Florida Sparse Matrix Collection, ACM Transactions on Mathematical Software, 38 (2011), pp. 1:1–1:25.
  • [7] F. Dobrian, M. Halappanavar, and A. Pothen, Exact and approximation algorithms for vertex weighted matching. Preprint, March 2010.
  • [8] D. E. Drake and S. Hougardy, A simple approximation algorithm for the weighted matching problem, Inf. Process. Lett., 85 (2003), pp. 211–213, doi:http://dx.doi.org/10.1016/S0020-0190(02)00393-9.
  • [9] D. E. Drake Vinkemeier and S. Hougardy, A linear-time approximation algorithm for weighted matchings in graphs, ACM Trans. Algorithms, 1 (2005), pp. 107–122, doi:http://doi.acm.org/10.1145/1077464.1077472.
  • [10] R. Duan and S. Pettie, Linear-time approximation for maximum weight matching, Journal of the ACM, 61 (2014), pp. 1–23, doi:http://dx.doi.org/10.1145/2529989.
  • [11] I. S. Duff, K. Kaya, and B. Ucar, Design, implementation and analysis of maximum transversal algorithms, ACM Transactions on Mathematical Software, 38 (2011), pp. 13:1–13:31.
  • [12] I. S. Duff and J. Koster, On algorithms for permuting large entries to the diagonal of a sparse matrix, SIAM J. Matrix Anal. Appl., 22 (2000), pp. 973–996, doi:http://dx.doi.org/10.1137/S0895479899358443.
  • [13] Z. Galil, Efficient algorithms for finding maximum matching in graphs, ACM Comput. Surv., 18 (1986), pp. 23–38, doi:http://doi.acm.org/10.1145/6462.6502.
  • [14] M. Halappanavar, Algorithms for Vertex-weighted Matchings in Graphs, PhD thesis, Old Dominion University, Norfolk, VA, 2009.
  • [15] M. Halappanavar, A. Pothen, A. Azad, F. Manne, J. Langguth, and A. Khan, Codesign lessons learned from implementing graph matching on multithreaded architectures, IEEE Computer, (2015), pp. 32–41.
  • [16] J. Hopcroft and R. Karp, An n^{5/2} algorithm for maximum matchings in bipartite graphs, SIAM J. Comput., 2 (1973), pp. 225–231.
  • [17] S. Hougardy, Linear time approximation algorithms for degree constrained subgraph problems, in Research Trends in Combinatorial Optimization, W. J. Cook, L. Lovasz, and J. Vygen, eds., Springer Verlag, 2009, pp. 185–200.

  • [18] A. Khan, D. Gleich, M. Halappanavar, and A. Pothen, A multithreaded algorithm for network alignment via approximate matching, in Proceedings of Supercomputing (SC12), ACM/IEEE, 2012. Article No. 64, 11 pp.
  • [19] A. Khan and A. Pothen, A new 3/2-approximation algorithm for the b-edge cover problem, in Proceedings of the Seventh SIAM Workshop on Combinatorial Scientific Computing, 2016, 10 pp.
  • [20] A. Khan, A. Pothen, M. Patwary, N. Satish, N. Sunderam, F. Manne, M. Halappanavar, and P. Dubey, Efficient approximation algorithms for weighted b-Matching, SIAM J. Scientific Computing, 38 (2016), pp. S593–S619.
  • [21] A. Khan, A. Pothen, and S M Ferdous, Parallel algorithms through approximation: b-edge covers, in Proceedings of 32nd International Symposium on Parallel and Distributed Processing (IPDPS), 2018, 10 pp.
  • [22] E. Lawler, Combinatorial Optimization: Networks and Matroids, Dover Publications, Mineola, New York, 1976.
  • [23] L. Lovasz and M. D. Plummer, Matching Theory (North-Holland Mathematics Studies), Elsevier Science Ltd, 1986.
  • [24] F. Manne and M. Halappanavar, New effective multithreaded matching algorithms, in 28th IEEE International Parallel and Distributed Processing Symposium, IEEE, 2014, pp. 519–528.
  • [25] J. Maue and P. Sanders, Engineering algorithms for approximate weighted matching, in Experimental Algorithms, Springer, 2007, pp. 242–255.
  • [26] N. S. Mendelsohn and A. L. Dulmage, Some generalizations of the problem of distinct representatives, Canadian Journal of Mathematics, 10 (1958), pp. 230–241.
  • [27] K. Mulmuley, U. V. Vazirani, and V. V. Vazirani, Matching is as easy as matrix inversion, in STOC ’87: Proceedings of the nineteenth annual ACM conference on Theory of computing, New York, NY, USA, 1987, ACM, pp. 345–354, doi:http://doi.acm.org/10.1145/28395.383347.
  • [28] C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity, Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1982.
  • [29] S. Pettie and P. Sanders, A simpler linear time approximation for maximum weight matching, Inf. Process. Lett., 91 (2004), pp. 271–276, doi:http://dx.doi.org/10.1016/j.ipl.2004.05.007.
  • [30] A. Pinar, E. Chow, and A. Pothen, Combinatorial algorithms for computing column space bases that have sparse inverses, Electronic Transactions on Numerical Analysis, 22 (2006), pp. 122–145.
  • [31] A. Pothen and C.-J. Fan, Computing the block triangular form of a sparse matrix, ACM Transactions on Mathematical Software (TOMS), 16 (1990), pp. 303–324.
  • [32] R. Preis, Linear time 1/2-approximation algorithm for maximum weighted matching in general graphs, in 16th Ann. Symp. on Theoretical Aspects of Computer Science (STACS), 1999, pp. 259–269.
  • [33] A. Schrijver, Combinatorial Optimization - Polyhedra and Efficiency. Volume A: Paths, Flows, Matchings, Springer, 2003.
  • [34] T. H. Spencer and E. W. Mayr, Node weighted matching, in Proceedings of the 11th Colloquium on Automata, Languages and Programming, London, UK, 1984, Springer-Verlag, pp. 454–464.
  • [35] V. Tabatabaee, L. Georgiadis, and L. Tassiulas, QoS provisioning and tracking fluid policies in input queueing switches, IEEE/ACM Trans. Netw., 9 (2001), pp. 605–617, doi:http://dx.doi.org/10.1109/90.958329.
  • [36] V. V. Vazirani, Approximation Algorithms, Springer, 2003.
  • [37] D. P. Williamson and D. B. Shmoys, The Design of Approximation Algorithms, Cambridge University Press, New York, NY, 2011.