    Linear-Time Algorithms for Maximum-Weight Induced Matchings and Minimum Chain Covers in Convex Bipartite Graphs

A bipartite graph G=(U,V,E) is convex if the vertices in V can be linearly ordered such that for each vertex u∈ U, the neighbors of u are consecutive in the ordering of V. An induced matching H of G is a matching such that no edge of E connects endpoints of two different edges of H. We show that in a convex bipartite graph with n vertices and m weighted edges, an induced matching of maximum total weight can be computed in O(n+m) time. An unweighted convex bipartite graph has a representation of size O(n) that records for each vertex u∈ U the first and last neighbor in the ordering of V. Given such a compact representation, we compute an induced matching of maximum cardinality in O(n) time. In convex bipartite graphs, maximum-cardinality induced matchings are dual to minimum chain covers. A chain cover is a covering of the edge set by chain subgraphs, that is, subgraphs that do not contain induced matchings of more than one edge. Given a compact representation, we compute a representation of a minimum chain cover in O(n) time. If no compact representation is given, the cover can be computed in O(n+m) time. All of our algorithms achieve optimal running time for the respective problem and model. Previous algorithms considered only the unweighted case, and the best algorithm for computing a maximum-cardinality induced matching or a minimum chain cover in a convex bipartite graph had a running time of O(n^2).


1 Introduction

Problem Statement.

A bipartite graph G=(U,V,E) is convex if V can be numbered as {1,2,…,n_V} so that the neighbors of every vertex i∈U form an interval {L_i, L_i+1, …, R_i}, see Figure 1(a). For such graphs, we consider the problem of computing an induced matching (a) of maximum cardinality or (b) of maximum total weight, for graphs with edge weights.

An induced matching H⊆E is a matching that results as a subgraph induced by some subset of vertices. This amounts to requiring that no edge of E connects endpoints of two different edges of H, see Figure 1(a). In terms of the line graph, an induced matching is an independent set in the square of the line graph. The square of a graph connects every pair of nodes whose distance is one or two. Accordingly, we call two edges of E independent if they can appear together in an induced matching, or in other words, if their endpoints induce a 2K_2 (a disjoint union of two edges) in G. Otherwise, they are called dependent.

Fig. 1: (a) A convex bipartite graph G=(U,V,E) containing an induced matching H of size 3. Since we use successive natural numbers as elements of U and V, we will explicitly indicate whether we regard a number x as a vertex of U or of V. There is no induced matching with more than 3 edges: vertex 3∈U is adjacent to all vertices of V except 1∈V. Thus, if we match 3∈U, this can only lead to induced matchings of size at most 2. Furthermore, we cannot simultaneously match 1∈U and 2∈U since every neighbor of 2∈U is also adjacent to 1∈U. (b) A minimum chain cover of G with 3 chain subgraphs Z_1, Z_2, Z_3 (in different colors and dash styles), providing an independent proof that H is optimal. Here, Z_1, Z_2, Z_3 have disjoint edge sets, which is not necessarily the case in general. (c) The compact representation of G.

In convex bipartite graphs, maximum-cardinality induced matchings are dual to minimum chain covers. A chain graph is a bipartite graph that contains no induced matching of more than one edge, i.e., it contains no pair of independent edges. (Chain graphs are also called difference graphs or non-separable bipartite graphs.) A chain cover of a graph G with edge set E is a set of chain subgraphs Z_1,…,Z_k of G such that the union of the edge sets of Z_1,…,Z_k is E, see Figure 1(b). A chain cover with k chain subgraphs provides an obvious certificate that the graph cannot contain an induced matching with more than k edges. We will elaborate on this aspect of a chain cover as a certificate of optimality in Section 5. A minimum chain cover of G is a chain cover with a smallest possible number of chain subgraphs. In a convex bipartite graph G, the maximum size of an induced matching is equal to the minimum number of chain subgraphs of a chain cover.
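The certificate argument can be spelled out in one line. As a sketch, write H for an induced matching and Z_1,…,Z_k for a chain cover, where k is a name chosen here for the number of chain subgraphs and E(Z_w) denotes the edge set of Z_w:

```latex
\[
|H| \;=\; \Bigl|\,\bigcup_{w=1}^{k} \bigl(H \cap E(Z_w)\bigr)\Bigr|
\;\le\; \sum_{w=1}^{k} \bigl|H \cap E(Z_w)\bigr|
\;\le\; k .
\]
```

The equality uses that the chain subgraphs cover all edges. The last inequality holds because two edges of H lying in the same Z_w would remain independent inside the subgraph Z_w, contradicting the chain property.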

We denote the number of vertices by n_U=|U|, n_V=|V|, n=n_U+n_V, and the number of edges by m=|E|. If a convex graph is given as an ordinary bipartite graph without the proper numbering of V, it can be transformed into this form in linear time O(n+m). (In terms of the bipartite adjacency matrix, convexity is the well-known consecutive-ones property.) Unweighted convex bipartite graphs have a natural implicit representation of size O(n), which is often called a compact representation [13, 20]: every interval {L_i,…,R_i} is given by its endpoints L_i and R_i, see Figure 1(c). Since the numbering of V can be computed in O(n+m) time, it is easy to obtain a compact representation in total time O(n+m) [20, 22]. The chain covers that we construct will consist of convex bipartite subgraphs with the same ordering of V as the original graph. Thus, we will be able to use the same representation for the chain graphs of a chain cover.
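To make the compact representation concrete, here is a minimal Python sketch that derives the endpoint pairs from explicit neighbor lists. It assumes that V is already numbered 1, 2, … in a convex order and that no vertex of U is isolated. The function name and the 0-based indexing of U are choices made for this illustration only.

```python
def compact_representation(neighbors):
    """Return lists (L, R) with the first and last neighbor of each vertex of U.

    neighbors[i] holds the neighbors of vertex i of U as positions 1, 2, ... in V.
    Raises ValueError if some neighborhood is empty or not an interval,
    i.e. if the given numbering of V is not a convex ordering.
    """
    L, R = [], []
    for i, nbrs in enumerate(neighbors):
        if not nbrs:
            raise ValueError(f"vertex {i} of U has no neighbors")
        lo, hi = min(nbrs), max(nbrs)
        # An interval {lo, ..., hi} has exactly hi - lo + 1 distinct elements.
        if len(set(nbrs)) != hi - lo + 1:
            raise ValueError(f"neighbors of vertex {i} of U are not consecutive")
        L.append(lo)
        R.append(hi)
    return L, R
```

The sketch runs in time proportional to the total length of the neighbor lists, in line with the O(n+m) bound for a graph whose ordering is already known.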

Related Work and Motivation.

The problem of finding an induced matching of maximum size was first considered by Stockmeyer and Vazirani as the “risk-free marriage problem” with applications in interference-free network communication. The decision version of the problem is known to be NP-complete in many restricted graph classes [4, 16, 15], in particular bipartite graphs [4, 16] that are C_4-free or have maximum degree 3. On the other hand, it can be solved in polynomial time in chordal graphs, weakly chordal graphs, trapezoid graphs, k-interval-dimension graphs and co-comparability graphs, amongst others. For a more exhaustive survey we refer to .

The class of convex bipartite graphs was introduced by Fred Glover, who motivates the computation of matchings in these graphs with industrial manufacturing applications. Items that can be matched when some quantity fits up to a certain tolerance naturally lead to convex bipartite graphs. The computation of matchings in convex bipartite graphs also corresponds to a scheduling problem of tasks of discrete length on a single disjunctive resource. The problem of finding a (classic, not induced) matching of maximum cardinality in convex bipartite graphs has been studied extensively [10, 22, 9], culminating in an O(n) algorithm when a compact representation of the graph is given. Several other combinatorial problems have been studied in convex bipartite graphs. While some problems have been shown to be NP-complete even if restricted to this graph class, many problems that are NP-hard in general can be solved efficiently in convex bipartite graphs. For example, a maximum independent set can be found in O(n) time (assuming a compact representation) and the existence of Hamiltonian cycles can be decided in O(n^2) time. For a comprehensive summary we refer to .

One of the applications given by Stockmeyer and Vazirani  for the induced matching problem can be stated as follows. We want to test (or use) a maximum number of connections between receiver-sender pairs in a network. However, testing a particular connection produces noise so that no other node in reach may be tested simultaneously. We remark that this type of motivation extends very naturally to convex bipartite graphs when we consider wireless networks in which nodes broadcast or receive messages in specific frequency ranges. Further, weighted edges can model the importance of connections.

Previous Work.

Yu, Chen and Ma [24] describe an algorithm that finds both a maximum-cardinality induced matching and a minimum chain cover in a convex bipartite graph in runtime O(m^2). Their procedure is improved by Brandstädt, Eschen and Sritharan [3], resulting in a runtime of O(n^2). Chang computes maximum-cardinality induced matchings and minimum chain covers in O(n+m) time in bipartite permutation graphs, which form a proper subclass of convex bipartite graphs. Recently, Pandey, Panda, Dane and Kashyap gave polynomial algorithms for finding a maximum-cardinality induced matching in circular-convex and triad-convex bipartite graphs. These graph classes generalize convex bipartite graphs.

Our Contribution.

We improve the previous best O(n^2) algorithm [3] for maximum-cardinality induced matching and minimum chain covers in convex bipartite graphs in several ways. In Section 2 we give an algorithm for finding maximum-weight induced matchings in convex bipartite graphs with O(n+m) runtime. The weighted problem has not been considered before. In Section 3 we specialize our algorithm to find induced matchings of maximum cardinality in O(n) runtime, given a compact representation of the graph. In Section 4 we extend this approach to obtain in O(n) time a compact representation of a minimum chain cover. If no compact representation is given, our approach is easily adapted to produce a minimum chain cover in O(n+m) time.

All of our algorithms achieve optimal running time for the respective problem and model. Our results for finding a maximum-cardinality induced matching also improve the running times of the algorithms of Pandey et al.  for the circular-convex and triad-convex case, as they use the convex case as a building block.

2 Maximum-Weight Induced Matchings

In this section, we compute a maximum-weight induced matching of a given edge-weighted convex bipartite graph G=(U,V,E) in time O(n+m). We generally write V-indices as superscripts and U-indices as subscripts. We consider E as a subset of U×V. We assume that V={1,…,n_V} is numbered as described in Section 1 and the interval {L_i,…,R_i} of each vertex i∈U is given by the pair (L_i,R_i) of the left and right endpoint. Each edge (i,j)∈E has a weight C_i^j.

Our dynamic-programming approach considers the following subproblems: For an edge (i,j)∈E, we define W_i^j as the cost of the maximum-weight induced matching that uses the edge (i,j) and contains only edges in U×{1,…,j}. The following dynamic-programming recursion computes W_i^j:

\[ W_i^j = C_i^j + \max\bigl(\{\, W_{i'}^{j'} \mid R_{i'} < j,\ j' < L_i \,\} \cup \{0\}\bigr) \tag{1} \]

The range over which the maximum is taken is illustrated in Figure 2.

Fig. 2: The table entries that go into the computation of W_i^j are shaded: They lie in rows that end to the left of W_i^j (marked by an arrow), and only the entries to the left of L_i are considered.

In this recursion, we build the induced matching H of weight W_i^j by adding the edge (i,j) to some induced matching H′ of weight W_{i′}^{j′}. We want H to be an induced matching: By construction, the edge (i,j) is independent of (i′,j′), but we have to show that the other edges of H′ are also independent of (i,j). In order to prove this (Lemma 2), we use a transitivity relation between independent edge pairs.

Observation 1.

Two edges (i,j)∈E and (i′,j′)∈E with j′ ≤ j are independent if and only if R_{i′} < j and j′ < L_i.
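The constant-time test of Observation 1 can be written down directly. The following Python sketch assumes the compact representation is stored in lists L and R (U indexed from 0, positions in V from 1); both function names are hypothetical. The second function restates the definition of independence so that the two can be compared on small instances.

```python
def independent(L, R, e, f):
    """Constant-time independence test for two edges e = (i, j), f = (i2, j2)."""
    (i, j), (i2, j2) = e, f
    if j2 > j:
        # Relabel so that (i2, j2) is the edge with the smaller V-endpoint.
        (i, j), (i2, j2) = (i2, j2), (i, j)
    return R[i2] < j and j2 < L[i]


def independent_by_definition(L, R, e, f):
    """Reference test: the four endpoints induce exactly the two given edges."""
    (i, j), (i2, j2) = e, f

    def is_edge(a, b):
        return L[a] <= b <= R[a]

    return i != i2 and j != j2 and not is_edge(i, j2) and not is_edge(i2, j)
```

For an edge pair sharing a vertex, both functions return False, because a shared endpoint forces one of the two inequalities to fail.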

Lemma 1.

Let (i,j), (i′,j′), (i″,j″)∈E with j ≤ j′ ≤ j″. Assume that (i,j) and (i′,j′) are independent, and (i′,j′) and (i″,j″) are independent. Then (i,j) and (i″,j″) are independent.

Proof.

By Observation 1, we have R_i < j′ ≤ R_{i′} < j″ and j < L_{i′} ≤ j′ < L_{i″}. Thus, R_i < j″ and j < L_{i″}. ∎

Lemma 2.

The recursion (1) is correct.

Proof.

By Observation 1, any edge (i′,j′) with j′ ≤ j that is independent of (i,j) satisfies R_{i′} < j and j′ < L_i. By Lemma 1, all other edges (i″,j″) used to obtain the matching value W_{i′}^{j′} are also independent of (i,j). ∎
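Before turning to the efficient evaluation, it can help to see recursion (1) evaluated literally. The sketch below is a deliberately naive reference implementation, not the linear-time algorithm: it assumes positive weights stored as one dictionary per row (C[i][j] for L[i] ≤ j ≤ R[i]), 0-based rows and 1-based positions in V, and a function name chosen for illustration.

```python
def induced_matching_table(L, R, C):
    """Evaluate recursion (1) directly; returns a dict W[(i, j)].

    Edges are handled in increasing order of j. Every entry the recursion
    refers to has j' <= R[i'] < j and is therefore already available.
    """
    edges = sorted(
        ((i, j) for i in range(len(L)) for j in range(L[i], R[i] + 1)),
        key=lambda e: e[1],
    )
    W = {}
    for i, j in edges:
        best = 0  # the matching may consist of the edge (i, j) alone
        for (i2, j2), w in W.items():
            if R[i2] < j and j2 < L[i]:
                best = max(best, w)
        W[(i, j)] = C[i][j] + best
    return W
```

With positive weights, the largest table entry is the weight of a maximum-weight induced matching. This direct evaluation takes time quadratic in the number of edges, which is what the remainder of the section avoids.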

We create a table in which we record the entries W_i^j. We assume that the intervals are sorted in nondecreasing order by L_i, that is, L_i ≤ L_{i′} for i < i′. The values W_i^j form the i-th row of the table. We fill the table row by row proceeding from i=1 to i=n_U. Each row i is processed from left to right.

The only challenge in evaluating (1) is the maximum-expression, for which we introduce the notation M_i^j.

\[ M_i^j := \max\bigl(\{\, W_{i'}^{j'} \mid R_{i'} < j,\ j' < L_i \,\} \cup \{0\}\bigr) \]

We discuss the computation of the leftmost entry M_i^{L_i} later. When we proceed from j to j+1 we want to go incrementally from M_i^j to M_i^{j+1}. Direct comparison of the respective defining sets leads to

\[ M_i^{j+1} = \max\bigl(\{M_i^j\} \cup \{\, W_{i'}^{j'} \mid R_{i'} = j,\ j' < L_i \,\}\bigr) \tag{2} \]

Fig. 3: Example. We are in the process of filling row 30 from left to right. All rows with smaller index i have been processed and are filled with the entries W_i^j. Unprocessed entries are marked as “–”. The figure does not show the rows in the order in which they are processed, but intervals with the same right endpoint R_i=r are grouped together. The bold entries collect the provisional maxima P_r in each group. By way of example, the encircled entry P_27=54 is the maximum among the shaded entries of the intervals that end at R_i=27, ignoring the yet unprocessed entries. As we proceed from j=27 to j=28 in row 30, the intervals with R_i=27 become relevant. The maximum usable entry from these intervals is found in position 17 of this array, because 17=L_30−1. The entry P_27[17]=44 is marked by an arrow. The next entry W_30^28 will be computed as C_30^28+max{P_27,P_26,…,P_17}. (Some of these entries might not exist.) We can observe that the maximum over which P_27[17] is defined involves no unprocessed entries (Lemma 3). When the next row i=34 in the group with R_i=27 is later filled, it will be necessary to update P_27.

In order to evaluate the maximum of the second set in (2) efficiently, we group intervals with a common right endpoint together. Let S_r be the earliest startpoint of an interval with endpoint r. If there are no intervals with endpoint r, we set S_r := r. (It would be more logical to set S_r := ∞ in this case, but this choice makes the algorithm simpler.) We maintain an array P_r[j] for S_r ≤ j ≤ r that is defined as follows:

\[ P_r[j] := \max\bigl(\{\, W_{i'}^{j'} \mid R_{i'} = r,\ \text{row } i' \text{ has already been processed},\ j' \le j \,\} \cup \{0\}\bigr) \]

In a sense, P_r[j] is a provisional version of the expression max{W_{i′}^{j′} ∣ R_{i′}=r, j′≤j}, which takes into account only the already processed rows. For (2), we need the entry P_j[L_i−1], and we will see that all relevant entries have already been computed whenever we access this entry. Thus, we rewrite (2):

\[ M_i^{j+1} = \begin{cases} \max\{M_i^j,\ P_j[L_i-1]\}, & \text{if } L_i-1 \ge S_j \text{ and, thus, } P_j[L_i-1] \text{ is defined} \\ M_i^j, & \text{otherwise} \end{cases} \tag{3} \]

The condition L_i−1 ≥ S_j ensures that the array index L_i−1 does not exceed the left boundary of the array P_j. Also, the index never exceeds the right boundary of the array P_j, since L_i ≤ j, and therefore L_i−1 < j. Thus, P_j[L_i−1] is always defined when it is accessed.

Lemma 3.

When entry W_i^{j+1} is processed, (2) and (3) define the same quantity M_i^{j+1}.

Proof.

We distinguish three cases.

Case 1: No interval ends at j, and accordingly, S_j = j.

In this case M_i^{j+1} = M_i^j in (2) since its rightmost set is empty. Since L_i ≤ j, we have L_i−1 < j = S_j and, thus, the right side of (3) also evaluates to M_i^j.

Case 2: There exists an interval ending at j, and L_i−1 < S_j. The right side of (3) evaluates to M_i^j. In (2), intervals i′ that end at R_{i′}=j have L_{i′} ≥ S_j ≥ L_i. Thus, an edge (i′,j′) with R_{i′}=j and j′ < L_i does not exist, and the second set in (2) is empty. Therefore, (2) evaluates to M_i^j.

Case 3: There exists an interval ending at j, and L_i−1 ≥ S_j. In this case, P_j[L_i−1] is defined:

\[ P_j[L_i-1] = \max\{\, W_{i'}^{j'} \mid R_{i'} = j,\ j' \le L_i-1,\ \text{row } i' \text{ already processed} \,\} \tag{4} \]

For each entry W_{i′}^{j′} with j′ ≤ L_i−1, we conclude that L_{i′} ≤ j′ < L_i and, thus, row i′ has already been processed. This means that the condition that row i′ was processed is redundant, and (4) coincides with the maximum of the second set on the right side of (2). ∎

After processing row i with startpoint L_i and endpoint R_i, we have to update the values in P_{R_i}. This is straightforward. Figure 3 illustrates the role of the arrays P_r when processing a row.

It remains to discuss the computation of the first value M_i^{L_i} of the row. An edge (i′,j′) and the edge (i,L_i) are independent if and only if the interval i′ ends before L_i, that is, R_{i′} < L_i. Since we process the intervals in nondecreasing order by their startpoints, it suffices to maintain a value F with the maximum W_{i′}^{j′} in all finished intervals: those intervals i′ that end before L_i. In other words, F = max{P_r[r] ∣ r < L_i}. This value is easily maintained by updating F as L_i increases. The full details are stated as Algorithm 1.

The update of the array P_{R_i} in the second loop can be integrated with the computation of W_i^j in the first loop. When this is done, the values W_i^j need not be stored at all because they are not used. As stated earlier, when no interval ends at a point r, we set S_r := r. The array P_r then consists of a single dummy entry P_r[r]. This way we avoid having to treat this special case during the algorithm.

We have described the computation of the value of the optimal matching. It is straightforward to augment the program so that the optimal matching itself can be recovered by backtracking how the optimal value was obtained, but this would clutter the program.
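Since the pseudocode of Algorithm 1 is not reproduced here, the following Python sketch shows one way the ingredients described above (the arrays P_r, the startpoints S_r, the running maximum over finished rows, and the incremental update (3)) could fit together. It is an illustration under assumptions, not the authors' program: rows are 0-based, positions in V are 1-based, weights are positive and given as one dictionary per row, and all identifiers are chosen for this sketch. The update of P_{R_i} is integrated into the main loop, and the table entries are not stored.

```python
def max_weight_induced_matching(L, R, C, n_V):
    """Value of a maximum-weight induced matching, in O(n + m) time."""
    n_U = len(L)
    # Bucket-sort the rows by startpoint so that they are processed in
    # nondecreasing order of L[i].
    buckets = [[] for _ in range(n_V + 1)]
    for i in range(n_U):
        buckets[L[i]].append(i)
    order = [i for bucket in buckets for i in bucket]

    # S[r]: earliest startpoint of an interval ending at r (S[r] = r if none).
    S = list(range(n_V + 1))
    for i in range(n_U):
        S[R[i]] = min(S[R[i]], L[i])

    # P[r][j - S[r]] stores P_r[j] for S[r] <= j <= r, initially 0.
    P = [[0] * (r - S[r] + 1) for r in range(n_V + 1)]

    finished = 0  # maximum entry over all rows that end before the current L[i]
    next_r = 1    # smallest endpoint not yet merged into `finished`
    best = 0
    for i in order:
        while next_r < L[i]:
            finished = max(finished, P[next_r][-1])  # last entry is P_r[r]
            next_r += 1
        M = finished  # M_i^{L_i}
        r = R[i]
        row_max = 0   # prefix maximum of the entries of row i
        for j in range(L[i], r + 1):
            w = C[i][j] + M  # recursion (1)
            best = max(best, w)
            row_max = max(row_max, w)
            P[r][j - S[r]] = max(P[r][j - S[r]], row_max)
            # Step from M_i^j to M_i^{j+1} according to (3).
            if L[i] - 1 >= S[j]:
                M = max(M, P[j][L[i] - 1 - S[j]])
    return best
```

The arrays P_r have total length at most n_V + m, since each is no longer than the longest interval ending at r, so initialization and the main loop stay within O(n+m). The reads in step (3) touch only positions left of L[i], while row i writes positions from L[i] onward, which is why the two loops can be merged safely.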

Theorem 1.

A maximum-weight induced matching of an edge-weighted convex bipartite graph can be computed in O(n+m) time.

3 Maximum-Cardinality Induced Matchings

For the unweighted version of the problem, we assume a compact representation of a convex bipartite graph G=(U,V,E), that is, for each i∈U we are given the startpoint L_i and endpoint R_i of its interval {L_i,…,R_i}. This makes it possible to obtain a linear runtime of O(n).

The recursion (1) can be specialized to the unweighted case by setting C_i^j ≡ 1.

\[ W_i^j = 1 + \max\bigl(\{\, W_{i'}^{j'} \mid R_{i'} < j,\ j' < L_i \,\} \cup \{0\}\bigr) \tag{5} \]

This recursion has already been stated in [24] and [3] in a slightly different formulation. Yu, Chen and Ma [24] describe it as a greedy-like procedure that “colors” the edges of a bipartite graph with the values W_i^j. From this coloring, they obtain both a maximum-cardinality induced matching and a minimum chain cover. The original implementation given in [24] runs in time O(m^2). Brandstädt, Eschen and Sritharan [3] give an improved implementation of the coloring procedure with runtime O(n^2). Our Algorithm 1 from Section 2 obtains the values W_i^j in total time O(n+m).

Given a compact representation, we can exploit some structural properties of the filled dynamic-programming table to further improve the runtime to O(n). The following observations were first given in [24] and [3].

Lemma 4 ([24, Lemma 5]).

The values W_i^j are nondecreasing in each row.

Proof.

This is obvious from (5), since the set over which the maximum is taken increases with j. ∎

Lemma 5 ([3, Lemma 3.3, Lemma 3.4]).

Each row contains at most two consecutive values.

Proof.

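Let w be the largest value in some row i. Then, if we take a corresponding matching of size w, it is easy to see that we can remove the last two edges and replace them by an arbitrary edge (i,j). This proves that W_i^j ≥ w−1.

More formally, we can argue by the recursion (5): Assume there are values W_i^j and W_i^{j′} with W_i^{j′} ≥ W_i^j + 2 in row i. By Lemma 4 we can assume j < j′. By (5), W_i^{j′} = 1 + W_{i′}^{j″} with R_{i′} < j′ and j″ < L_i for some edge (i′,j″). Thus, W_{i′}^{j″} ≥ W_i^j + 1 ≥ 2, and by (5) there is an edge (i″,j‴) with W_{i″}^{j‴} = W_{i′}^{j″} − 1, R_{i″} < j″ and j‴ < L_{i′}. Since j‴ ≤ R_{i″} < j″ < L_i ≤ j, the definition of W_i^j according to (5) gives W_i^j ≥ 1 + W_{i″}^{j‴} = W_{i′}^{j″} ≥ W_i^j + 1, which is a contradiction. ∎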

Specializing Algorithm 1 to the unweighted case leads to a solution with O(n+m) running time. Our O(n)-time algorithm will follow the general scheme of Algorithm 1, with the following modifications.

• In view of Lemmas 4 and 5, we will not fill each row individually, but we will just determine the leftmost value w and the position where the entries switch from w to w+1 (if any).

• The computation of the leftmost entry w = W_i^{L_i} is exactly as in Algorithm 1.

• The position where the entries of row i switch from w to w+1 can be determined from (5): If there is a row i′ containing an entry W_{i′}^{j′} = w left of L_i, then W_i^j must be w+1 as soon as j > R_{i′}. The algorithm determines the threshold position t_w as the smallest right endpoint R_{i′} under these constraints. Then the entries w+1 in row i start at t_w+1 if these entries are still part of the row.

• We do not maintain the whole array P_r for each r, but only its last entry P_r[r]; this is sufficient for updating the maximum F over the finished rows and thus for computing the leftmost entries in the rows. We call this value .
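The row-wise summary described in this list (a leftmost value plus an optional switch position) can be illustrated with a small Python sketch. It evaluates the unweighted recursion (5) naively, so it is a reference for checking the structure promised by Lemmas 4 and 5 and not the linear-time procedure. The function name, the 0-based rows and the 1-based positions in V are assumptions of this illustration.

```python
def row_summaries(L, R):
    """For each row i return (w, switch).

    w is the leftmost entry of row i, and switch is the first position whose
    entry equals w + 1 (None if the whole row has the value w).
    """
    n_U = len(L)
    W = {}
    for j in range(1, max(R) + 1):
        for i in range(n_U):
            if L[i] <= j <= R[i]:
                best = max(
                    (w for (i2, j2), w in W.items() if R[i2] < j and j2 < L[i]),
                    default=0,
                )
                W[(i, j)] = 1 + best

    summaries = []
    for i in range(n_U):
        row = [W[(i, j)] for j in range(L[i], R[i] + 1)]
        w = row[0]
        # Lemma 4: nondecreasing; Lemma 5: only the values w and w + 1 occur.
        assert all(a <= b for a, b in zip(row, row[1:]))
        assert set(row) <= {w, w + 1}
        switch = next((L[i] + k for k, v in enumerate(row) if v == w + 1), None)
        summaries.append((w, switch))
    return summaries
```

Two numbers per row suffice to describe the whole table, which is what makes an O(n) bound plausible even though the table itself has m entries.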

We will improve Algorithm 2 by maintaining the values instead of computing them from scratch. We use the fact that the smallest value in the row is known, and hence we can associate with the value instead of the row index , as is already apparent from our chosen notation. We update whenever increases. The details are shown in Algorithm 3. The differences to Algorithm 2 are marked by .

This still does not achieve running time. The final improvement comes from realizing that it is sufficient to update when is the leftmost entry in row . The time when such an update occurs can be predicted when a row is generated. To this end, we maintain a list for that records the updates that are due when becomes . This final version is Algorithm 4.

The runtime of Algorithm 4 is O(n): Processing each interval takes constant time and adds at most two pairs to the lists. Thus, processing the lists for updating the array also takes only O(n) time.

Some simplifications are possible: The addition of to the list in the case of two values can actually be omitted, as it leads to no decrease in : is already . The algorithm could be further streamlined by observing that at most two consecutive values of need to be remembered at any time.

Again, it is easy to modify the algorithm to return a maximum induced matching in addition to its size.

Theorem 2.

Given a compact representation, a maximum-cardinality induced matching of a convex bipartite graph can be computed in O(n) time.

4 Minimum Chain Covers

In convex bipartite graphs, the size of a maximum-cardinality induced matching equals the number of chain subgraphs of a minimum chain cover . In this section we use this duality and extend our Algorithm 4 to obtain a minimum chain cover of a convex bipartite graph .

Let w* be the cardinality of a maximum induced matching of G. Accordingly, the values W_i^j cover the range {1,…,w*}. We create w* chain subgraphs Z_1,…,Z_{w*} of G. The edges (i,j) with W_i^j = w will be part of the chain subgraph Z_w.

As already observed in , the edges with a fixed value of W_i^j may contain independent edges and, thus, do not necessarily constitute a chain graph. Accordingly, Yu, Chen, and Ma [24] describe a strategy to extend the edge set for each value of w to a chain graph Z_w. Their original implementation runs in time O(m^2). Brandstädt, Eschen, and Sritharan [3] give an improved implementation with runtime O(n^2). We implement their strategy in O(n) time, given a compact representation. The correctness was already shown in . We give a new independent proof. The following characterization is often used as an alternative definition of chain graphs:

Lemma 6.

A bipartite graph G=(U,V,E) is a chain graph if and only if the sets of neighbors N(i) of the vertices i∈U form a chain in the inclusion order. (Equal sets are allowed.) In other words, among any two sets N(i) and N(i′), one must be contained in the other.

Proof.

This is a direct consequence of the fact that edges (i,j) and (i′,j′) are independent if and only if j∈N(i)∖N(i′) and j′∈N(i′)∖N(i). ∎
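Lemma 6 translates into a short test. The Python sketch below assumes the neighborhoods of the vertices of U are given as arbitrary collections (not necessarily intervals); the function name is chosen for illustration. Sorting by size reduces the pairwise nesting condition to a check of consecutive sets, because set inclusion is transitive.

```python
def is_chain_graph(neighborhoods):
    """True iff the neighborhoods are totally ordered by inclusion."""
    sets = sorted((frozenset(s) for s in neighborhoods), key=len)
    # In a chain, a set can only be contained in sets of equal or larger size,
    # so it suffices to compare neighbors in the size-sorted order.
    return all(a <= b for a, b in zip(sets, sets[1:]))
```

For instance, the neighborhoods {1,2}, {1}, {1,2,3} form a chain, whereas {1} and {2} correspond to two independent edges.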

The condition that the neighborhoods must form a chain is apparently the reason for calling these graphs chain graphs; however, we did not find a reference for this.

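We use U_w to denote the set of rows i∈U that contain entries W_i^j = w. For every row i∈U_w, we determine the beginning and ending points B_i^w and E_i^w of the entries with this color, that is, {j ∣ W_i^j = w} = {B_i^w,…,E_i^w}. We extend every such interval [B_i^w, E_i^w] to the left by choosing a new starting point B̂_i^w according to the formula

\[ \hat B_i^w := \min\bigl(\{B_i^w\} \cup \{\, B_{i'}^w \mid i' \in U_w,\ E_{i'}^w < E_i^w \,\}\bigr) \tag{6} \]
\[ \phantom{\hat B_i^w :}= \min\bigl(\{B_i^w\} \cup \{\, \hat B_{i'}^w \mid i' \in U_w,\ E_{i'}^w < E_i^w \,\}\bigr) \tag{7} \]

The second expression uses the new values B̂_{i′}^w on the right-hand side. It is easy to see that the two expressions are equivalent: Using (6) for the definition of B̂_{i′}^w, the expression (7) becomes

\[ \min\bigl(\{B_i^w\} \cup \{\, B_{i'}^w \mid i' \in U_w,\ E_{i'}^w < E_i^w \,\} \cup \{\, B_{i''}^w \mid i', i'' \in U_w,\ E_{i''}^w < E_{i'}^w < E_i^w \,\}\bigr) \tag{8} \]

The third set is contained in the second set, and thus, (8) is equal to B̂_i^w according to (6).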

We construct the chain graph Z_w as the graph with the extended intervals [B̂_i^w, E_i^w]. Figure 4 shows an example.

Fig. 4: An example showing a section of the computation of W_i^j by Algorithm 4. The threshold values t_6 and t_7 are shown as they change with the rows that are successively considered. The shaded entries form the chain subgraph Z_7 that is used for the chain cover.

It is obvious by construction that these intervals satisfy the conditions of a chain graph: By Lemma 6, we have to show that there are no two intervals [B̂_i^w, E_i^w], [B̂_{i′}^w, E_{i′}^w] with B̂_{i′}^w < B̂_i^w and E_{i′}^w < E_i^w. But if the last condition holds, (7) ensures that B̂_i^w ≤ B̂_{i′}^w.

The only thing that could go wrong is that B̂_i^w becomes too small so that the chain graph Z_w is not a subgraph of G. The following lemma shows that this is not the case.

Lemma 7.

B̂_i^w ≥ L_i for every i∈U_w.

Proof.

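For the sake of contradiction, assume B̂_i^w < L_i. By (6), there is a row i′∈U_w such that B_{i′}^w = B̂_i^w < L_i and E_{i′}^w < E_i^w. Setting j = E_i^w and j′ = B_{i′}^w in the recursion (5), we conclude that R_{i′} ≥ E_i^w, because otherwise, (5) would imply W_i^{E_i^w} ≥ w+1. Thus, (i′, E_{i′}^w+1) is an edge of G. By Lemma 5, W_{i′}^{E_{i′}^w+1} = w+1. By (5), there is an edge (i″,j″) with W_{i″}^{j″} = w, R_{i″} < E_{i′}^w+1 and j″ < L_{i′}. Again by (5), such an edge would imply that W_i^{E_i^w} ≥ w+1, since R_{i″} ≤ E_{i′}^w < E_i^w and j″ < L_{i′} ≤ B_{i′}^w < L_i, a contradiction. ∎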

Algorithm 5 carries out the computation of (6). It processes the triples (i, B_i^w, E_i^w) in increasing order of the endpoints E_i^w. This can be done in linear time, by first sorting the triples into buckets according to the value of E_i^w. Thus, Algorithm 5 takes linear time O(n). By Lemma 6, the result is a chain cover, which by duality is minimum. Each row belongs to at most two chain subgraphs, and thus the chain cover consists of at most 2n_U such row intervals in total. It is straightforward to extend Algorithm 4 to compute the sets U_w and the quantities B_i^w and E_i^w, and thus the cover can be constructed in O(n) time in compressed form.
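The sweep over increasing endpoints can be sketched in a few lines of Python. This is an illustration of the extension rule (6) for one fixed value w, not a transcription of Algorithm 5: the input format (a list of triples), the function name, and the use of a comparison sort instead of bucket sort are assumptions made to keep the example short.

```python
def extend_starts(triples):
    """Extended startpoints for one color class.

    triples: list of (i, B, E), the interval of entries with value w in row i.
    Returns a dict mapping i to min(B_i, min{B_k : E_k < E_i}).
    """
    by_end = {}
    for i, B, E in triples:
        by_end.setdefault(E, []).append((i, B))

    extended = {}
    smallest_start = float("inf")  # min B over all intervals with smaller E
    for E in sorted(by_end):       # bucket sort would make this step linear
        group = by_end[E]
        for i, B in group:
            extended[i] = min(B, smallest_start)
        # Intervals with the same endpoint must not influence each other,
        # so the running minimum is updated only after the whole group.
        smallest_start = min(smallest_start, min(B for _, B in group))
    return extended
```

After the extension, an interval with a larger endpoint never starts to the right of one with a smaller endpoint, so the intervals of one color class are nested as Lemma 6 requires.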

Theorem 3.

Given a compact representation of a convex bipartite graph, a compact representation of a minimum chain cover can be computed in O(n) time.

Given a compact representation of a minimum chain cover, we can list all the edges of its chain subgraphs in O(m) time since every edge is contained in at most two chain subgraphs. As mentioned in the introduction, a compact representation of a convex bipartite graph can be computed in O(n+m) time [20, 22, 2]. Thus, Algorithm 4 and Algorithm 5 can also be used to obtain:

Theorem 4.

A minimum chain cover of a convex bipartite graph can be computed in O(n+m) time.

5 Certification of Optimality

An induced matching together with a chain cover of the same cardinality provides a certificate of optimality, of size O(n). As we will establish in the following discussion, it is easy to check this certificate for validity in linear time. This is easier than constructing the largest induced matching with our algorithm. Thus, it is possible to establish correctness of the result beyond doubt, for each particular instance of the problem, without having to trust the correctness of our algorithms and their implementations, see  for a survey about this concept.

It is trivial to check whether the matching is contained in the graph. To test whether it forms an induced matching, we sort the edges (i,j) by j. This takes O(n) time with bucket-sort. Then, by Lemma 1, it is sufficient to test consecutive edges for independence, and each such test takes only constant time according to Observation 1.
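The matching half of the certificate check is short enough to show in full. The following Python sketch assumes a compact representation in lists L and R (rows 0-based, positions in V 1-based) and a matching given as a list of pairs (i, j); the function name is hypothetical, and a comparison sort stands in for the bucket sort mentioned above.

```python
def is_induced_matching(L, R, matching):
    """Check that `matching` consists of edges of G and is an induced matching."""
    edges = sorted(matching, key=lambda e: e[1])
    # Every pair must be an edge of the graph.
    if any(not (L[i] <= j <= R[i]) for i, j in edges):
        return False
    # By Lemma 1 it suffices to test consecutive edges in the order by j.
    for (i, j), (i2, j2) in zip(edges, edges[1:]):
        # (i, j) has the smaller V-endpoint, so the test of Observation 1
        # reads R[i] < j2 and j < L[i2].
        if not (R[i] < j2 and j < L[i2]):
            return False
    return True
```

Replacing the sort by a bucket sort over the positions 1,…,n_V makes the whole check linear.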

To establish the validity of a chain cover , we need to check that the edges of  are covered and each is a chain subgraph. The chain subgraphs , for are compactly represented by a set of at most quadruples . The following checking procedure works in linear time for any chain cover as long as it consists of convex bipartite subgraphs. It does not use any special properties of the cover produced by our algorithm.

We sort the quadruples lexicographically. Then it is easy to check the chain graph property using the characterization of Lemma 6: The intervals that belong to a fixed chain graph