Linear-Time Algorithms for Maximum-Weight Induced Matchings and Minimum Chain Covers in Convex Bipartite Graphs

11/13/2017
by   Boris Klemz, et al.

A bipartite graph G=(U,V,E) is convex if the vertices in V can be linearly ordered such that for each vertex u∈ U, the neighbors of u are consecutive in the ordering of V. An induced matching H of G is a matching such that no edge of E connects endpoints of two different edges of H. We show that in a convex bipartite graph with n vertices and m weighted edges, an induced matching of maximum total weight can be computed in O(n+m) time. An unweighted convex bipartite graph has a representation of size O(n) that records for each vertex u∈ U the first and last neighbor in the ordering of V. Given such a compact representation, we compute an induced matching of maximum cardinality in O(n) time. In convex bipartite graphs, maximum-cardinality induced matchings are dual to minimum chain covers. A chain cover is a covering of the edge set by chain subgraphs, that is, subgraphs that do not contain induced matchings of more than one edge. Given a compact representation, we compute a representation of a minimum chain cover in O(n) time. If no compact representation is given, the cover can be computed in O(n+m) time. All of our algorithms achieve optimal running time for the respective problem and model. Previous algorithms considered only the unweighted case, and the best algorithm for computing a maximum-cardinality induced matching or a minimum chain cover in a convex bipartite graph had a running time of O(n^2).

1 Introduction

Problem Statement.

A bipartite graph $G=(U,V,E)$ is convex if $V$ can be numbered as $\{1,2,\dots,n_V\}$ so that the neighbors of every vertex $i\in U$ form an interval $\{L^i, L^i+1, \dots, R^i\}$, see Figure 1(a). For such graphs, we consider the problem of computing an induced matching (a) of maximum cardinality or (b) of maximum total weight, for graphs with edge weights.

An induced matching $H\subseteq E$ is a matching that results as a subgraph induced by some subset of vertices. This amounts to requiring that no edge of $E$ connects endpoints of two different edges of $H$, see Figure 1(a). In terms of the line graph, an induced matching is an independent set in the square of the line graph. The square of a graph connects every pair of nodes whose distance is one or two. Accordingly, we call two edges of $E$ independent if they can appear together in an induced matching, or in other words, if their endpoints induce a $2K_2$ (a disjoint union of two edges) in $G$. Otherwise, they are called dependent.

Fig. 1: (a) A convex bipartite graph  containing an induced matching of size 3. Since we use successive natural numbers as elements of and , we will explicitly indicate whether we regard a number as a vertex of or of . There is no induced matching with more than 3 edges: vertex  is adjacent to all vertices of  except . Thus, if we match , this can only lead to induced matchings of size at most . Furthermore, we cannot simultaneously match and since every neighbor of is also adjacent to . (b) A minimum chain cover of  with  chain subgraphs  (in different colors and dash styles), providing an independent proof that is optimal. Here, have disjoint edge sets, which is not necessarily the case in general. (c) The compact representation of .

In convex bipartite graphs, maximum-cardinality induced matchings are dual to minimum chain covers. A chain graph is a bipartite graph that contains no induced matching of more than one edge, i.e., it contains no pair of independent edges. (Chain graphs are also called difference graphs [12] or non-separable bipartite graphs [7].) A chain cover of a graph $G$ with edge set $E$ is a set of chain subgraphs $G_1,\dots,G_k$ of $G$ such that the union of the edge sets of $G_1,\dots,G_k$ is $E$, see Figure 1(b). A chain cover with $k$ chain subgraphs provides an obvious certificate that the graph cannot contain an induced matching with more than $k$ edges, because no two edges of an induced matching can lie in the same chain subgraph. We will elaborate on this aspect of a chain cover as a certificate of optimality in Section 5. A minimum chain cover of $G$ is a chain cover with the smallest possible number of chain subgraphs. In a convex bipartite graph $G$, the maximum size of an induced matching is equal to the minimum number of chain subgraphs of a chain cover [24].

We denote the number of vertices by $n_U=|U|$, $n_V=|V|$, $n=n_U+n_V$, and the number of edges by $m=|E|$. If a convex graph is given as an ordinary bipartite graph without the proper numbering of $V$, it can be transformed into this form in linear time $O(n+m)$ [2]. (In terms of the bipartite adjacency matrix, convexity is the well-known consecutive-ones property.) Unweighted convex bipartite graphs have a natural implicit representation [21] of size $O(n)$, which is often called a compact representation [13, 20]: every interval $\{L^i,\dots,R^i\}$ is given by its endpoints $L^i$ and $R^i$, see Figure 1(c). Since the numbering of $V$ can be computed in $O(n+m)$ time, it is easy to obtain a compact representation in total time $O(n+m)$ [20, 22]. The chain covers that we construct will consist of convex bipartite subgraphs with the same ordering of $V$ as the original graph. Thus, we will be able to use the same representation for the chain graphs of a chain cover.
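The step from adjacency lists to a compact representation can be sketched as follows. This is only an illustration under stated assumptions: the vertices of V are assumed to be numbered in a convex ordering already (the reordering step of [2] is not shown), every vertex of U is assumed to have at least one neighbor, and the function name `compact_representation` is hypothetical.

```python
def compact_representation(adj):
    """Return the list of (first, last) neighbor pairs, one per U-vertex.

    adj[i] is the non-empty list of neighbors of U-vertex i, given as
    numbers of V in an ordering that is assumed to be convex already.
    Runs in time linear in the total length of the adjacency lists.
    """
    rep = []
    for i, nbrs in enumerate(adj):
        lo, hi = min(nbrs), max(nbrs)
        # In a convex ordering the neighbors fill the whole range lo..hi.
        if hi - lo + 1 != len(set(nbrs)):
            raise ValueError(f"neighbors of vertex {i} are not consecutive")
        rep.append((lo, hi))
    return rep
```

For example, `compact_representation([[2, 1], [3, 4, 2]])` yields the pairs `(1, 2)` and `(2, 4)`.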

Related Work and Motivation.

The problem of finding an induced matching of maximum size was first considered by Stockmeyer and Vazirani [23] as the “risk-free marriage problem” with applications in interference-free network communication. The decision version of the problem is known to be NP-complete in many restricted graph classes [4, 16, 15], in particular in bipartite graphs [4, 16], even under further structural restrictions [16]. On the other hand, it can be solved in polynomial time in chordal graphs [4], weakly chordal graphs [5], trapezoid graphs, $k$-interval-dimension graphs and co-comparability graphs [11], amongst others. For a more exhaustive survey we refer to [8].

The class of convex bipartite graphs was introduced by Fred Glover [10], who motivates the computation of matchings in these graphs with industrial manufacturing applications. Items that can be matched when some quantity fits up to a certain tolerance naturally lead to convex bipartite graphs. The computation of matchings in convex bipartite graphs also corresponds to a scheduling problem of tasks of discrete length on a single disjunctive resource [14]. The problem of finding a (classic, not induced) matching of maximum cardinality in convex bipartite graphs has been studied extensively [10, 22, 9], culminating in a linear-time algorithm when a compact representation of the graph is given [22]. Several other combinatorial problems have been studied in convex bipartite graphs. While some problems have been shown to be NP-complete even if restricted to this graph class [1], many problems that are NP-hard in general can be solved efficiently in convex bipartite graphs. For example, a maximum independent set can be found efficiently (assuming a compact representation) [20] and the existence of Hamiltonian cycles can be decided in polynomial time [18]. For a comprehensive summary we refer to [13].

One of the applications given by Stockmeyer and Vazirani [23] for the induced matching problem can be stated as follows. We want to test (or use) a maximum number of connections between receiver-sender pairs in a network. However, testing a particular connection produces noise so that no other node in reach may be tested simultaneously. We remark that this type of motivation extends very naturally to convex bipartite graphs when we consider wireless networks in which nodes broadcast or receive messages in specific frequency ranges. Further, weighted edges can model the importance of connections.

Previous Work.

Yu, Chen and Ma [24] describe a polynomial-time algorithm that finds both a maximum-cardinality induced matching and a minimum chain cover in a convex bipartite graph. Their procedure is improved by Brandstädt, Eschen and Sritharan [3], resulting in a runtime of $O(n^2)$. Chang [6] computes maximum-cardinality induced matchings and minimum chain covers in $O(n+m)$ time in bipartite permutation graphs, which form a proper subclass of convex bipartite graphs. Recently, Pandey, Panda, Dane and Kashyap [19] gave polynomial algorithms for finding a maximum-cardinality induced matching in circular-convex and triad-convex bipartite graphs. These graph classes generalize convex bipartite graphs.

Our Contribution.

We improve the previous best $O(n^2)$ algorithm [3] for maximum-cardinality induced matchings and minimum chain covers in convex bipartite graphs in several ways. In Section 2 we give an algorithm for finding maximum-weight induced matchings in convex bipartite graphs with $O(n+m)$ runtime. The weighted problem has not been considered before. In Section 3 we specialize our algorithm to find induced matchings of maximum cardinality in $O(n)$ runtime, given a compact representation of the graph. In Section 4 we extend this approach to obtain in $O(n)$ time a compact representation of a minimum chain cover. If no compact representation is given, our approach is easily adapted to produce a minimum chain cover in $O(n+m)$ time.

All of our algorithms achieve optimal running time for the respective problem and model. Our results for finding a maximum-cardinality induced matching also improve the running times of the algorithms of Pandey et al. [19] for the circular-convex and triad-convex case, as they use the convex case as a building block.

2 Maximum-Weight Induced Matchings

In this section, we compute a maximum-weight induced matching of a given edge-weighted convex bipartite graph $G=(U,V,E)$ in time $O(n+m)$. We generally write $U$-indices as superscripts and $V$-indices as subscripts. We consider $E$ as a subset of $U\times V$. We assume that $V$ is numbered as described in Section 1 and the interval $\{L^i,\dots,R^i\}$ of each vertex $i\in U$ is given by the pair $(L^i,R^i)$ of the left and right endpoint. Each edge $(i,j)\in E$ has a weight $C^i_j$.

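Our dynamic-programming approach considers the following subproblems: For an edge $(i,j)\in E$, we define $W^i_j$ as the weight of the maximum-weight induced matching that uses the edge $(i,j)$ and contains only edges in $U\times\{1,\dots,j\}$. The following dynamic-programming recursion computes $W^i_j$:

$$W^i_j = C^i_j + \max\bigl(\{0\} \cup \{\, W^{i'}_{j'} \mid (i',j')\in E,\ j' < L^i,\ R^{i'} < j \,\}\bigr) \tag{1}$$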

The range over which the maximum is taken is illustrated in Figure 2.

Fig. 2: The table entries that go into the computation of $W^i_j$ are shaded: They lie in rows that end to the left of $j$ (marked by an arrow), and only the entries to the left of $L^i$ are considered.

In this recursion, we build the induced matching $H$ of weight $W^i_j$ by adding the edge $(i,j)$ to some induced matching $H'$ of weight $W^{i'}_{j'}$. We want $H$ to be an induced matching: By construction, the edge $(i',j')$ is independent of $(i,j)$, but we have to show that the other edges of $H'$ are also independent of $(i,j)$. In order to prove this (Lemma 2), we use a transitivity relation between independent edge pairs.

Observation 1.

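Two edges $(i,j)\in E$ and $(i',j')\in E$ with $j'<j$ are independent if and only if $j'<L^i$ and $R^{i'}<j$, where $L^i$ and $R^i$ denote the first and last neighbor of $i$.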

Lemma 1.

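Let $(i,j),(i',j'),(i'',j'')\in E$ with $j''<j'<j$. Assume that $(i,j)$ and $(i',j')$ are independent, and $(i',j')$ and $(i'',j'')$ are independent. Then $(i,j)$ and $(i'',j'')$ are independent.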

Proof.

By Observation 1, we have $j''<L^{i'}\le j'<L^i$ and $R^{i''}<j'\le R^{i'}<j$. Thus, $j''<L^i$ and $R^{i''}<j$. ∎
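The interval test of Observation 1 can be written down directly. The following is a minimal sketch, assuming the graph is given by two lists `L` and `R` holding the first and last neighbor of each vertex of U, and edges are pairs `(i, j)`; the function name `independent` is hypothetical.

```python
def independent(L, R, e1, e2):
    """Constant-time independence test for two edges of a convex bipartite graph.

    L[i] and R[i] are the first and last neighbor of U-vertex i.
    Each edge is a pair (i, j) with L[i] <= j <= R[i].
    """
    (i, j), (i2, j2) = e1, e2
    if j2 > j:
        # Rename so that (i, j) is the edge with the larger V-endpoint.
        (i, j), (i2, j2) = (i2, j2), (i, j)
    # Edges sharing their V-endpoint fail the first comparison, since L[i] <= j.
    return j2 < L[i] and R[i2] < j
```

With `L = [1, 2]` and `R = [2, 4]`, the edges `(0, 1)` and `(1, 4)` are reported as independent, while `(0, 2)` and `(1, 3)` are not, because `(1, 2)` is an edge of the graph.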

Lemma 2.

The recursion (1) is correct.

Proof.

By Observation 1, any edge $(i',j')$ with $j'<j$ that is independent of $(i,j)$ satisfies $j'<L^i$ and $R^{i'}<j$. By Lemma 1, all other edges $(i'',j'')$ used to obtain the matching value $W^{i'}_{j'}$ are also independent of $(i,j)$. ∎
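Before turning to the efficient evaluation, the recursion can be evaluated literally. The sketch below is deliberately naive (quadratic in the number of edges) and is meant only as a reference for the definition of the table, not as the algorithm of this section. It assumes positive weights, a graph given by lists `L` and `R`, and a dictionary `C` mapping each edge `(i, j)` to its weight; all names are hypothetical.

```python
def induced_matching_weight_naive(L, R, C):
    """Evaluate the recursion for W[(i, j)] directly and return the largest entry.

    L[i], R[i]: first and last neighbor of U-vertex i (columns are integers).
    C[(i, j)]: weight of edge (i, j), assumed positive.
    """
    edges = [(i, j) for i in range(len(L)) for j in range(L[i], R[i] + 1)]
    # W[(i, j)] only depends on entries in columns j2 < L[i] <= j,
    # so processing the edges by increasing column is a valid order.
    edges.sort(key=lambda e: e[1])
    W = {}
    for i, j in edges:
        best = 0
        for (i2, j2), value in W.items():
            if j2 < L[i] and R[i2] < j:
                best = max(best, value)
        W[(i, j)] = C[(i, j)] + best
    return max(W.values(), default=0)
```

Such a direct evaluation is handy for cross-checking a faster implementation on small instances.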

We create a table in which we record the entries $W^i_j$. We assume that the intervals are sorted in nondecreasing order by $L^i$, that is, $L^i\le L^{i+1}$ for $1\le i<n_U$. The values $W^i_j$ form the $i$-th row of the table. We fill the table row by row proceeding from $i=1$ to $i=n_U$. Each row $i$ is processed from left to right.

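The only challenge in evaluating (1) is the maximum-expression, for which we introduce the notation $M^i_j := \max\bigl(\{0\} \cup \{\, W^{i'}_{j'} \mid (i',j')\in E,\ j'<L^i,\ R^{i'}<j \,\}\bigr)$.

We discuss the computation of the leftmost entry $M^i_{L^i}$ later. When we proceed from $j$ to $j+1$ we want to go incrementally from $M^i_j$ to $M^i_{j+1}$. Direct comparison of the respective defining sets leads to

$$M^i_{j+1} = \max\bigl(\{M^i_j\} \cup \{\, W^{i'}_{j'} \mid (i',j')\in E,\ j'<L^i,\ R^{i'}=j \,\}\bigr) \tag{2}$$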
Fig. 3: Example. We are in the process of filling row from left to right. All rows with smaller index have been processed and are filled with the entries . Unprocessed entries are marked as “–”. The figure does not show the rows in the order in which they are processed, but intervals with the same right endpoint are grouped together. The bold entries collect the provisional maxima in each group. By way of example, the encircled entry is the maximum among the shaded entries of the intervals that end at , ignoring the yet unprocessed entries. As we proceed from to in row , the intervals with become relevant. The maximum usable entry from these intervals is found in position 17 of this array, because . The entry is marked by an arrow. The next entry will be computed as . (Some of these entries might not exist.) We can observe that the minimum over which is defined involves no unprocessed entries (Lemma 3). When the next row in the group with is later filled, it will be necessary to update .

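In order to evaluate the maximum of the second set in (2) efficiently, we group intervals with a common right endpoint together. Let $S_r$ be the earliest startpoint of an interval with endpoint $r$. If there are no intervals with endpoint $r$, we set $S_r := r$. (An empty index range would be more logical in this case, but this choice makes the algorithm simpler.) We maintain an array $P_r[k]$ for $S_r\le k\le r$ that is defined as follows:

$$P_r[k] := \max\bigl(\{0\} \cup \{\, W^{i'}_{j'} \mid R^{i'}=r,\ j'\le k,\ \text{row } i' \text{ has already been processed} \,\}\bigr)$$

In a sense, $P_r[k]$ is a provisional version of the expression $\max\{\, W^{i'}_{j'} \mid R^{i'}=r,\ j'\le k \,\}$, which takes into account only the already processed rows. For (2), we need the entry $P_j[L^i-1]$, and we will see that all relevant entries have already been computed whenever we access this entry. Thus, we rewrite (2):

$$M^i_{j+1} = \begin{cases} \max\{M^i_j,\ P_j[L^i-1]\} & \text{if } L^i > S_j, \\ M^i_j & \text{otherwise.} \end{cases} \tag{3}$$

The condition $L^i>S_j$ ensures that the array index $L^i-1$ does not fall below the left boundary $S_j$ of the array $P_j$. Also, the index never exceeds the right boundary $j$ of the array $P_j$, since $L^i\le j$, and therefore $L^i-1<j$. Thus, $P_j[L^i-1]$ is always defined when it is accessed.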

Lemma 3.

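When entry $W^i_{j+1}$ is processed, (2) and (3) define the same quantity $M^i_{j+1}$.

Proof.

We distinguish three cases.

Case 1: No interval ends at $j$, and accordingly, $S_j=j$.

In this case $M^i_{j+1}=M^i_j$ in (2) since its rightmost set is empty. Since $L^i\le j$, we have $L^i\le S_j$ and, thus, the right side of (3) also evaluates to $M^i_j$.

Case 2: There exists an interval ending at $j$, and $L^i\le S_j$. The right side of (3) evaluates to $M^i_j$. In (2), intervals $i'$ that end at $j$ have $L^{i'}\ge S_j\ge L^i$. Thus, an edge $(i',j')$ with $R^{i'}=j$ and $j'<L^i$ does not exist, and the second set in (2) is empty. Therefore, (2) evaluates to $M^i_j$.

Case 3: There exists an interval ending at $j$, and $L^i>S_j$. In this case, $P_j[L^i-1]$ is defined:

$$P_j[L^i-1] = \max\bigl(\{0\} \cup \{\, W^{i'}_{j'} \mid R^{i'}=j,\ j'<L^i,\ \text{row } i' \text{ has already been processed} \,\}\bigr) \tag{4}$$

For each entry $W^{i'}_{j'}$ with $j'<L^i$, we conclude that $L^{i'}\le j'<L^i$ and, thus, row $i'$ has already been processed. This means that the condition that row $i'$ was processed is redundant, and taking the maximum of $M^i_j$ and (4) coincides with the right side of (2). ∎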

After processing row $i$ with startpoint $L^i=l$ and endpoint $R^i=r$, we have to update the values in $P_r$. This is straightforward. Figure 3 illustrates the role of the arrays $P_r$ when processing a row.

It remains to discuss the computation of the first value $M^i_{L^i}$ of the row. An edge $(i,L^i)$ and an edge $(i',j')$ with $j'<L^i$ are independent if and only if the interval $i'$ ends before $L^i$, that is, $R^{i'}<L^i$. Since we process the intervals in nondecreasing order by their startpoints, it suffices to maintain a value $F$ with the maximum $W^{i'}_{j'}$ in all finished intervals: those intervals that end before $L^i$. In other words, $M^i_{L^i}=F$. This value is easily maintained by updating $F$ as $L^i$ increases. The full details are stated as Algorithm 1.

The update of the array $P_{R^i}$ in the second loop can be integrated with the computation of $W^i_j$ in the first loop. When this is done, the values $W^i_j$ need not be stored at all because they are not used. As stated earlier, when no interval ends at a point $r$, we set $S_r := r$. The array $P_r$ then consists of a single dummy entry $P_r[r]=0$. This way we avoid having to treat this special case during the algorithm.

Preprocessing:
for r := 1 to n_V do
       S_r := min{ L^i | R^i = r }     // startpoint of the longest interval with endpoint r
       Create an array P_r[S_r .. r] and initialize it to 0.
       (If there is no interval with endpoint r, set S_r := r and create an array P_r[r .. r] with a single dummy entry that will remain at 0.)
Main program:
F := 0     // maximum entry in finished intervals
for l := 1 to n_V do
       for all rows i with L^i = l do     // Process each interval that starts at l
             // Process the i-th interval and fill row i of the table:
             M := F     // M will be the current value of M^i_j; the leftmost entry is M^i_l = F
             for j := l to R^i do     // compute successive entries
                   W^i_j := C^i_j + M
                   if l > S_j then
                         M := max{M, P_j[l-1]}
             // Go through the computed entries again to update the array P_{R^i}:
             Q := 0     // the row maximum so far
             for j := l to R^i do
                   Q := max{Q, W^i_j}
                   P_{R^i}[j] := max{P_{R^i}[j], Q}
       F := max{F, P_l[l]}     // update F as l is incremented
return F     // the maximum weight of an induced matching
Algorithm 1 Weighted Maximum Matching
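The scheme described above can be sketched in Python as follows. This is an illustrative reading of the text, not the authors' reference implementation: the variable names (`S`, `P`, `F`, `M`, `Q`) and the input format are assumptions. Columns are numbered 1 to `n_v`, `L[i]` and `R[i]` are the interval endpoints of row `i`, `C[(i, j)]` is the weight of edge `(i, j)`, and weights are assumed positive. The update of the grouped arrays is merged into the main loop over each row.

```python
def max_weight_induced_matching(n_v, L, R, C):
    """Return the maximum weight of an induced matching in O(n + m) time."""
    n_u = len(L)
    # S[r]: earliest startpoint of an interval ending at r (S[r] = r if none).
    S = list(range(n_v + 1))
    for i in range(n_u):
        S[R[i]] = min(S[R[i]], L[i])
    # P[r][k - S[r]] holds the provisional maximum over processed rows that
    # end at r, restricted to columns <= k, for S[r] <= k <= r.
    P = [None] + [[0] * (r - S[r] + 1) for r in range(1, n_v + 1)]
    rows_by_start = [[] for _ in range(n_v + 1)]
    for i in range(n_u):
        rows_by_start[L[i]].append(i)  # bucket sort of the rows by startpoint

    F = 0  # maximum entry over all finished intervals
    for l in range(1, n_v + 1):
        for i in rows_by_start[l]:
            r = R[i]
            M = F  # maximum usable earlier entry for the current column
            Q = 0  # row maximum so far
            for j in range(l, r + 1):
                w = C[(i, j)] + M
                if l > S[j]:
                    # Rows ending at j become usable for column j + 1.
                    M = max(M, P[j][l - 1 - S[j]])
                Q = max(Q, w)
                P[r][j - S[r]] = max(P[r][j - S[r]], Q)
        # All rows ending at l are complete now; their maximum is P_l[l].
        F = max(F, P[l][l - S[l]])
    return F
```

For the two intervals `[1, 2]` and `[2, 3]` with unit weights, the function returns 2, realized by the edges `(0, 1)` and `(1, 3)`.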

We have described the computation of the value of the optimal matching. It is straightforward to augment the program so that the optimal matching itself can be recovered by backtracking how the optimal value was obtained, but this would clutter the program.

Theorem 1.

A maximum-weight induced matching of an edge-weighted convex bipartite graph can be computed in $O(n+m)$ time.

3 Maximum-Cardinality Induced Matchings

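For the unweighted version of the problem, we assume a compact representation of a convex bipartite graph $G=(U,V,E)$, that is, for each $i\in U$ we are given the startpoint $L^i$ and endpoint $R^i$ of its interval $\{L^i,\dots,R^i\}$. This makes it possible to obtain a linear runtime of $O(n)$.

The recursion (1) can be specialized to the unweighted case by setting $C^i_j\equiv 1$:

$$W^i_j = 1 + \max\bigl(\{0\} \cup \{\, W^{i'}_{j'} \mid (i',j')\in E,\ j'<L^i,\ R^{i'}<j \,\}\bigr) \tag{5}$$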

This recursion has already been stated in [24] and [3] in a slightly different formulation. Yu, Chen and Ma [24] describe it as a greedy-like procedure that “colors” the edges of a bipartite graph with the values $W^i_j$. From this coloring, they obtain both a maximum-cardinality induced matching and a minimum chain cover. Brandstädt, Eschen and Sritharan [3] give an improved implementation of the coloring procedure of [24] with runtime $O(n^2)$. Our Algorithm 1 from Section 2 obtains the values $W^i_j$ in total time $O(n+m)$.

Given a compact representation, we can exploit some structural properties of the filled dynamic-programming table to further improve the runtime to $O(n)$. The following observations were first given in [24] and [3].

Lemma 4 ([24, Lemma 5]).

The values $W^i_j$ are nondecreasing in each row.

Proof.

This is obvious from (5), since the set over which the maximum is taken increases with $j$. ∎

Lemma 5 ([3, Lemma 3.3, Lemma 3.4]).

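Each row contains at most two distinct values, and these are consecutive.

Proof.

Let $w$ be the largest value in some row $i$. Then, if we take a corresponding matching of size $w$, it is easy to see that we can remove the last two edges and replace them by an arbitrary edge $(i,j)$ of row $i$. This proves that $W^i_j\ge w-1$.

More formally, we can argue by the recursion (5): Assume there are values $W^i_j\le W^i_{j'}-2$ in row $i$. By Lemma 4 we can assume $j<j'$. By (5), $W^i_{j'}=1+W^{i'}_{j''}$ with $j''<L^i$ and $R^{i'}<j'$ for some $(i',j'')\in E$. Since $W^{i'}_{j''}=W^i_{j'}-1\ge 2$, (5) yields an edge $(i'',j''')$ with $W^{i''}_{j'''}=W^{i'}_{j''}-1$, $j'''<L^{i'}$ and $R^{i''}<j''$. Thus, $j'''<L^i$ and $R^{i''}<j''<L^i\le j$, and by definition of $W^i_j$ according to (5) we have $W^i_j\ge 1+W^{i''}_{j'''}=W^i_{j'}-1$, which is a contradiction. ∎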

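Set F := 0 and Q_r := 0 for r = 1, ..., n_V
for l := 1 to n_V do
       for all rows i with L^i = l do     // Process each interval that starts at l
             w := F + 1     // leftmost entry W^i_l
             T_w := min{ R^{i'} | row i' contains an entry W^{i'}_{j'} = w with j' < l }     // leftmost endpoint of such a row, or ∞ if there is none
             if T_w < R^i then     // There are two values w and w+1 in this row:
                    W^i_j = w for l ≤ j ≤ T_w
                    W^i_j = w+1 for T_w < j ≤ R^i
                    // The largest entry is w+1.
                    Q_{R^i} := max{Q_{R^i}, w+1}
             else     // The same entry w is used for the whole row.
                    // The largest entry is w.
                    Q_{R^i} := max{Q_{R^i}, w}
       F := max{F, Q_l}     // update F as l advances
return F
Algorithm 2 Unweighted Maximum Matching, initial version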
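Set T_w := n_V + 1 for all w     // (*) The value n_V + 1 acts like ∞.
Set F := 0 and Q_r := 0 for r = 1, ..., n_V
for l := 1 to n_V do
       for all rows i with L^i = l do     // Process each interval that starts at l
             w := F + 1     // leftmost entry W^i_l
             // (*) T_w is no longer computed from scratch
             if T_w < R^i then     // There are two values w and w+1 in this row:
                    W^i_j = w for l ≤ j ≤ T_w,
                    W^i_j = w+1 for T_w < j ≤ R^i.
                    // The largest entry is w+1.
                    Q_{R^i} := max{Q_{R^i}, w+1}
             else     // The same entry w is used for the whole row.
                    // The largest entry is w.
                    Q_{R^i} := max{Q_{R^i}, w}
       F := max{F, Q_l}     // update F as l is incremented
       for all entries W^i_l = w in column l do     // (*)
             T_w := min{T_w, R^i}     // (*)
return F
Algorithm 3 Unweighted Maximum Matching, second version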

Specializing Algorithm 1 to the unweighted case leads to a solution with $O(n+m)$ running time. Our $O(n)$-time algorithm will follow the general scheme of Algorithm 1, with the following modifications.

  • In view of Lemmas 4 and 5, we will not fill each row individually, but we will just determine the leftmost value $w$ and the position where the entries switch from $w$ to $w+1$ (if any).

  • The computation of the leftmost entry is exactly as in Algorithm 1: it equals $F+1$, where $F$ is the maximum entry in all finished intervals.

  • The position where the entries of row $i$ switch from $w$ to $w+1$ can be determined from (5): If there is a row $i'$ containing an entry $W^{i'}_{j'}=w$ left of $L^i$, then $W^i_j$ must be $w+1$ as soon as $j>R^{i'}$. The algorithm determines the threshold position $T_w$ as the smallest right endpoint $R^{i'}$ under these constraints. Then the entries $w+1$ in row $i$ start at $T_w+1$ if these entries are still part of the row.

  • We do not maintain the whole array $P_r$ for each $r$, but only its last entry $P_r[r]$; this is sufficient for updating $F$ and thus for computing the leftmost entries in the rows. We call this value $Q_r$.

This leads to Algorithm 2.

We will improve Algorithm 2 by maintaining the values $T_w$ instead of computing them from scratch. We use the fact that the smallest value $w$ in the row is known, and hence we can associate $T_w$ with the value $w$ instead of the row index $i$, as is already apparent from our chosen notation. We update $T_w$ whenever $l$ increases. The details are shown in Algorithm 3. The differences to Algorithm 2 are marked by (*).

This still does not achieve $O(n)$ running time, because Algorithm 3 visits every entry of every column. The final improvement comes from realizing that it is sufficient to update $T_w$ when $W^i_l=w$ is the leftmost entry with value $w$ in row $i$. The time when such an update occurs can be predicted when a row is generated. To this end, we maintain a list $D_l$ for $l=1,\dots,n_V$ that records the updates that are due when the current column becomes $l$. This final version is Algorithm 4.

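Initialize lists D_1, ..., D_{n_V} to empty lists
Set T_w := n_V + 1 for all w
Set F := 0 and Q_r := 0 for r = 1, ..., n_V
for l := 1 to n_V do
       for all rows i with L^i = l do     // Process each interval that starts at l
             w := F + 1     // leftmost entry W^i_l
             if T_w < R^i then     // There are two values w and w+1 in this row:
                    W^i_j = w for l ≤ j ≤ T_w,
                    W^i_j = w+1 for T_w < j ≤ R^i.
                    add (w+1, R^i) to the list D_{T_w + 1}     // don't forget to update T_{w+1} when l reaches T_w + 1
                    add (w, R^i) to the list D_l     // don't forget to update T_w when l advances
                    Q_{R^i} := max{Q_{R^i}, w+1}
             else     // The same entry w is used for the whole row.
                    add (w, R^i) to the list D_l
                    Q_{R^i} := max{Q_{R^i}, w}
       F := max{F, Q_l}     // update F as l advances
       for all (w, r) ∈ D_l do T_w := min{T_w, r}     // perform the necessary updates
return F
Algorithm 4 Unweighted Maximum Matching, final version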

The runtime of Algorithm 4 is $O(n)$: Processing each interval takes constant time and adds at most two pairs to the lists $D_l$. Thus, processing the lists for updating the array $T$ also takes only $O(n)$ time.

Some simplifications are possible: The addition of $(w,R^i)$ to the list $D_l$ in the case of two values can actually be omitted, as it leads to no decrease in $T_w$: $T_w$ is already smaller than $R^i$. The algorithm could be further streamlined by observing that at most two consecutive values of $T_w$ need to be remembered at any time.

Again, it is easy to modify the algorithm to return a maximum induced matching in addition to its size.
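The final version can be sketched in Python along the following lines. This is a hedged reading of the description above rather than the authors' code: the names `T`, `Q`, `F`, and `due` (the per-column update lists), as well as the input format (columns numbered 1 to `n_v`, lists `L` and `R` of interval endpoints), are assumptions made for illustration. Only the size of a maximum induced matching is returned.

```python
def max_cardinality_induced_matching(n_v, L, R):
    """Size of a maximum induced matching from a compact representation, in O(n)."""
    n_u = len(L)
    INF = n_v + 1  # acts like infinity for right endpoints
    rows_by_start = [[] for _ in range(n_v + 2)]
    for i in range(n_u):
        rows_by_start[L[i]].append(i)  # bucket sort of the rows by startpoint

    # T[w]: smallest right endpoint of a row with an entry w left of column l.
    T = [INF] * (n_u + 3)
    # Q[r]: largest entry among the rows that end at column r.
    Q = [0] * (n_v + 2)
    # due[l]: pairs (w, r) whose effect on T[w] starts after column l.
    due = [[] for _ in range(n_v + 2)]

    F = 0  # largest entry among finished intervals
    for l in range(1, n_v + 1):
        for i in rows_by_start[l]:
            w, r = F + 1, R[i]  # w is the leftmost entry of row i
            due[l].append((w, r))
            if T[w] < r:
                # The row switches to w + 1 at column T[w] + 1, which is > l.
                due[T[w] + 1].append((w + 1, r))
                Q[r] = max(Q[r], w + 1)
            else:
                Q[r] = max(Q[r], w)
        F = max(F, Q[l])  # intervals ending at l are finished now
        for w, r in due[l]:
            T[w] = min(T[w], r)
    return F
```

For the four intervals `[1, 1]`, `[1, 4]`, `[2, 5]`, `[3, 6]` the function returns 3, matching the induced matching formed by column 1 of the first row, column 2 of the third row, and column 6 of the fourth row.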

Theorem 2.

Given a compact representation, a maximum-cardinality induced matching of a convex bipartite graph can be computed in $O(n)$ time.

4 Minimum Chain Covers

In convex bipartite graphs, the size of a maximum-cardinality induced matching equals the number of chain subgraphs of a minimum chain cover [24]. In this section we use this duality and extend our Algorithm 4 to obtain a minimum chain cover of a convex bipartite graph .

Let $w_{\max}$ be the cardinality of a maximum induced matching of $G$. Accordingly, the values $W^i_j$ cover the range $\{1,\dots,w_{\max}\}$. We create $w_{\max}$ chain subgraphs $G_1,\dots,G_{w_{\max}}$ of $G$. The edges $(i,j)$ with $W^i_j=w$ will be part of the chain subgraph $G_w$.

As already observed in [24], the edges with a fixed value of $W^i_j$ may contain independent edges and, thus, do not necessarily constitute a chain graph. Accordingly, Yu, Chen, and Ma [24] describe a strategy to extend the edge set for each value of $w$ to a chain graph $G_w$. Brandstädt, Eschen, and Sritharan [3] give an improved implementation of this strategy with runtime $O(n^2)$. We implement their strategy in $O(n)$ time, given a compact representation. The correctness was already shown in [24]. We give a new independent proof. The following characterization is often used as an alternative definition of chain graphs:

Lemma 6.

A bipartite graph $(U,V,E)$ is a chain graph if and only if the sets of neighbors $N(i)$ of the vertices $i\in U$ form a chain in the inclusion order. (Equal sets are allowed.) In other words, among any two sets $N(i)$ and $N(i')$, one must be contained in the other.

Proof.

This is a direct consequence of the fact that edges $(i,j)$ and $(i',j')$ are independent if and only if $j\in N(i)\setminus N(i')$ and $j'\in N(i')\setminus N(i)$. ∎

The condition that the neighborhoods must form a chain is apparently the reason for calling these graphs chain graphs; however, we did not find a reference for this.
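The characterization of Lemma 6 translates into a short test. The sketch below assumes the graph is given as a list of neighbor collections, one per vertex of U; the function name `is_chain_graph` is hypothetical. It sorts the neighborhoods by size and compares only consecutive ones, which suffices because set inclusion is transitive.

```python
def is_chain_graph(neighborhoods):
    """Check whether the neighborhoods of the U-vertices are totally ordered by inclusion."""
    sets = sorted((frozenset(nbrs) for nbrs in neighborhoods), key=len)
    # In a chain, a smaller (or equal-sized) set must be contained in the next one.
    return all(a <= b for a, b in zip(sets, sets[1:]))
```

For instance, the neighborhoods `{1, 2}`, `{1}`, `{1, 2, 3}` form a chain, whereas `{1}` and `{2}` do not: the two corresponding edges are independent.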

We use  to denote the set of rows that contain entries . For every row , we determine the beginning and ending points  with this color, that is, . We extend every such interval  to the left by choosing a new starting point  according to the formula

(6)
(7)

The second expression uses the new values on the right-hand side. It is easy to see that the two expressions are equivalent: Using (6) for the definition of , the expression (7) becomes

(8)

The third set is contained in the second set, and thus, (8) is equal to  according to (6).

We construct the chain graph as the graph with the extended intervals . Figure 4 shows an example.

Fig. 4: An example showing a section of the computation of by Algorithm 4. The threshold values and are shown as they change with the rows that are successively considered. The shaded entries form the chain subgraph that is used for the chain cover.

It is obvious by construction that these intervals satisfy the conditions of a chain graph: By Lemma 6, we have to show that there are no two intervals , with and . But if the last condition holds, (7) ensures that .

The only thing that could go wrong is that becomes too small so that the chain graph is not a subgraph of . The following lemma shows that this is not the case.

Lemma 7.

for every .

Proof.

For the sake of contradiction, assume . By (6), there is a row  such that  and . Setting and in the recursion (5), we conclude that , because otherwise, (5) would imply . Thus, is an edge of . By Lemma 5, . By (5), there is an edge  with , and . Again by (5), such an edge would imply that , a contradiction. ∎

Algorithm 5 carries out the computation of (6). It processes the triples in increasing order of their endpoints. This can be done in linear time, by first sorting the triples into buckets according to the value of the endpoint. Thus, Algorithm 5 takes linear time $O(n)$. By Lemma 6, the result is a chain cover, which by duality is minimum. Each row belongs to at most two chain subgraphs, and thus the chain cover consists of at most $2n_U$ such row intervals in total. It is straightforward to extend Algorithm 4 to compute the sets of rows for each value and the corresponding interval endpoints, and thus the cover can be constructed in $O(n)$ time in compressed form.

Theorem 3.

Given a compact representation of a convex bipartite graph, a compact representation of a minimum chain cover can be computed in $O(n)$ time.

Given a compact representation of a minimum chain cover, we can list all the edges of its chain subgraphs in $O(n+m)$ time since every edge is contained in at most two chain subgraphs. As mentioned in the introduction, a compact representation of a convex bipartite graph can be computed in $O(n+m)$ time [20, 22, 2]. Thus, Algorithm 4 and Algorithm 5 can also be used to obtain:

Theorem 4.

A minimum chain cover of a convex bipartite graph can be computed in $O(n+m)$ time.

5 Certification of Optimality

An induced matching together with a chain cover of the same cardinality provides a certificate of optimality, of size $O(n)$. As we will establish in the following discussion, it is easy to check this certificate for validity in linear time. This is easier than constructing the largest induced matching with our algorithm. Thus, it is possible to establish correctness of the result beyond doubt, for each particular instance of the problem, without having to trust the correctness of our algorithms and their implementations; see [17] for a survey of this concept.

Let and such that in row , the entries with are those with
Set The value acts like
for  do
       We maintain the quantities for .
       for all  with  do
            
      for all  with  do update for the increment of
            
      
Algorithm 5 Constructing a chain graph ,

It is trivial to check whether the matching is contained in the graph. To test whether it forms an induced matching, we sort the edges $(i,j)$ by their endpoint $j$ in $V$. This takes linear time with bucket-sort. Then, by Lemma 1, it is sufficient to test consecutive edges for independence, and each such test takes only constant time according to Observation 1.
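The matching half of the certificate check can be sketched as follows. The input format (lists `L` and `R` of interval endpoints, a list of edges `(i, j)`) and the function name are assumptions for illustration. For brevity the sketch uses Python's comparison sort rather than the bucket sort mentioned above, so it runs in O(k log k) time for k edges instead of linear time.

```python
def is_induced_matching(L, R, matching):
    """Verify that `matching` is an induced matching of the convex bipartite graph.

    L[i], R[i]: first and last neighbor of U-vertex i.
    matching: iterable of edges (i, j).
    """
    edges = list(matching)
    # Every edge must be contained in the graph.
    if any(not (L[i] <= j <= R[i]) for i, j in edges):
        return False
    edges.sort(key=lambda e: e[1])
    # By transitivity, testing consecutive edges in sorted order suffices.
    # Each test is the constant-time interval criterion for independence.
    return all(j1 < L[i2] and R[i1] < j2
               for (i1, j1), (i2, j2) in zip(edges, edges[1:]))
```

With `L = [1, 2]` and `R = [2, 3]`, the edge set `[(1, 3), (0, 1)]` passes the check, while `[(0, 2), (1, 3)]` fails because `(1, 2)` is an edge of the graph.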

To establish the validity of a chain cover , we need to check that the edges of  are covered and each is a chain subgraph. The chain subgraphs , for are compactly represented by a set of at most quadruples . The following checking procedure works in linear time for any chain cover as long as it consists of convex bipartite subgraphs. It does not use any special properties of the cover produced by our algorithm.

We sort the quadruples lexicographically. Then it is easy to check the chain graph property using the characterization of Lemma 6: The intervals that belong to a fixed chain graph