Parameterized Complexity of Two-Interval Pattern Problem

02/12/2020 ∙ by Prosenjit Bose, et al. ∙ Carleton University

A 2-interval is the union of two disjoint intervals on the real line. Two 2-intervals D_1 and D_2 are disjoint if their intersection is empty (i.e., no interval of D_1 intersects any interval of D_2). There can be three different relations between two disjoint 2-intervals; namely, preceding (<), nested (⊏) and crossing (≬). Two 2-intervals D_1 and D_2 are called R-comparable for some R∈{<,⊏,≬}, if either D_1 R D_2 or D_2 R D_1. A set 𝒟 of disjoint 2-intervals is ℛ-comparable, for some ℛ⊆{<,⊏,≬} and ℛ≠∅, if every pair of 2-intervals in 𝒟 is R-comparable for some R∈ℛ. Given a set of 2-intervals and some ℛ⊆{<,⊏,≬}, the objective of the 2-interval pattern problem is to find a largest subset of 2-intervals that is ℛ-comparable. The 2-interval pattern problem is known to be W[1]-hard when |ℛ|=3 and NP-hard when |ℛ|=2 (except for ℛ={<,⊏}, which is solvable in quadratic time). In this paper, we fully settle the parameterized complexity of the problem by showing it to be W[1]-hard for both ℛ={⊏,≬} and ℛ={<,≬} (when parameterized by the size of an optimal solution); this answers an open question posed by Vialette [Encyclopedia of Algorithms, 2008].


1 Introduction

Interval graphs and their generalizations are often used to study problems in resource allocation, scheduling, and DNA mapping. In 2002, Vialette [11] proposed a geometric description of RNA helices in an attempt to improve the understanding of the computational complexity of finding structured patterns in RNA sequences. In particular, Vialette modeled the RNA secondary structure using a set of 2-intervals, which inspired subsequent research (e.g., see [13]) on the properties of the geometric graphs arising from such representations. The 2-interval pattern problem, introduced by Vialette [12], has been widely studied and is the main topic of this paper.

A 2-interval is the union of two disjoint intervals on the real line. Two 2-intervals D_1 and D_2 are disjoint if their intersection is empty; that is, no interval of D_1 intersects any interval of D_2. We can define three different relations between two disjoint 2-intervals: one 2-interval lies entirely to the left of the other one (called preceding and denoted by <), one 2-interval is nested within the other one (called nested and denoted by ⊏), and the intervals of the two 2-intervals alternate on the real line (called crossing and denoted by ≬). See Figure 1(a) for an example; a formal definition is given in Section 2. Two 2-intervals D_1 and D_2 are R-comparable for some R∈{<,⊏,≬} if either D_1 R D_2 or D_2 R D_1. A set 𝒟 of disjoint 2-intervals is ℛ-comparable, for some ℛ⊆{<,⊏,≬} and ℛ≠∅, if every pair of 2-intervals in 𝒟 is R-comparable for some R∈ℛ. In the 2-interval pattern problem, we are given a set of 2-intervals and a set ℛ⊆{<,⊏,≬}, and the objective is to compute a largest subset of 2-intervals that is ℛ-comparable. Figure 1(b) illustrates such an example.

The 2-interval pattern problem can model various scenarios in the context of RNA structure prediction. While looking for certain RNA structures, some common approaches to cope with intractability are either to restrict the class of pseudoknots [9] or to apply heuristics [3, 8, 10]. Vialette [12] proposed that one can obtain a relevant set of 2-intervals from an RNA sequence by selecting stable stems, e.g., using a simplified thermodynamic model without accounting for loop energy [10, 12, 15]. Then, the prediction of the RNA structure is equivalent to finding a maximum subset of non-conflicting (i.e., disjoint) 2-intervals.

Figure 1: (a) An example for showing the three possible relations between a pair of 2-intervals; here, the same-colour intervals form a 2-interval. Then, , and . (b) An instance of 2-interval pattern problem with and the 2-intervals are . The 2-intervals form a largest subset that is -comparable.

Related work.

Vialette [12] observed that if |ℛ|=1, then the 2-interval pattern problem is polynomial-time solvable by reductions to the maximum independent set problem on interval graphs, or to the maximum clique problem on comparability graphs. The running times of these algorithms have since been improved and expressed in terms of the number of input 2-intervals and various interval-related parameters such as their lengths or overlap [2]. For the case |ℛ|=2, the problem is solvable in polynomial time when ℛ={<,⊏} [12]. However, if ℛ={<,≬} or ℛ={⊏,≬}, then the problem is known to be NP-hard, even if the intervals of every 2-interval have unit length [12, 1]. If |ℛ|=3, i.e., ℛ={<,⊏,≬}, then the NP-hardness of the problem follows from the hardness of recognizing 2-interval graphs [14].
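To make the interval-graph connection concrete for the single relation ℛ={<}, the following is a minimal sketch, not the algorithm of [12] or [2]. It rests on the observation that two 2-intervals are <-comparable exactly when their covering intervals (from the left end of the first interval to the right end of the second) are disjoint, so the classical earliest-right-endpoint greedy for a maximum independent set of intervals applies. The tuple representation `((a, b), (c, d))` and the function name are assumptions made for illustration.

```python
# Sketch for R = {<} only. A 2-interval is modelled (hypothetically) as
# ((a, b), (c, d)) with closed intervals [a, b] and [c, d], and b < c.

def max_preceding_subset(two_intervals):
    """Greedy maximum subset that is pairwise comparable under '<'.

    Two 2-intervals are '<'-comparable iff their covering intervals
    [a, d] are disjoint, so we run the classical interval-scheduling
    greedy on the covering intervals.
    """
    chosen = []
    last_right = float("-inf")
    # Process 2-intervals by increasing right endpoint of the cover.
    for d in sorted(two_intervals, key=lambda d: d[1][1]):
        (left, _), (_, right) = d
        # Strict inequality: closed intervals that touch do intersect.
        if left > last_right:
            chosen.append(d)
            last_right = right
    return chosen
```

The greedy runs in O(n log n) time for n input 2-intervals; it does not extend to the nested or crossing relations, which need the other reductions mentioned above.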

The approximability of the NP-hard models of the 2-interval pattern problem was studied by Crochemore et al. [4]. They gave polynomial-time algorithms for the problem with approximation factors 4 when or , and 6 when . They also showed that the results hold for the weighted case, i.e., when each 2-interval is associated with a weight and the goal is to find a maximum-weight subset. These factors are improved to 3 when the intervals of every input 2-interval have unit length [4], where they also considered the case when the 2-intervals are weighted. For ℛ={<,≬} (and arbitrary input 2-intervals), Jiang [6] improved the approximation factor to 2 and subsequently to 1+ε for any ε>0 [7].

The problem is W[1]-hard when ℛ={<,⊏,≬}, because in this case, the problem is equivalent to computing a maximum independent set on 2-interval graphs, and the latter is known to be W[1]-hard [5]; see Section 3 for more details. For |ℛ|=2, to the best of our knowledge, the only parameterized result is the work of Blin et al. [1], who proved that the problem is fixed-parameter tractable, but only when , the input intervals all have unit length and the tractability is with respect to the forward crossing number: the maximum number of 2-intervals that cross a 2-interval “from the right”.

Our results.

In this paper, we answer a question of Vialette [13] by proving that the 2-interval pattern problem is W[1]-hard when ℛ={⊏,≬} and ℛ={<,≬}. Our W[1]-hardness result is inspired by the reduction of the k-independent set problem used by Fellows et al. [5]. Their reduction requires all three relations (i.e., they prove the W[1]-hardness of the 2-interval pattern problem when ℛ={<,⊏,≬}). Prior to our work, it was known that the complexity of the problem is polynomial when ℛ={<,⊏} [2], but it was unknown whether the problem is fixed-parameter tractable (when parameterized by the size of an optimal solution) or W[1]-hard for ℛ={⊏,≬} and ℛ={<,≬}. Hence, our W[1]-hardness result fully settles the parameterized complexity of the 2-interval pattern problem.

2 Preliminaries

In this section, we give some definitions and notation that will be used throughout the paper.

A 2-interval is the union of two disjoint intervals on the real line; that is, D=(I,J), where the interval I lies to the left of the interval J. For a pair of disjoint intervals I and J, we write I<J when I is to the left of J. Two 2-intervals D_1=(I_1,J_1) and D_2=(I_2,J_2) are disjoint if (I_1∪J_1)∩(I_2∪J_2)=∅. Moreover, for two disjoint 2-intervals D_1 and D_2, we say that D_1 is preceding (resp., nested in, crossing) D_2 if I_1<J_1<I_2<J_2 (resp., I_2<I_1<J_1<J_2, I_1<I_2<J_1<J_2). We write D_1<D_2 (resp., D_1⊏D_2, D_1≬D_2) when D_1 is preceding (resp., nested in, crossing) D_2.

We say that two 2-intervals D_1 and D_2 are R-comparable, for some R∈{<,⊏,≬}, if (i) D_1 and D_2 are disjoint and (ii) either D_1 R D_2 or D_2 R D_1. Let 𝒟 be a set of 2-intervals on the real line, and let ℛ⊆{<,⊏,≬} such that ℛ≠∅. Then, a set 𝒟′⊆𝒟 is called ℛ-comparable if every pair of 2-intervals in 𝒟′ is R-comparable for some R∈ℛ. Given 𝒟 and some ℛ, the objective of the 2-interval pattern problem is to compute a largest subset 𝒟′⊆𝒟 such that 𝒟′ is ℛ-comparable.
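The definitions above can be illustrated with a small sketch. It assumes a 2-interval is given as `((a, b), (c, d))`, meaning a closed interval [a, b] lying strictly to the left of a closed interval [c, d]; the constant and function names are hypothetical. The search over subsets is exponential brute force, included only to pin down what an ℛ-comparable subset is, and is not an algorithm from the literature.

```python
from itertools import combinations

PRECEDING, NESTED, CROSSING = "<", "nested", "crossing"

def left_of(x, y):
    """Closed interval x lies strictly to the left of closed interval y."""
    return x[1] < y[0]

def relation(d1, d2):
    """Return the relation R with d1 R d2, or None if no relation holds.

    Each relation forces a left-to-right order on all four intervals,
    so a non-None result also certifies that d1 and d2 are disjoint.
    """
    (i1, j1), (i2, j2) = d1, d2
    if left_of(j1, i2):                                   # I1 < J1 < I2 < J2
        return PRECEDING
    if left_of(i2, i1) and left_of(j1, j2):               # I2 < I1 < J1 < J2
        return NESTED
    if left_of(i1, i2) and left_of(i2, j1) and left_of(j1, j2):
        return CROSSING                                   # I1 < I2 < J1 < J2
    return None

def comparable(d1, d2, model):
    """True iff d1 and d2 are R-comparable for some R in `model`."""
    return relation(d1, d2) in model or relation(d2, d1) in model

def largest_comparable_subset(two_intervals, model):
    """Brute force: try subsets from largest to smallest (exponential)."""
    for size in range(len(two_intervals), 0, -1):
        for subset in combinations(two_intervals, size):
            if all(comparable(a, b, model) for a, b in combinations(subset, 2)):
                return list(subset)
    return []
```

For example, with `A = ((0, 1), (4, 5))`, `B = ((2, 3), (6, 7))` and `C = ((8, 9), (10, 11))`, A crosses B and both precede C, so all three are selected under the model `{PRECEDING, CROSSING}`, whereas only A and B survive under `{CROSSING}`.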

Given a graph G and a parameter k, the k-independent set problem asks whether there is an independent set of size k in G. Fellows et al. [5] proved that the k-independent set problem is W[1]-hard on 2-interval graphs when . Our W[1]-hardness results are also based on showing reductions from the k-Multicoloured Clique Problem, which is known to be W[1]-hard [5]. The problem is defined as follows.

Problem: k-Multicoloured Clique. Input: A graph G, and a vertex-colouring c: V(G)→{1,…,k} for G. Question: Is there a clique of size k in G such that, for each i∈{1,…,k}, there is exactly one vertex in the clique that has colour i?
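As a concrete reading of this problem statement, here is a minimal brute-force sketch that tries one vertex per colour class. The input format (an edge list plus a dictionary mapping each vertex to a colour in 1..k) and the function name are assumptions made for illustration. The running time is exponential in k, in line with the problem being W[1]-hard.

```python
from itertools import combinations, product

def find_multicoloured_clique(edges, colour, k):
    """Return one vertex per colour 1..k forming a clique, or None.

    edges  : iterable of (u, v) pairs (undirected)
    colour : dict mapping each vertex to a colour in {1, ..., k}
    """
    adjacent = {frozenset(e) for e in edges}
    # Group the vertices by colour; class i holds the vertices of colour i.
    classes = [[v for v, c in colour.items() if c == i] for i in range(1, k + 1)]
    # Try every way of picking exactly one vertex from each colour class.
    for choice in product(*classes):
        if all(frozenset((u, v)) in adjacent for u, v in combinations(choice, 2)):
            return list(choice)
    return None
```

Because one vertex is chosen per colour class, edges between equally coloured vertices are never inspected.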

3 W[1]-Hardness

In this section, we prove that the 2-interval pattern problem is W[1]-hard when ℛ={⊏,≬} and ℛ={<,≬}. Our reduction is inspired by that of Fellows et al. [5]. Let be an instance of the k-multicoloured clique problem; we assume w.l.o.g. for our purposes that is a proper colouring (otherwise, one can remove the edges whose end vertices are coloured with the same colour). We construct a set of 2-intervals such that has a multicoloured clique of size k if and only if contains a set of disjoint 2-intervals that are pairwise comparable in one of the relations in ℛ; the value of will be clear from our construction. We first describe an outline of the construction and the corresponding gadgets. Then, we give the details on how to organize the gadgets on the real line specific to each of the sets ℛ={⊏,≬} and ℛ={<,≬}. For a colour , let denote the set of vertices of that have colour . Moreover, for every distinct pair of colours , let denote the set of edges of such that . That is, consists of all the edges whose end vertices are coloured with two distinct colours and .

Outline.

The construction consists of two main types of gadgets: selection and validation. By selection gadgets, we ensure that 2-intervals representing vertices with distinct colours and edges with distinct pairs of colours are selected. By validation gadgets, we ensure that the selected set of 2-intervals is valid in the sense that the selected vertices are actually adjacent in the graph and the selected edges are indeed over the selected set of vertices. We group the 2-intervals corresponding to vertices of the same colour together in a vertex-selection gadget in such a way that any feasible solution for the 2-interval pattern problem will have 2-intervals corresponding to one vertex per vertex-selection gadget. Similarly, we group the 2-intervals corresponding to edges with the same pairs of distinct colours together in an edge-selection gadget such that any feasible solution for the 2-interval pattern problem will have 2-intervals corresponding to one edge with . We will then organize the gadgets on the real line in such a way that any feasible solution will contain 2-intervals that are ℛ-comparable.

Given , we associate two 2-intervals for each vertex . Moreover, we associate four 2-intervals for each edge : two 2-intervals and for each “direction” of the edge and two 2-intervals and that are undirected. The 2-intervals for “directed” edges will be used for validation, and we will show below how they are constructed. Therefore, the number of 2-intervals of the constructed instance will be . We next give the details of each type of gadget.

Vertex-selection gadget.

For each colour , we construct a vertex-selection gadget. The gadget has two components, which we denote by and ; see Figure 2 for an illustration. The component has “rows” of intervals, each of which has “columns”; each row corresponds to a vertex of with colour . The intervals in the same column pairwise intersect. Moreover, for the intervals in a fixed column , we assign an offset such that each interval in row intersects the interval that is in column and row ; see Figure 2. The component consists of two columns of intervals, and each column has rows. Here, we assign an offset such that the interval in the first column and row intersects the interval in the second column and row (see Figure 2).

For each vertex , we associate two 2-intervals and as follows. The first (resp., second) 2-interval (resp., ) is composed of the interval in the first (resp., last) column of that corresponds to and the interval in the first (resp., second) column of that corresponds to . These 2-intervals are illustrated with dashed lines in Figure 2. Each of the remaining columns in corresponds to a colour in . These intervals are later paired with intervals from edge-selection gadgets to form 2-intervals that correspond to “directed” edges. Notice that the intervals of the first column of pairwise intersect, ensuring that at most one 2-interval corresponding to a vertex with colour can appear in any feasible solution. Similarly, for the intermediate intervals of (i.e., the intervals of excluding those in the first and last column), it means that all the edges of a -multicoloured clique with at least one endpoint with colour are incident to the same vertex in .

Figure 2: A vertex-selection gadget and the two 2-intervals and corresponding to a vertex .

Lemma 3.1.

Let be a feasible solution for the 2-interval pattern problem, and consider the vertex-selection gadget corresponding to colour . Moreover, let be the set of 2-intervals such that each 2-interval in has at least one interval in . If , then all the intervals in are selected from the same row of .

Proof.

Since there are columns in the component of , cannot have more than 2-intervals, each containing at least one interval from . Hence, . This means that must contain exactly one interval from every column of and hence, one from every column of . Consider the interval in the first column of (that is in ) and assume that this interval is in row , for some ; it corresponds to a vertex . We now show that every other interval in must also be in row . Since and is a feasible solution, the interval in the first column of and row must also be in because these two intervals form one of the two 2-intervals corresponding to . Now, suppose that the interval in that is from the second column of is in row . Clearly, (i.e., lies below ) because otherwise the interval of that is in the first column of would intersect this interval due to the offset. Since and is a feasible solution, must contain the interval in the last column of that is in row (as only these two would form a valid 2-interval while considering ). If , then it is not possible to have exactly one interval from column of in for all because the offset would imply that at least two intervals must intersect in . Therefore, . In the same way, we can show that the subsequent intervals of must also be in row . ∎

Observe that the assignment of two 2-intervals for each vertex and the placement of their second intervals in with an offset allowed us to argue that the remaining intervals are also selected from the same row of the vertex-selection gadget. We will use a similar construction to argue the same for edge-selection gadgets. Before we continue, one might wonder why we needed and why we could not have only with one 2-interval for each vertex. Although this would force the selection of the remaining intervals from the same row, it is impossible to place such a gadget on the real line while maintaining ℛ-comparability. To ensure ℛ-comparability, we will need to place and on different parts of the real line, possibly far apart from each other.

Edge-selection gadget.

For each distinct pair of colours , we construct an edge-selection gadget. The gadget has two main components, which we denote by and . The component has rows of intervals each of which corresponds to an edge of such that ; see Figure 3(a). Each row has four columns of intervals; the intervals in the same column pairwise intersect. Moreover, there is an offset such that an interval in column intersects the interval in column that is in the row immediately above it. The component has rows and only two columns. There is also an offset between the intervals similar to the offset defined for the intervals in ; see Figure 3(b). The row in corresponds to an edge if and only if the row in corresponds to the edge .

Figure 3: An illustration of the two components of an edge-selection gadget; namely, (a) and (b) . The two 2-intervals and corresponding to the edge are shown dashed-dotted red.

Recall that for each edge in , we associate four 2-intervals; we next describe the construction of these 2-intervals. Let such that , and . Then, the 2-interval (resp., ) is composed of the interval in the first column (resp., last column) of the row corresponding to in and the first interval (resp., second interval) of the row corresponding to in . See Figure 3 for an illustration. The 2-interval (associated with the “directed” edge ) is composed of the interval in the second column of and the interval in the vertex-selection gadget of that is in the row corresponding to vertex and the column for colour . The 2-interval corresponding to the “directed” edge is constructed in a similar way: it consists of the interval in the third column of and the interval in the vertex-selection gadget of that is in the row corresponding to vertex and the column for colour . Figure 4 illustrates an example for constructing the two 2-intervals corresponding to such “directed” edges. Note that the latter two 2-intervals that correspond to “directed” edges are used for validation: they ensure that if the 2-intervals of a vertex with colour are selected, then all the selected edges with an endpoint of colour are incident to .

Lemma 3.2.

Let be a feasible solution for the 2-interval pattern problem, and consider an edge-selection gadget . If there are four 2-intervals in such that each of them has at least one interval in , then all such four 2-intervals must have intervals from the same row of .

Proof.

The proof uses an argument similar to the one we used in the proof of Lemma 3.1. Suppose that corresponds to edges with colours and , where . Now, consider the gadget . Clearly, can have at most one interval from each column of . Since has four 2-intervals that have at least one interval in , the set contains exactly one interval from each column of . Suppose that the interval of the first column of (that is in ) is at row for some . Notice that this interval forms a 2-interval with the first interval in row of and so that interval must also be in (these two intervals form a valid 2-interval and is a feasible solution). We now show that the interval of the last column of (that is in ) must also be at row . Suppose for the sake of contradiction that it is at a row . First, by construction, this interval forms a 2-interval with the second interval in row of and so that interval must also be in . If , then the second interval in row of intersects with the first interval in row of by the construction and so they cannot both be in —a contradiction. Moreover, if , then cannot contain an interval from both the second and third columns of because at least one of them intersects the interval of that is in either the first or the last column of —a contradiction. Therefore, and so the two intervals in that are in are also from the same row . Finally, the fact that forces the intervals in the second and third columns of (that are in ) to be also from the row . ∎

Figure 4: An illustration of the two 2-intervals corresponding to the “directed” edges and , assuming and . The dashed (red) rectangles shown in gadgets indicate the first and last columns of intervals in the gadget.

By the above constructions, we obtain the set of 2-intervals as

Since we associate each vertex with two 2-intervals and each edge with four 2-intervals, we have . The construction of our gadgets can all be done in -time. In the following, we show the arrangement of the gadgets on the real line specific to each of and . Then, we show that any -multicoloured clique in corresponds to pairwise disjoint 2-intervals of . For brevity, let for the rest of this section.

3.1 Hardness for

We now show how to arrange the gadgets on the real line when . To this end, consider the ordering of colours. We place the gadgets on disjoint regions of the real line from left to right as follows. First, for each pair of distinct colours and , , we place the gadget on the line in this order; that is, we first place the gadgets for all , then the gadgets for all and so on. Then, we place the gadgets () from left to right in the increasing order of . Next, we place the gadgets , in the same order as we placed their corresponding gadgets . Finally, we place the gadgets () in the same order as we placed their corresponding gadgets . See Figure 5 for an example. This forms our instance of the 2-interval pattern problem, where and . Clearly, this arrangement can be done in -time. Moreover, one can verify that any two 2-intervals in this instance are -comparable, where .

Lemma 3.3.

Graph has a -multicoloured clique if and only if the 2-interval pattern problem on has a feasible solution of size with respect to .

Proof.

Suppose that has a -multicoloured clique. For each colour , let be the vertex in the clique with colour . Then, for every colour , we select the two 2-intervals and from the vertex-selection gadget corresponding to . Moreover, for every pair of colours and with , let be the edge in the clique such that and . Then, we select the four 2-intervals and . In this way, we have selected 2-intervals in total. Moreover, by the arrangement of gadgets on the real line, one can verify that this set of 2-intervals is -comparable.
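The elided total above can be sanity-checked from the selection just described. The following worked count assumes only what the text states, namely two 2-intervals per colour and four 2-intervals per pair of distinct colours among k colours; the closed form is our reading of the count, not a value quoted from the source.

```latex
\underbrace{2k}_{\text{two per colour}}
\;+\;
\underbrace{4\binom{k}{2}}_{\text{four per pair of colours}}
\;=\; 2k + 2k(k-1)
\;=\; 2k^{2}.
```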

Figure 5: The arrangement of gadgets for .

Consider a set of 2-intervals that is a feasible solution for the 2-interval pattern problem with respect to . First, observe that can have at most one interval from the first column of every vertex-selection gadget. We now show that it must contain at least one such interval from the first column of every vertex-selection gadget. Let (resp., ) be the set of 2-intervals such that each 2-interval in has at least one interval in a vertex-selection gadget (resp., an edge-selection gadget). Moreover, let (resp., ) be the set of 2-intervals such that each 2-interval in (resp., ) has exactly two intervals from the same vertex-selection gadget (resp., the same edge-selection gadget). Observe that because the component of an edge-selection gadget has four columns and no two intervals in can come from the same column of any given . This means that . But, there are exactly vertex-selection gadgets and at most two 2-intervals of can be from the same vertex-selection gadget. Hence, and so . Since there are exactly edge-selection gadgets, it follows that we have exactly four 2-intervals in that come from the same edge-selection gadget. By Lemma 3.2, all the 2-intervals coming from the same edge-selection gadget lie in the same row of the gadget.

On the other hand, because we have vertex-selection gadgets, the component of any vertex-selection gadget has “internal” intervals, and at most one of such internal intervals (per column, per vertex-selection gadget) can be in . Notice that a 2-interval has exactly one interval in a vertex-selection gadget if and only if it has exactly one interval in an edge-selection gadget. Therefore, . Since (or, ) forms a partition of , we must have . That is, there are exactly 2-intervals that have exactly one interval in a vertex-selection gadget and the other interval in an edge-selection gadget. Notice that at most of such 2-intervals can come from the same vertex-selection gadget. Since there are vertex-selection gadgets, there are exactly of them from each vertex-selection gadget. This means that, for each vertex-selection gadget, there are 2-intervals in that come from this gadget. By Lemma 3.1, these 2-intervals all come from the same row of the gadget. Hence, we select the vertices corresponding to these rows. We now claim that they are a feasible solution for the -multicoloured clique. Clearly, each selected vertex has a unique colour. Moreover, take any colour and let be the vertex that we selected with colour . Recall that all the intervals of that come from the vertex-selection gadget are in the same row as that of . There are of them (excluding those corresponding to itself) and each is paired with an interval in an edge-selection gadget corresponding to the pair of colours, for all colours . Therefore, there exists an edge between and every other selected vertex and so the selected vertices are indeed a feasible solution for the -multicoloured clique. ∎

By Lemma 3.3, we have the following theorem.

Theorem 3.1.

The 2-interval pattern problem is -hard when .

3.2 Hardness for

We now show that the 2-interval pattern problem is -hard even when . To this end, we show how to arrange the gadgets on the real line such that any pair of two 2-intervals are -comparable. Then, one can prove a result similar to Lemma 3.3 for , concluding that the problem is -hard even for . Here, we only show the arrangement.

Figure 6: The arrangement of gadgets for .

Consider the ordering of colours. We place the gadgets on disjoint regions of the real line from left to right as follows. First, for each pair of distinct colours and , , we place the gadget on the line in this order; that is, we first place the gadgets for all , then the gadgets for all and so on. Then, we place the gadgets , in the same order as we placed their corresponding gadgets . Next, we place the gadgets () from left to right in the increasing order of . Finally, we place the gadgets () in the same order as we placed their corresponding gadgets . See Figure 6 for an example. This forms our instance of the 2-interval pattern problem, where and . Clearly, this arrangement can be done in -time and one can verify that every pair of 2-intervals is -comparable.

Theorem 3.2.

The 2-interval pattern problem is -hard when .

4 Conclusion

In this paper, we showed that the 2-interval pattern problem is W[1]-hard when ℛ={⊏,≬} and ℛ={<,≬}, thereby fully settling the parameterized complexity of the problem when parameterized by the size of an optimal solution. It would be interesting to examine FPT algorithms with respect to other parameters such as the maximum number of pairwise intersecting 2-intervals.

References

  • [1] Guillaume Blin, Guillaume Fertin, and Stéphane Vialette. New results for the 2-interval pattern problem. In proceedings of the 15th Annual Symposium on Combinatorial Pattern Matching (CPM 2004), Istanbul, Turkey, pages 311–322, 2004.
  • [2] Erdong Chen, Linji Yang, and Hao Yuan. Improved algorithms for largest cardinality 2-interval pattern problem. J. Comb. Optim., 13(3):263–275, 2007.
  • [3] Jih-H. Chen, Shu-Yun Le, and Jacob V. Maizel. Prediction of common secondary structures of RNAs: a genetic algorithm approach. Nucleic Acids Research, 28(4):991–999, 2000.
  • [4] Maxime Crochemore, Danny Hermelin, Gad M. Landau, Dror Rawitz, and Stéphane Vialette. Approximating the 2-interval pattern problem. Theor. Comput. Sci., 395(2-3):283–297, 2008.
  • [5] Michael R. Fellows, Danny Hermelin, Frances A. Rosamond, and Stéphane Vialette. On the parameterized complexity of multiple-interval graph problems. Theor. Comput. Sci., 410(1):53–61, 2009.
  • [6] Minghui Jiang. A 2-approximation for the preceding-and-crossing structured 2-interval pattern problem. J. Comb. Optim., 13(3):217–221, 2007.
  • [7] Minghui Jiang. A PTAS for the weighted 2-interval pattern problem over the preceding-and-crossing model. In proceedings of the First International Conference on Combinatorial Optimization and Applications (COCOA 2007), Xi’an, China, pages 378–387, 2007.
  • [8] Jihong Ren, Baharak Rastegari, Anne Condon, and Holger H. Hoos. HotKnots: Heuristic prediction of RNA secondary structure including pseudoknots. RNA, 11:1494–1504, 2005.
  • [9] Elena Rivas and Sean R. Eddy. A dynamic programming algorithm for RNA structure prediction including pseudoknots. J. of Molecular Biology, 285(5):2053–2068, 1999.
  • [10] Jianhua Ruan, Gary D. Stormo, and Weixiong Zhang. An iterated loop matching approach to the prediction of RNA secondary structures with pseudoknots. Bioinformatics, 20(1):58–66, 2004.
  • [11] Stéphane Vialette. Pattern matching problems over 2-interval sets. In Alberto Apostolico and Masayuki Takeda, editors, Combinatorial Pattern Matching, 13th Annual Symposium, CPM 2002, Fukuoka, Japan, July 3-5, 2002, Proceedings, volume 2373 of Lecture Notes in Computer Science, pages 53–63. Springer, 2002.
  • [12] Stéphane Vialette. On the computational complexity of 2-interval pattern matching problems. Theor. Comput. Sci., 312(2-3):223–249, 2004.
  • [13] Stéphane Vialette. Two-interval pattern problems. In Encyclopedia of Algorithms - 2008 Edition. 2008.
  • [14] Douglas B. West and David B. Shmoys. Recognizing graphs with fixed interval number is NP-complete. Discrete Applied Mathematics, 8(3):295–305, 1984.
  • [15] Jizhen Zhao, Russell L. Malmberg, and Liming Cai. Rapid ab initio RNA folding including pseudoknots via graph tree decomposition. In proceedings of the sixth International Workshop on Algorithms in Bioinformatics (WABI 2006), Zurich, Switzerland, pages 262–273, 2006.