 # On Cycle Transversals and Their Connected Variants in the Absence of a Small Linear Forest

A graph is H-free if it contains no induced subgraph isomorphic to H. We prove new complexity results for the two classical cycle transversal problems Feedback Vertex Set and Odd Cycle Transversal by showing that they can be solved in polynomial time on (sP_1+P_3)-free graphs for every integer s≥ 1. We show the same result for the variants Connected Feedback Vertex Set and Connected Odd Cycle Transversal. We also prove that the latter two problems are polynomial-time solvable on cographs; this was already known for Feedback Vertex Set and Odd Cycle Transversal. We complement these results by proving that Odd Cycle Transversal and Connected Odd Cycle Transversal are NP-complete on (P_2+P_5,P_6)-free graphs.


## 1 Introduction

Graph transversal problems play a central role in Theoretical Computer Science. To define the notion of a graph transversal, let ℋ be a family of graphs, G=(V,E) be a graph and S⊆V be a subset of vertices of G. The graph G−S is obtained from G by removing all vertices of S and all edges incident to vertices in S. We say that S is an ℋ-transversal of G if G−S is ℋ-free, that is, if G−S contains no induced subgraph isomorphic to a graph of ℋ. In other words, S intersects every induced copy of every graph of ℋ in G. Let C_r and P_r denote the cycle and path on r vertices, respectively. Then S is a vertex cover, feedback vertex set, or odd cycle transversal if S is an ℋ-transversal for, respectively, ℋ={P_2} (that is, G−S is edgeless), ℋ={C_3,C_4,…} (that is, G−S is a forest), or ℋ={C_3,C_5,C_7,…} (that is, G−S is bipartite).
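The three definitions above can be checked directly on small graphs. The following is a minimal sketch, assuming a graph is stored as a dictionary mapping each vertex to its set of neighbours; the function names and the helper `cycle` are illustrative choices, not notation from the paper.

```python
def remove_vertices(adj, S):
    """Return the graph obtained by deleting the vertex set S."""
    S = set(S)
    return {v: {u for u in nbrs if u not in S}
            for v, nbrs in adj.items() if v not in S}

def is_edgeless(adj):
    return all(len(nbrs) == 0 for nbrs in adj.values())

def is_forest(adj):
    # Depth-first search; meeting an already seen vertex that is not
    # the parent means the (simple) graph contains a cycle.
    seen = set()
    for root in adj:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, None)]
        while stack:
            v, parent = stack.pop()
            for u in adj[v]:
                if u == parent:
                    continue
                if u in seen:
                    return False
                seen.add(u)
                stack.append((u, v))
    return True

def is_bipartite(adj):
    # Greedy 2-colouring of every connected component.
    colour = {}
    for root in adj:
        if root in colour:
            continue
        colour[root] = 0
        stack = [root]
        while stack:
            v = stack.pop()
            for u in adj[v]:
                if u not in colour:
                    colour[u] = 1 - colour[v]
                    stack.append(u)
                elif colour[u] == colour[v]:
                    return False
    return True

def is_vertex_cover(adj, S):
    return is_edgeless(remove_vertices(adj, S))

def is_feedback_vertex_set(adj, S):
    return is_forest(remove_vertices(adj, S))

def is_odd_cycle_transversal(adj, S):
    return is_bipartite(remove_vertices(adj, S))

def cycle(n):
    """Hypothetical helper: the cycle on vertices 0..n-1 (n >= 3)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
```

For the 5-vertex cycle, deleting a single vertex leaves a path, so a single vertex is a feedback vertex set and an odd cycle transversal but not a vertex cover.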

Usually the goal is to find a transversal of minimum size in some given graph. In this paper we focus on the decision problems corresponding to the three transversals defined above. These are the Vertex Cover, Feedback Vertex Set and Odd Cycle Transversal problems, which are to decide whether a given graph has a vertex cover, feedback vertex set or odd cycle transversal, respectively, of size at most k for some given positive integer k. Each of these three problems is well studied and is well known to be NP-complete.

We may add further constraints to a transversal. In particular, we may require a transversal of a graph  to be connected, that is, to induce a connected subgraph of . The corresponding decision problems for the three above transversals are then called Connected Vertex Cover, Connected Feedback Vertex Set and Connected Odd Cycle Transversal, respectively.

Garey and Johnson  proved that Connected Vertex Cover is NP-complete even on planar graphs of maximum degree  (see, for example, [14, 31, 36] for NP-completeness results for other graph classes). Grigoriev and Sitters  proved that Connected Feedback Vertex Set is NP-complete even on planar graphs with maximum degree . More recently, Chiarelli et al.  proved that Connected Odd Cycle Transversal is NP-complete even on graphs of arbitrarily large girth and on line graphs.

As all three decision problems and their connected variants are NP-complete, we can consider how to restrict the input to some special graph class in order to achieve tractability. Note that this approach is in line with the aforementioned results in the literature, where NP-completeness was proven on special graph classes. It is also in line with, for instance, polynomial-time results for Connected Vertex Cover by Escoffier, Gourvès and Monnot  (for chordal graphs) and Ueno, Kajitani and Gotoh  (for graphs of maximum degree at most  and trees).

Just as in most of these papers, we consider hereditary graph classes, that is, graph classes closed under vertex deletion. Hereditary graph classes form a rich framework that captures many well-studied graph classes. It is not difficult to see that every hereditary graph class  can be characterized by a (possibly infinite) set  of forbidden induced subgraphs. If , say , then  is said to be monogenic, and every graph is said to be -free. Considering monogenic graph classes can be seen as a natural first step for increasing our knowledge of the complexity of an NP-complete problem in a systematic way. Hence, we consider the following research question:

How does the structure of a graph  influence the computational complexity of a graph transversal problem for input graphs that are -free?

Note that different graph transversal problems may behave differently on some class of -free graphs. However, the general strategy for obtaining complexity results is to first try to prove that the restriction to -free graphs is NP-complete whenever  contains a cycle or the claw (the 4-vertex star). This is usually done by showing, respectively, that the problem is NP-complete on graphs of arbitrarily large girth (length of a shortest cycle) and on line graphs, which form a subclass of claw-free graphs. If this is the case, then we are left to consider the case when  does not contain a cycle, implying that  is a forest, and does not contain a claw either, implying that  is a linear forest, that is, the disjoint union of one or more paths.

### 1.1 The Graph H Contains a Cycle or Claw

It follows from Poljak’s construction  that Vertex Cover is NP-complete on graphs of arbitrarily large girth. Hence, Vertex Cover is NP-complete on -free graphs if  contains a cycle. However, Vertex Cover becomes polynomial-time solvable when restricted to claw-free graphs [25, 32]. In contrast, the other five problems Connected Vertex Cover, (Connected) Feedback Vertex Set and (Connected) Odd Cycle Transversal are all NP-complete on graphs of arbitrarily large girth and on line graphs; see Table 1. Hence, for these five problems, it remains to consider only the case when  is a linear forest.

### 1.2 The Graph H Is a Linear Forest

In this paper, we focus on proving new complexity results for Feedback Vertex Set, Connected Feedback Vertex Set, Odd Cycle Transversal and Connected Odd Cycle Transversal on -free graphs. It follows from Section 1.1 that we may assume that  is a linear forest. Below we first discuss the known polynomial-time solvable cases. As we will use algorithms for Vertex Cover and Connected Vertex Cover as subroutines for our new algorithms, we include these two problems in our discussion.

For every , Vertex Cover (by combining the results of [1, 34]) and Connected Vertex Cover  are polynomial-time solvable on -free graphs. (Footnote 1: The graph G+H is the disjoint union of graphs G and H, and sG is the disjoint union of s copies of G; see Section 2.) Moreover, Vertex Cover is also polynomial-time solvable on -free graphs, for every  , as is the case for Connected Vertex Cover on -free graphs . Their complexity on -free graphs is unknown for and , respectively.

Both Feedback Vertex Set and Odd Cycle Transversal are polynomial-time solvable on permutation graphs , and thus on -free graphs. Recently, Okrasa and Rzążewski  proved that Odd Cycle Transversal is NP-complete on -free graphs. A small modification of their construction yields the same result for Connected Odd Cycle Transversal. The complexity of Feedback Vertex Set and Connected Feedback Vertex Set is unknown when restricted to -free graphs for . For every , both problems and their connected variants are polynomial-time solvable on -free graphs , using the price of connectivity for feedback vertex set [2, 21]. (Footnote 2: The price of connectivity concept was introduced by Cardinal and Levy  for vertex cover; see also, for example, [6, 7, 8].)

### 1.3 Our Results

In Section 3 we prove that Connected Feedback Vertex Set and Connected Odd Cycle Transversal are polynomial-time solvable on -free graphs, just as is the case for Feedback Vertex Set and Odd Cycle Transversal. In Section 4 we prove that for every , these four problems are all polynomial-time solvable on -free graphs; see also Table 1. Finally, in Section 5, we show that Odd Cycle Transversal and Connected Odd Cycle Transversal are NP-complete on -free graphs, that is, graphs that are both -free and -free.

To prove our polynomial-time results, we rely on two proof ingredients. The first one is that we use known algorithms for Vertex Cover and Connected Vertex Cover restricted to -free graphs as subroutines in our new algorithms. The second is that we consider the connected variant of the transversal problems in a more general form. For Connected Vertex Cover this variant is defined as follows:

Connected Vertex Cover Extension

Instance: a graph G=(V,E), a subset W⊆V and a positive integer k.
Question: does G have a connected vertex cover S_W with W⊆S_W and |S_W|≤k?

Note that Connected Vertex Cover Extension becomes the original problem if W=∅. We define the problems Connected Feedback Vertex Set Extension and Connected Odd Cycle Transversal Extension analogously. We will prove all our results for connected feedback vertex sets and connected odd cycle transversals for the extension versions. These extension versions will serve as auxiliary problems for some of our inductive arguments, but this approach also leads to slightly stronger results.

###### Remark 1

For any connected extension variant of these problems on -transversals, we may assume that the input graph  is connected. If it is not, then either all but at most one connected component of  is -free and does not intersect , in which case it need not be considered, or the answer is immediately no. It is easy to check -freeness for the three problems we consider.

###### Remark 2

Note that one could also define extension versions for any original transversal problem (that is, where there is no requirement for the transversal to be connected). However, such extension versions will be polynomially equivalent. Indeed, we can solve the extension version on the input by considering the original problem on the input and adding  to the solution. However, due to the connectivity condition, we cannot use this approach for the connected variants.
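The reduction in Remark 2 can be illustrated for Vertex Cover. The sketch below assumes that the elided inputs are a graph G with a required subset W on one side and the graph G−W on the other; the brute-force solver merely stands in for whichever algorithm solves the original problem, and all names are hypothetical.

```python
from itertools import combinations

def remove_vertices(adj, S):
    """Return the graph obtained by deleting the vertex set S."""
    S = set(S)
    return {v: {u for u in nbrs if u not in S}
            for v, nbrs in adj.items() if v not in S}

def is_vertex_cover(adj, S):
    S = set(S)
    return all(v in S or u in S for v in adj for u in adj[v])

def min_vertex_cover(adj):
    # Exponential-time placeholder for a solver of the original problem.
    vertices = list(adj)
    for k in range(len(vertices) + 1):
        for S in combinations(vertices, k):
            if is_vertex_cover(adj, S):
                return set(S)

def min_vertex_cover_extension(adj, W):
    # Solve the original problem on G - W, then add W back.
    W = set(W)
    return W | min_vertex_cover(remove_vertices(adj, W))
```

Nothing in this sketch controls whether the returned set induces a connected subgraph, which is why the same shortcut is not available for the connected variants.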

###### Remark 3

It is known that Vertex Cover is polynomial-time solvable on -free graphs whenever this is the case on -free graphs. This follows from a well-known observation, see, for example, : one can solve the complementary problem of finding a maximum independent set in a -free graph by solving this problem on each -free graph obtained by removing a vertex and all of its neighbours. However, this trick does not work for Connected Vertex Cover. Moreover, it does not work for Feedback Vertex Set and Odd Cycle Transversal and their connected variants either.
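The observation in Remark 3 amounts to guessing one vertex of a maximum independent set and deleting its closed neighbourhood. A minimal sketch follows, assuming the graph is a dictionary of neighbour sets; `base_solver` is a hypothetical stand-in for the algorithm available on the smaller class, and defaults here to brute force.

```python
from itertools import combinations

def remove_vertices(adj, S):
    """Return the graph obtained by deleting the vertex set S."""
    S = set(S)
    return {v: {u for u in nbrs if u not in S}
            for v, nbrs in adj.items() if v not in S}

def brute_force_mis(adj):
    # Exponential-time placeholder for the solver on the smaller class.
    vertices = list(adj)
    for k in range(len(vertices), 0, -1):
        for S in combinations(vertices, k):
            if all(u not in adj[v] for u, v in combinations(S, 2)):
                return set(S)
    return set()

def mis_by_vertex_guessing(adj, base_solver=brute_force_mis):
    best = set()
    for v in adj:
        # Guess that v is in the solution; its neighbours are then excluded,
        # so only the graph without v and its neighbours remains.
        rest = remove_vertices(adj, {v} | adj[v])
        candidate = {v} | base_solver(rest)
        if len(candidate) > len(best):
            best = candidate
    return best
```

The outer loop adds only a linear factor, so the running time stays polynomial whenever `base_solver` is polynomial on the graphs obtained after the deletion.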

## 2 Preliminaries

Let be a graph. For a set , we write  to denote the subgraph of  induced by . We say that  is connected if  is connected. We write to denote the graph . A subset  is a dominating set of  if every vertex of is adjacent to at least one vertex of . An edge  of a graph is dominating if is a dominating set. The complement of  is the graph . The neighbourhood of a vertex is the set and for , we let . We omit the subscript when there is no ambiguity. We denote the degree of a vertex by .

Let be a graph and let . Then  is a clique if the vertices of  are pairwise adjacent and an independent set if the vertices of  are pairwise non-adjacent. A graph is complete if its vertex set is a clique. We let  denote the complete graph on  vertices. Let with . Then  is complete to  if every vertex of  is adjacent to every vertex of , and  is anti-complete to  if there are no edges between  and . In the first case, we also say that  is complete to  and in the second case anti-complete to .

A graph is bipartite if its vertex set can be partitioned into at most two independent sets. A bipartite graph is complete bipartite if its vertex set can be partitioned into two independent sets  and  such that is complete to . If or has size , the complete bipartite graph is said to be a star. Note that every edge of a complete bipartite graph is dominating.

Let  and  be two vertex-disjoint graphs. The union operation creates the disjoint union of  and , that is, the graph with vertex set and edge set . We denote the disjoint union of  copies of  by . The join operation adds an edge between every vertex of  and every vertex of . A graph  is a cograph if  can be generated from  by a sequence of join and union operations. A graph is a cograph if and only if it is -free (see, for example, ).
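The union and join operations, and the characterization of cographs by a forbidden induced 4-vertex path, can be tried out on small examples. This is a brute-force sketch under the assumption that graphs are dictionaries of neighbour sets with disjoint vertex labels; it is meant for illustration only and is not an efficient recognition algorithm.

```python
from itertools import combinations, permutations

def k1(label):
    """The one-vertex graph on the given label."""
    return {label: set()}

def union(g, h):
    # Disjoint union; the vertex labels of g and h are assumed distinct.
    out = {v: set(nbrs) for v, nbrs in g.items()}
    out.update({v: set(nbrs) for v, nbrs in h.items()})
    return out

def join(g, h):
    # Disjoint union plus every edge between g and h.
    out = union(g, h)
    for v in g:
        for u in h:
            out[v].add(u)
            out[u].add(v)
    return out

def has_induced_p4(adj):
    # Try every ordered 4-tuple a-b-c-d as a candidate induced path.
    for quad in combinations(adj, 4):
        for a, b, c, d in permutations(quad):
            if (b in adj[a] and c in adj[b] and d in adj[c]
                    and c not in adj[a] and d not in adj[a]
                    and d not in adj[b]):
                return True
    return False

# The 4-cycle arises as the join of two edgeless graphs on two vertices.
c4 = join(union(k1(0), k1(2)), union(k1(1), k1(3)))
p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

Here `has_induced_p4(c4)` is false, as expected for a graph built from joins and unions, while `has_induced_p4(p4)` is true.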

The following lemma is well known, but we include a short proof for completeness.

###### Lemma 1

Every connected -free graph on at least two vertices has a spanning complete bipartite subgraph which can be found in polynomial time.

###### Proof

Let  be a connected -free graph on at least two vertices. Then is the join of two graphs  and . Hence, has a spanning complete bipartite subgraph with partition classes  and . Note that this implies that  is disconnected. In order to find a (not necessarily unique) spanning complete bipartite subgraph of  with partition classes  and  in polynomial time, we put the vertices of one connected component of  in  and all the other vertices of  in .∎
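The procedure in this proof can be sketched as follows: compute the connected components of the complement, place one component on one side and all remaining vertices on the other. The function names are illustrative, and the quadratic-time complement search is an assumption made for brevity.

```python
def complement_components(adj):
    """Connected components of the complement of the given graph."""
    unseen = set(adj)
    comps = []
    while unseen:
        root = unseen.pop()
        comp = {root}
        stack = [root]
        while stack:
            v = stack.pop()
            # Unseen vertices that are non-adjacent to v in the graph
            # are exactly the unseen neighbours of v in the complement.
            new = {u for u in unseen if u not in adj[v]}
            unseen -= new
            comp |= new
            stack.extend(new)
        comps.append(comp)
    return comps

def spanning_complete_bipartite(adj):
    """Return partition classes (X, Y) with X complete to Y, or None
    if the complement is connected (so no such split exists)."""
    comps = complement_components(adj)
    if len(comps) < 2:
        return None
    X = comps[0]
    Y = set(adj) - X
    return X, Y
```

Vertices in different components of the complement are non-adjacent in the complement, hence adjacent in the graph, which is why every vertex of `X` is adjacent to every vertex of `Y`.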

Grzesik et al.  gave a polynomial-time algorithm for finding a maximum independent set of a -free graph. As the complement of every independent set  of a graph  is a vertex cover, their result implies that Vertex Cover is polynomial-time solvable on -free graphs. Using the folklore trick mentioned in Remark 3 (see also, for example, [24, 27]) their result can also be formulated as follows.

###### Theorem 2.1 ()

For every , Vertex Cover can be solved in polynomial time on -free graphs.

We recall also that Connected Vertex Cover is polynomial-time solvable on -free graphs . We will need the extension version of this result. Its proof is based on a straightforward adaptation of the proof for Connected Vertex Cover on -free graphs . (Footnote 3: See Appendix 0.A, where we include a proof for reviewing purposes.)

###### Theorem 2.2 ()

For every , Connected Vertex Cover Extension can be solved in polynomial time on )-free graphs.

## 3 The Case H=P4

Recall that Brandstädt and Kratsch  proved that Feedback Vertex Set and Odd Cycle Transversal can be solved in polynomial time on permutation graphs, which form a superclass of the class of -free graphs. Hence, we obtain the following proposition.

###### Proposition 1 ()

Feedback Vertex Set and Odd Cycle Transversal can be solved in polynomial time on -free graphs.

In this section, we prove that the (extension versions of the) connected variants of Feedback Vertex Set and Odd Cycle Transversal are also polynomial-time solvable on -free graphs. We make use of Proposition 1 in the proofs.

###### Theorem 3.1

Connected Feedback Vertex Set Extension can be solved in polynomial time on -free graphs.

###### Proof

Let be a -free graph on  vertices and let  be a subset of . By Remark 1, we may assume that  is connected. By Lemma 1, in polynomial time we can find a spanning complete bipartite subgraph , and we note that, by definition, every edge in  is dominating. Below, in Step 1, in polynomial time we compute a smallest connected feedback vertex set of  that contains  and intersects both  and . In Step 2, in polynomial time we compute a smallest connected feedback vertex set of  that contains and that is a subset of either  or  (if such a set exists). Then the smallest set found is a smallest connected feedback vertex set of  that contains .

Step 1. Compute a smallest connected feedback vertex set  of such that , and .
We perform Step 1 as follows. Consider two vertices and . We shall describe how to find a smallest connected feedback vertex set of that contains . We find a smallest feedback vertex set  in . As is -free, this takes polynomial time by Proposition 1. Then is a smallest feedback vertex set of that contains and is connected, since  is a dominating edge. By repeating this polynomial-time procedure for all  possible choices of  and , we will find  in polynomial time.

Step 2. Compute a smallest connected feedback vertex set  of such that or .
For Step 2 we describe only the case, as the case is symmetric. Thus we may assume that , otherwise no such set exists. Clearly, we may also assume that  contains no cycles. If  contains an edge it follows that , otherwise would contain a triangle. Suppose instead that  is an independent set. If , then must be an independent set, otherwise contains a triangle. So  is a smallest connected vertex cover of  that contains . As is -free, we can find such an  in polynomial time by Theorem 2.2. If , then , as otherwise contains a -cycle. Thus, we check, in polynomial time, if there exists a vertex , such that is connected. If so, .∎

###### Theorem 3.2

Connected Odd Cycle Transversal Extension can be solved in polynomial time on -free graphs.

###### Proof

We only provide an outline, as the proof follows that of Theorem 3.1. We perform the same two steps. In Step 1, we need to find a smallest odd cycle transversal  in and can again apply Proposition 1. In Step 2, we again note that if  contains an edge, then . Suppose that  is an independent set. Then contains no odd cycles if and only if is independent, so  is a smallest connected vertex cover of  that contains . (That is, the case from the proof of Theorem 3.1 can be used for all values of , as we are no longer concerned with whether might contain cycles of even length.)∎

## 4 The Case H=sP1+P3

In this section, we will prove that Feedback Vertex Set and Odd Cycle Transversal and their connected variants can be solved in polynomial time on -free graphs. We need three structural results. First, let us define a function  on the non-negative integers by . We will use this function  throughout the remainder of this section, starting with the following lemma.

###### Lemma 2

Let be an integer. Let  be a bipartite -free graph. If  has a connected component on at least  vertices, then there are at most  other connected components of  and each of them is on at most two vertices.

###### Proof

First note that the case of the lemma is trivially true, as every connected component of a bipartite -free graph has at most two vertices.

Suppose, for contradiction, that  has a connected component  on at least vertices and a connected component  on at least three vertices. As  is bipartite and contains at least vertices, contains an independent set of  vertices that induce . As  is bipartite and contains at least three vertices, has a vertex  of degree at least , and so  and two of its neighbours induce a . Thus  is not -free, a contradiction.

Similarly, if  contains a connected component  on at least vertices, then this component contains an induced . Since  is -free, can contain at most  connected components other than . ∎

The internal vertices and leaves of a tree are the vertices of degree at least  and degree , respectively.

###### Lemma 3

Let be an integer. Let  be an -free tree. Then  has at most  internal vertices.

###### Proof

Let  be the set of internal vertices of . Suppose that . We will show that this leads to a contradiction. As a path with at least internal vertices contains an induced , we may assume that  is not a path and so has at least three leaves. Hence .

Let  and  be the two bipartition sets of , and assume without loss of generality that . For , let  and  be the leaves and internal vertices of  that belong to . If there is a vertex in  of degree at least  that is anti-complete to a set of  vertices of , then  contains an induced , a contradiction. Therefore we may assume that every vertex of  either has degree at least or is in . Then

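$$
\begin{aligned}
|X|+|U_Y|+|L_Y|-1 &= |X|+|Y|-1 = |V(T)|-1 = |E(T)| = \sum_{v\in Y}\deg(v) \\
&\ge \sum_{v\in U_Y}\bigl(|X|-s+1\bigr)+|L_Y| = \bigl(|X|-s+1\bigr)|U_Y|+|L_Y| \\
&= |X||U_Y|-s|U_Y|+|U_Y|+|L_Y|.
\end{aligned}
$$

Thus we have $|X|-1 \ge (|X|-s)\,|U_Y|$ and we rearrange to see that

$$
|U_Y| \le \frac{|X|-1}{|X|-s} = 1+\frac{s-1}{|X|-s}.
$$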

Since , we have that . First suppose . Then and , or and . Both cases contradict the assumption that  has at least vertices. Now suppose . Then, by our assumption that , we have that and so . Now it is easy to find an induced (see Figure 1), and this contradiction completes the proof.∎

The bound of  in Lemma 3 is not tight but, as we shall see later, it suffices for our purposes.

###### Lemma 4

Let be an integer. Let be a connected -free graph, and let be a set of vertices in . Then there is a set of vertices in such that is connected and .

###### Proof

If  is connected, then let . Otherwise, since  cannot now be a complete graph, it contains an induced path  on three vertices in . The number of connected components of  that do not contain a vertex that is either in  or adjacent to a vertex of  in  is at most , otherwise  contains an induced . Let  contain the vertices of  and the internal vertices of shortest paths in  from  to each set of vertices that induces a connected component of . As at most of these shortest paths have more than zero internal vertices, and as each contains at most  internal vertices (any longer path contains an induced ), it follows that . As  is connected, the lemma is proved. ∎

We now prove our four results. For the connected variants, we consider the more general extension versions.

###### Theorem 4.1

For every , Feedback Vertex Set can be solved in polynomial time on -free graphs.

###### Proof

Let be an integer, and let be an -free graph. We must show how to find a smallest feedback vertex set of . We will in fact show how to find a largest induced forest of , the complement of a smallest feedback vertex set. The proof is by induction on . If , then we can use Proposition 1. We now assume that and that we have a polynomial-time algorithm for finding a largest induced forest in -free graphs. Our algorithm performs the following two steps in polynomial time. Together, these two steps cover all possibilities.

Step 1. Compute a largest induced forest  such that every connected component of  has at least  vertices.
By Lemma 2 we know that  will be connected, and so by Lemma 3 will be a tree with at most  internal vertices. We consider every possible choice  of a non-empty set of at most  vertices. There are  choices. If  induces a tree, we will find a largest induced tree whose internal vertices all belong to . This can be found by adding to  the largest possible set of vertices that are independent and belong to the set  of vertices in that each have exactly one neighbour in . That is, we need a largest independent set in  and, by Theorem 2.1, such a set can be found in polynomial time.
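Step 1 can be sketched as an enumeration over small candidate sets of internal vertices, followed by an independent set computation on the possible leaves. In the sketch below, `max_core` is a hypothetical parameter standing in for the bound on the number of internal vertices, and the brute-force independent set routine is only a placeholder for the polynomial-time algorithm of Theorem 2.1.

```python
from itertools import combinations

def induces_tree(adj, S):
    """True if S is non-empty and induces a connected acyclic graph."""
    S = set(S)
    if not S:
        return False
    edges = sum(1 for v in S for u in adj[v] if u in S) // 2
    if edges != len(S) - 1:
        return False
    start = next(iter(S))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u in S and u not in seen:
                seen.add(u)
                stack.append(u)
    return seen == S

def largest_independent_subset(adj, candidates):
    # Exponential-time placeholder for the independent set subroutine.
    candidates = list(candidates)
    for k in range(len(candidates), 0, -1):
        for S in combinations(candidates, k):
            if all(u not in adj[v] for u, v in combinations(S, 2)):
                return set(S)
    return set()

def largest_tree_with_small_core(adj, max_core):
    best = set()
    for k in range(1, max_core + 1):
        for core in combinations(adj, k):
            core = set(core)
            if not induces_tree(adj, core):
                continue
            # Possible leaves: exactly one neighbour in the core.
            leaves = [v for v in adj
                      if v not in core and len(adj[v] & core) == 1]
            tree = core | largest_independent_subset(adj, leaves)
            if len(tree) > len(best):
                best = tree
    return best
```

Each returned set induces a tree: the core is a tree, and every added vertex is attached to it by exactly one edge and has no edge to any other added vertex.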

Step 2. Compute a largest induced forest  such that  has a connected component with at most vertices.
We consider every possible choice of a non-empty set  of at most vertices and discard those that do not induce a tree. There are choices for . Let , and let . Then  is -free. Thus we can find a largest induced forest  of  in polynomial time and is a largest induced forest of  among those that have  as a connected component. ∎

###### Theorem 4.2

For every , Connected Feedback Vertex Set Extension can be solved in polynomial time on -free graphs.

###### Proof

There are similarities to the proof of Theorem 4.1, but more arguments are needed. Let be an integer, let be a connected -free graph and let  be a subset of . We must show how to find a smallest connected feedback vertex set of  that contains  in polynomial time. We show how to solve the complementary problem in polynomial time: how to find a largest induced forest  of  that does not include any vertex of  and is connected. We will say that an induced forest  is good if it has these two properties.

Our algorithm performs the following three steps in polynomial time. Together, these three steps cover all possibilities.

Step 1. Compute a largest good induced forest  such that there is a connected component of  that has at least  vertices.
By Lemma 2 we know that  has exactly one connected component on at least  vertices and there are at most other connected components of , each on at most two vertices. By Lemma 3, the connected component on at least  vertices has at most  internal vertices. We consider  choices of a non-empty set  of at most  vertices that induces a tree and a set  of at most vertices that induces a disjoint union of vertices and edges such that does not intersect , is disjoint from  and no vertex of  has a neighbour in . Let  be the set of vertices that each have exactly one neighbour in  and no neighbour in , but do not belong to . We then add to  the largest possible set  of vertices that are independent and belong to the set  such that is connected. This is achieved by taking the complement of the smallest connected vertex cover of  that contains . By Theorem 2.2, this can be done in polynomial time.

Step 2. Compute a largest good induced forest  such that  has at most connected components and each connected component has at most vertices.
Since the number of vertices in  is bounded by the constant , we can simply check all sets containing at most that many vertices to see if they induce such a good forest.

Step 3. Compute a largest good induced forest  such that  has at least  connected components and each connected component has at most vertices.
We consider choices of a non-empty set  of at most vertices. We reject  unless  is a good induced forest on  connected components with no connected component of more than vertices. Assuming our choice of  is correct, the connected components of  will become connected components of .

Let and note that no vertex of  is in . If is a good forest, then we are done. Otherwise we consider every set  of at most vertices of such that is connected; see also Figure 2. We note that if there is a largest induced forest  such that the connected components of  are also connected components of , then Lemma 4 applied to implies that such a set  exists.

Let . If is a forest, then we are done. Otherwise note that is the disjoint union of one or more complete graphs: cannot contain an induced , as it is anti-complete to  which contains an induced .

As  is connected, each of the complete graphs in contains at least one vertex that is adjacent to some vertex of . Hence in polynomial time we can find a set  of vertices containing all but vertices from each of the complete graphs  in such a way that is connected. Then is a largest good induced forest that contains  and no vertex of .

After considering each of the choices for , in polynomial time we find a largest good induced forest that contains  and no vertex of . After considering each of the choices for , we find in polynomial time a largest good induced forest that has at least  connected components, each with at most vertices. ∎

###### Theorem 4.3

For every , Odd Cycle Transversal can be solved in polynomial time on -free graphs.

###### Proof

Let be an integer, and let be an -free graph. We must describe how to find a smallest odd cycle transversal of . If , then we can use Proposition 1. We now assume that and use induction. We will in fact describe how to solve the complementary problem and find a largest induced bipartite subgraph of . Our algorithm performs two steps in polynomial time, which together cover all possibilities.

Step 1. Compute a largest induced bipartite subgraph  such that every connected component of  has at least  vertices.
By Lemma 2, we know that  will be connected. Hence, has a unique bipartition, which we denote . We first find a largest induced bipartite subgraph  that is a star: we consider each vertex  and find a largest induced star centred at  by finding a largest independent set in . This can be done in polynomial time by Theorem 2.1.

Next, we find a largest induced bipartite subgraph  that is not a star. We consider each of the  choices of edges  of  and find a largest induced connected bipartite subgraph  such that and and neither  nor  has degree  in  (since  is not a star, it must contain such a pair of vertices). Note that the number of vertices in  non-adjacent to  is at most , otherwise  induces an . Similarly there are at most vertices in  non-adjacent to . We consider each of the  possible pairs of disjoint sets  and , which are each independent sets of size at most such that is anti-complete to . We will find a largest induced bipartite subgraph with partition classes  and  such that and and every vertex in is adjacent to  and every vertex in is adjacent to . That is, we must find a largest independent set in both and ; see Figure 3 for an illustration. This can be done in polynomial time, again by applying Theorem 2.1.

Step 2. Compute a largest induced bipartite subgraph  such that  has a connected component with at most vertices.
We consider each of the  possible choices of a non-empty set  of at most vertices and discard those that do not induce a bipartite graph. We will find the largest  that has  as a connected component. Let , and let . As  is -free, we can find a largest induced bipartite subgraph  of  in polynomial time and is a largest induced bipartite subgraph among those that have  as a connected component. ∎

###### Theorem 4.4

For every , Connected Odd Cycle Transversal Extension can be solved in polynomial time on -free graphs.

###### Proof

Let be an integer, let be a connected -free graph and let  be a subset of . We must describe how to find a smallest connected odd cycle transversal of  that contains . We will solve the complementary problem: how to find a largest induced bipartite graph of  that does not include any vertex of  and whose complement is connected. We will say that an induced bipartite graph  is good if it has these two properties. Our algorithm consists of three steps, which can each be performed in polynomial time and which together cover all the possible cases.

Step 1. Compute a largest good induced bipartite subgraph  such that  has a bipartition in which one set, say , has size . (Note that this includes the case when every connected component of  has at most two vertices and  has at most  connected components.)
We consider choices of an independent set  of at most  vertices of  that does not intersect . We wish to find , the largest possible independent set in such that is connected. By Theorem 2.2, we can do this in polynomial time by computing a minimum connected vertex cover of that contains  and taking its complement (in ).

Step 2. Compute a largest good induced bipartite subgraph  such that  has at least  connected components and each connected component has at most two vertices.
Note that . The algorithm mimics Step 3 of the algorithm in the proof of Theorem 4.2, but checks for a good bipartite graph instead of a good forest.

Step 3. Compute a largest good induced bipartite subgraph  such that there is a connected component of  that has at least three vertices and  has a bipartition with and .
It is in this case that we must do most of the work in proving the theorem, and here we will need ideas beyond those already met in this section.

As  contains a connected component on at least three vertices, it will contain an induced  and so and . We consider choices of disjoint independent sets  and  that each contain vertices of  and do not intersect . If contains an induced , our aim is to compute a largest good induced bipartite graph  with bipartition  such that and ; otherwise we discard the choice of .

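$$
\begin{aligned}
U &= \bigl(N(X')\cap N(Y')\bigr)\cup W,\\
V_X &= N(X')\setminus\bigl(Y'\cup N(Y')\cup W\bigr),\\
V_Y &= N(Y')\setminus\bigl(X'\cup N(X')\cup W\bigr),\\
Z &= V\setminus\bigl(X'\cup Y'\cup N(X')\cup N(Y')\cup W\bigr).
\end{aligned}
$$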
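The four sets defined here are plain set expressions and can be computed directly. The sketch below assumes that the neighbourhood of a vertex set is the union of the neighbourhoods of its members minus the set itself; this reading of the notation, and the function names, are assumptions.

```python
def neighbourhood(adj, S):
    """Vertices outside S that have a neighbour in S (assumed definition)."""
    S = set(S)
    out = set()
    for v in S:
        out |= adj[v]
    return out - S

def classify_vertices(adj, Xp, Yp, W):
    """Return (U, V_X, V_Y, Z) for the chosen sets X', Y' and required set W."""
    Xp, Yp, W = set(Xp), set(Yp), set(W)
    V = set(adj)
    NX = neighbourhood(adj, Xp)
    NY = neighbourhood(adj, Yp)
    U = (NX & NY) | W
    VX = NX - (Yp | NY | W)
    VY = NY - (Xp | NX | W)
    Z = V - (Xp | Yp | NX | NY | W)
    return U, VX, VY, Z

# Example on the path 0-1-2-3-4 with X' = {0, 2}, Y' = {1} and W empty.
p5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
U, VX, VY, Z = classify_vertices(p5, {0, 2}, {1}, set())
```

In this example vertex 3 is adjacent to the chosen set X' only and vertex 4 is adjacent to neither chosen set, so they land in `VX` and `Z`, respectively.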

There are a number of steps where our procedure branches as we consider all possible ways of choosing whether or not to add certain vertices to . Note that assuming our choice of  and  is correct, no vertex of  can be in . If we decide that a vertex will not be in , we will then add it to .

Step 3.1. Reduce  to the empty set.
Notice that  does not contain an independent set on more than vertices otherwise would contain an induced . We consider choices of disjoint independent sets  and  that are each subsets of  and each contain at most vertices. We move the vertices of  and  by adding them to  and , respectively. We move the vertices of by adding them to . If after this process is complete there are vertices in with neighbours in both  and , we move these vertices by adding them to . We note that now:

• is the empty set,

• still contains vertices with neighbours in  but not in ,

• still contains vertices with neighbours in  but not in , and

• contains vertices that will not be in .

So our task is to decide how best to add vertices of  to  and vertices of  to , but first there is another step: as must be connected, and  is a subgraph of , we choose some vertices that will not be in , but will connect together the connected components of . This will not be possible if the vertices of  belong to more than one connected component of . Hence, in that case we discard this choice of .

Step 3.2. Make connected.
We consider choices of sets  of vertices of such that each contains at most vertices. If is connected, we move the vertices of  by adding them to , and so  becomes connected. Note that since all vertices of  are in the same connected component of , Lemma 4 implies that at least one such set  can be found.

Step 3.3. Add vertices from to and from to .
We note that  is -free, as no vertex of  has a neighbour in , , and  is -free. By symmetry, is -free. Thus both  and  are disjoint unions of complete graphs. Note that  can contain at most one vertex from each of these complete graphs. We consider two subcases.
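The structural fact used here is that a graph with no induced three-vertex path is a disjoint union of complete graphs, that is, every connected component is a clique. The following sketch checks this property and returns the cliques. The function names and the dictionary-of-sets representation are illustrative assumptions, and the graph is assumed to be simple.

```python
def connected_components(adj):
    """Return the connected components of a graph given as {vertex: set_of_neighbours}."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        seen.add(s)
        while stack:
            v = stack.pop()
            comp.add(v)
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
        comps.append(comp)
    return comps


def clique_components(adj):
    """Return the components if each one is a clique, otherwise None.

    A return value of None means that some component contains two
    non-adjacent vertices, so the graph has an induced path on three vertices.
    """
    comps = connected_components(adj)
    for comp in comps:
        for v in comp:
            if set(adj[v]) - {v} != comp - {v}:
                return None
    return comps


# A triangle, a single edge and an isolated vertex: three cliques.
union_of_cliques = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2},
    4: {5}, 5: {4},
    6: set(),
}
# A path on three vertices is not a disjoint union of cliques.
path3 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
```

Once the cliques are known, picking at most one vertex from each of them is exactly the choice described in this step.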

Step 3.3a. Compute a largest good induced bipartite subgraph  with bipartition such that , and contains no edges between  and .
As must be connected, each clique of  and  that contains at least two vertices must contain a vertex adjacent to  (otherwise such a set  cannot exist). Thus we can form  from  by adding to  one vertex from each clique of  and form  by adding to  one vertex from each clique of  in such a way that  is connected. (If we do this, it is possible that will contain an edge from  to , but then this solution is at least as large as one where such edges are avoided.)

Step 3.3b. Compute a largest good induced bipartite subgraph  with bipartition such that , and has an edge where , .
We consider choices of an edge , , . Let be a neighbour of  and note that , and induce a in . Therefore  must be complete to all but at most cliques of . By symmetry, must be complete to all but at most cliques of . A clique in  or  is bad if it is not complete to  or , respectively. Note that the cliques containing  and  may be bad. We move  and  to .

We consider choices of a set  of at most vertices that each belong to a distinct bad clique and move each to  or  if they are in  or  respectively. We move the other vertices of the bad cliques to . If the vertices of  are not in the same connected component of , we discard this choice of . We consider choices of sets  of vertices of such that each contains at most vertices. If is connected we move the vertices of  to , so  becomes connected. Since the vertices of  are in the same connected component of , Lemma 4 implies that at least one such set  can be found.

Note that some cliques might have been completely removed from  and  by the choice of . It only remains to pick one vertex from each remaining clique of  and , and to add these vertices to  or , respectively, to finally obtain . As all vertices in these cliques are adjacent to  or , we know that will be connected. ∎

## 5 The Case H=P_6

In this section we prove that Odd Cycle Transversal and Connected Odd Cycle Transversal are NP-complete on (P_2+P_5,P_6)-free graphs. We do this by modifying the construction used in  for proving that these two problems are NP-complete on -free segment graphs.

###### Theorem 5.1

Odd Cycle Transversal and Connected Odd Cycle Transversal are NP-complete on (P_2+P_5,P_6)-free graphs.

###### Proof

Both problems are readily seen to belong to NP. To prove NP-hardness we reduce from Vertex Cover, which is known to be NP-complete . Let be an instance of Vertex Cover. Let  and  be the number of vertices and edges, respectively, in . Let be the vertices of . We construct a graph  from  as follows.

1. For create vertices and . Let and be the sets of, respectively, , , , and vertices.

2. For , add the edges  and  (so we make complete to both and ).

We first claim that the following statements are equivalent:

(i) has a vertex cover of size at most ;

(ii) has an odd cycle transversal of size at most ;

(iii) has a connected odd cycle transversal of size at most .

The implication (iii) ⇒ (ii) is trivial. Below we prove (i) ⇒ (iii) and (ii) ⇒ (i).
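To make the two notions in the equivalence concrete, the sketch below checks whether a vertex set is an odd cycle transversal (its removal leaves a bipartite graph) and, optionally, whether it also induces a connected subgraph. The function names and graph representation are illustrative assumptions, and treating the empty set as connected is a convention chosen here for simplicity.

```python
from collections import deque


def induced(adj, keep):
    """Subgraph induced by the vertex set keep."""
    keep = set(keep)
    return {v: set(adj[v]) & keep for v in keep}


def is_bipartite(adj):
    """Two-colour each component by BFS; fail on a monochromatic edge."""
    colour = {}
    for s in adj:
        if s in colour:
            continue
        colour[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in colour:
                    colour[u] = 1 - colour[v]
                    queue.append(u)
                elif colour[u] == colour[v]:
                    return False
    return True


def is_connected(adj):
    """True if the graph has at most one connected component."""
    if not adj:
        return True  # convention assumed for this sketch
    start = next(iter(adj))
    seen, queue = {start}, deque([start])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return len(seen) == len(adj)


def is_odd_cycle_transversal(adj, S, connected=False):
    """Check that G - S is bipartite and, if requested, that G[S] is connected."""
    S = set(S)
    if not is_bipartite(induced(adj, set(adj) - S)):
        return False
    return is_connected(induced(adj, S)) if connected else True


# Two triangles a-b-c and d-e-f joined by the edge c-d (illustrative only).
two_triangles = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
```

Every set accepted with `connected=True` is also accepted with `connected=False`, which is the content of the trivial implication.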

(i) ⇒ (iii). Suppose that has a vertex cover  of size at most . We define the set

$$
S = \bigcup_{v_i \in Q} \{x_i, y_i\} \cup \bigcup_{v_i \notin Q} \{b_i\}
$$

and observe that