A graph is a -graph, for two hereditary graph properties , if can be partitioned into two sets such that and . We call a -partition of . The -Recognition problem is to recognize whether a given graph is a -graph. This captures a wealth of famous problems, including the recognition of -colorable, bipartite, co-bipartite, and split graphs, and -Vertex Deletion, which asks for a partition such that and has order at most for some given . (The order of a graph is its number of vertices.) In the most interesting (and NP-hard) cases [2, 14, 25], and are both characterized by a (not necessarily finite) set of forbidden connected induced subgraphs. (The restriction to connected graphs is probably necessary for NP-hardness: the recognition of unipolar graphs, where is the set of complete graphs (nonedge-less graphs), can be solved in polynomial time [18, 13, 27, 30].) In other words, and are each closed under the disjoint union of graphs in these cases.
Many such -Recognition problems were shown fixed-parameter tractable by Kanj et al. , for example when is the class of graphs that is a disjoint union of cliques, using parameter . The central algorithmic idea that was employed in  is the pushing process. The algorithm empties the input graph, and adds vertices back one by one while maintaining a valid partition. Since adding a vertex might invalidate a previously valid partition, vertices are pushed from one part of the partition to the other part in the hope of obtaining a valid partition again. A similar algorithmic idea, known as iterative localization, was used earlier by Heggernes et al.  to show the fixed-parameter tractability of computing the cochromatic number of perfect graphs and the stabbing number of disjoint rectangles with axes-parallel lines (using the standard parameters). Iterative localization was also applied in follow-up work related to the cochromatic number .
A crucial ingredient in applying the pushing process is to understand the avalanches caused by this process. For -Recognition, an avalanche is triggered when a vertex is pushed to ; this may imply that several other vertices must be pushed to , which, in turn, triggers the pushing of yet more vertices to , and so on. Similar effects are visible in the aforementioned cochromatic number and rectangle stabbing number problems [21, 24]. The contribution of the previous works [21, 22, 24] was to bound the depth of this process by some function of the parameter, leading to fixed-parameter algorithms. However, such a bound does not provide an answer to the question of which vertices trigger avalanches and their continued rolling, and whether the number of such vertices can somehow be limited.
This question can be naturally formalized in terms of the kernelization complexity of problems to which the pushing process applies. A kernel reduces the size of the graph and thus directly reduces the number of vertices triggering or being affected by avalanches when an algorithm based on the pushing process is applied to the kernelized instance. In previous work, Kolay et al.  studied the kernelization complexity of computing the cochromatic number of a perfect graph , which is the smallest number such that can be partitioned into sets that each induces a clique and sets that each induces an edgeless graph. This problem has a parameterized algorithm using iterative localization (i.e., a pushing process) , but Kolay et al.  showed that, unless , this problem does not admit a polynomial kernel parameterized by . This suggests that, for this problem, one cannot control the number of vertices affected by avalanches. The kernelization complexity of -Recognition, however, has not been studied so far. Hence, it is open whether avalanches can be controlled to affect few vertices in this case.
We study the kernelization complexity of -Recognition through the lens of the pushing process. To this end, we consider the first level above triviality of the problem. When is characterized by a forbidden induced subgraph of order , then -Recognition can be solved in linear time , and thus we focus on the NP-hard case when the forbidden induced subgraph has order [2, 14, 25]. In particular, we let be the class of so-called cluster graphs. These are the graphs that contain no $P_3$, the (simple) path on three vertices, as an induced subgraph, or equivalently, graphs that are disjoint unions of complete graphs. This leads to the following problem:
Input: A graph .
Question: Can be partitioned into such that is a cluster graph and ?
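The equivalence between "no induced $P_3$" and "disjoint union of complete graphs" can be illustrated by a minimal sketch. The adjacency-dictionary representation and the function names `find_induced_p3`, `is_cluster_graph`, and `clusters` are assumptions made for illustration; they do not appear in the paper.

```python
from itertools import combinations

def find_induced_p3(adj):
    """Return (u, v, w) with edges uv and vw but no edge uw, or None."""
    for v, nbrs in adj.items():
        # Two non-adjacent neighbors of v form an induced P3 centered at v.
        for u, w in combinations(list(nbrs), 2):
            if w not in adj[u]:
                return (u, v, w)
    return None

def is_cluster_graph(adj):
    """A graph is a cluster graph iff it has no induced P3."""
    return find_induced_p3(adj) is None

def clusters(adj):
    """Connected components; these are cliques exactly for cluster graphs."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            for w in adj[stack.pop()]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps
```

The check runs in time polynomial in the graph size, which is all that is needed to verify the cluster side of a candidate partition.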
Cluster--Partition generalizes the recognition problem of many graph classes, such as the recognition of monopolar graphs [6, 8, 9, 26] ( is the set of edgeless graphs), -subcolorable graphs [5, 16, 20, 29] ( is the set of cluster graphs), and several others [1, 4, 7]. Unfortunately, Cluster--Partition is NP-hard in these special cases, and in general when is characterized by a set of connected forbidden induced subgraphs [2, 14, 25]. Hence, we consider the number of clusters in the cluster graph as a parameter, and study the pushing process with respect to this parameter.
Our result gives a complete characterization of the kernelization complexity of Cluster--Partition through a deeper understanding of the pushing process. We show that, while for a specific the pushing process can be used to witness a small vertex set of size containing the vertices affected by avalanches, for all other , such a set of polynomial size is unlikely to exist. Formally, we show the following.
Let be a graph property characterized by a (not necessarily finite) set of connected forbidden induced subgraphs. Then unless , Cluster--Partition parameterized by the number of clusters in the cluster graph admits a polynomial kernel if and only if contains a graph of order at most .
The positive result corresponds to the recognition of monopolar graphs. Indeed, the graph properties with forbidden induced subgraphs of order are “being edgeless” and “being nonedge-less”, but the latter is not characterized by connected forbidden induced subgraphs.
The pushing process and a deeper understanding of the avalanches it causes are indeed central to both directions of the above result. In the proof of the positive result, we first perform a set of data reduction rules to identify some vertices that are part of or in any partition of such that is a cluster graph with at most clusters and is edgeless. More importantly, these rules restrict the combinatorial properties of the graph induced by the remaining vertices. With these restrictions, it becomes possible to model the avalanches that occur using a bipartite graph. This graph enables two further reduction rules that lead to the polynomial kernel.
For the negative result, we observe that the bipartite graph constructed in the kernel is closely tied to the deterministic behavior of the pushing process for monopolar graphs: when an edge in is created by pushing a vertex to , the other endpoint of the edge must be pushed to (recall that must become edgeless). This limits the avalanches. However, for more complex properties , such a simple correspondence no longer exists. In particular, when the forbidden induced subgraphs have order at least , pushing a vertex to may create a forbidden induced subgraph in that can be repaired in at least two different ways. Then the pushing process starts to behave nondeterministically, and the avalanches grow beyond control. We exploit this intuition to exclude the existence of a polynomial kernel, unless , by providing a cross-composition.
One might consider two other parameters: the size of a largest cluster in and the size of one of the sides. The size of a largest cluster in will not lead to tractability, as Cluster--Partition is NP-hard on subcubic graphs, even when is the set of edgeless graphs . Thus, we consider the number of vertices in the graph , even for the broader -Recognition problem. We previously proved a general fixed-parameter tractability result in this case . We observe a very general kernelization result:
-Recognition has a kernel of size parameterized by , the maximum size of , when can be characterized by a collection of forbidden induced subgraphs, each of size at most , and is hereditary.
We obtain a better bound in terms of the number of vertices for Cluster--Partition, the restriction of Cluster--Partition to the case when all graphs containing a vertex of degree at least are forbidden induced subgraphs of .
Cluster--Partition parameterized by , the maximum size of , has an -vertex kernel.
We follow standard graph-theoretic notation . Let be a graph. By and we denote the vertex-set and the edge-set of , respectively. Throughout the paper, we use to denote the number of vertices in and to denote its number of edges. We also say that is of order . We assume since isolated vertices can be safely removed in the problems that we consider. For , denotes the subgraph of induced by . For a vertex , and denote the open neighborhood and the closed neighborhood of , respectively. For , we define and , and for a family of subsets , we define and .
We say that a vertex is adjacent to a subset of vertices if is adjacent to at least one vertex in . Similarly, we say that two vertex sets and are adjacent if there exist and that are adjacent. If is any set of vertices in , we write for . For a vertex , we write for .
We say a partition of is a cluster- partition if (1) is a cluster graph and (2) . A monopolar partition of a graph is a partition of into a cluster graph and an independent set.
Input: A graph and an integer .
Question: Does admit a monopolar partition such that the number of clusters in the cluster graph is at most ?
For an instance of Monopolar Recognition, a monopolar partition of is valid if the number of clusters in the cluster graph of the partition is at most . For , we use to denote .
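As a minimal sketch of the definitions above, the following checks whether a given partition is a valid monopolar partition and, for very small graphs only, searches for one by exhaustive enumeration. The names `A`, `B`, and `k` for the cluster side, the independent side, and the bound on the number of clusters are assumptions chosen for illustration, as are the function names; the exhaustive search is exponential and is not the algorithm of the paper.

```python
from itertools import combinations

def components(adj, S):
    """Connected components of the subgraph induced by the vertex set S."""
    S, comps, seen = set(S), [], set()
    for s in S:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            for w in adj[stack.pop()] & S:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps

def is_valid_monopolar_partition(adj, A, B, k):
    """True iff (A, B) partitions V, G[B] is edgeless, and G[A] is a
    cluster graph with at most k clusters."""
    A, B = set(A), set(B)
    if (A & B) or (A | B) != set(adj):
        return False
    if any(adj[v] & B for v in B):
        return False
    comps = components(adj, A)
    # Every component of G[A] must be a clique.
    if any(len(adj[v] & c) != len(c) - 1 for c in comps for v in c):
        return False
    return len(comps) <= k

def brute_force_monopolar(adj, k):
    """Exhaustive search over all candidate sets B (tiny graphs only)."""
    vertices = list(adj)
    for r in range(len(vertices) + 1):
        for B in combinations(vertices, r):
            A = set(vertices) - set(B)
            if is_valid_monopolar_partition(adj, A, B, k):
                return A, set(B)
    return None
```

For instance, the cycle on five vertices admits a valid monopolar partition with two clusters but none with a single cluster.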
A parameterized problem is a tuple , where is a language over some finite alphabet and is a parameterization. For a given instance , we also say is the parameter. A parameterized problem is fixed parameter tractable (FPT), if there exists an algorithm that on input decides if is a yes-instance of , that is, , and that runs in time , where is a computable function independent of . We will denote by fpt-time a running time of the form . A parameterized problem is kernelizable if there exists a polynomial-time reduction that maps an instance of the problem to another instance such that: (1) for some computable function , (2) , and (3) is a yes-instance of the problem if and only if is. The instance is called the kernel of .
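Let $L \subseteq \Sigma^*$ be a language and $(Q, \kappa)$ a parameterized problem, i.e., $Q \subseteq \Sigma^*$ is a language and $\kappa$ a parameterization. An or-cross-composition from $L$ into $(Q, \kappa)$ is a polynomial-time algorithm that, given $t$ instances $x_1, \ldots, x_t$ of $L$, computes an instance $y$ such that $\kappa(y) \le \mathrm{poly}(\max_{i} |x_i| + \log t)$ and $y \in Q$ if and only if $x_i \in L$ for some $i \in \{1, \ldots, t\}$. We have the following: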
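Let $L$ be an NP-hard language and $(Q, \kappa)$ be a parameterized problem. If there is an or-cross-composition from $L$ into $(Q, \kappa)$ and $(Q, \kappa)$ admits a polynomial-size problem kernel, then $\mathrm{NP} \subseteq \mathrm{coNP}/\mathrm{poly}$.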
3 A Polynomial Kernel for Monopolar Recognition Parameterized by the Number of Clusters
The outline of the kernelization algorithm is as follows. First, we compute a decomposition of the input graph into sets of vertex-disjoint maximal cliques which we call a clique decomposition. This decomposition is used and updated throughout the data-reduction procedure. We also maintain sets of vertices that are determined to belong to or . We first apply a sequence of reduction rules whose aim is roughly to bound the number of cliques and the number of edges between the cliques in the decomposition, and to restrict the structure of edges between cliques. Then, we build an auxiliary graph to model how the placement of a vertex in or implies an avalanche of placements of vertices in and . If this avalanche creates too many clusters in , then this determines the placement of certain vertices in or , and triggers another reduction rule. If this reduction rule does not apply anymore, then the size of the auxiliary graph is bounded, which in turn helps bounding the size of the instance.
3.1 Clique Decompositions
Say that a clique is a large clique if , an edge clique if (i.e., is an edge), and a vertex clique if (i.e., consists of a single vertex). Let be an instance of Monopolar Recognition. Suppose that and are subsets of vertices that have been determined to be in and , respectively, in any valid monopolar partition of . We define a decomposition of , referred to as a nice clique decomposition, that partitions this set into vertex-disjoint cliques , , such that the tuple satisfies the following properties:
In the decomposition tuple , the large cliques appear before the edge cliques, and the edge cliques, in turn, appear before the vertex cliques; that is, for each large clique and for each edge or vertex clique we have , and for each edge clique and for each vertex clique we have .
Each clique , , is maximal in ; that is, there does not exist a vertex such that is a clique.
The subgraph of induced by the union of the edge cliques and vertex cliques does not contain any large clique.
The following fact is implied by property (ii) above:
The vertex cliques in a nice clique decomposition form an independent set in .
A nice clique decomposition of can be computed as follows. Let . We check whether contains a clique of size at least three. If this is the case, then we find a maximal clique in , add as a large clique to the decomposition, set and repeat. Otherwise, does not contain any clique of size 3, and we check whether contains an edge clique (i.e., the two endpoints of an edge); if so, we add to the decomposition, set and repeat. If no edge clique exists in , then the remaining vertices in form an independent set, and we add each one of them to the decomposition as a vertex clique. This process can be seen to run in polynomial time, but we will use the following more precise bound.
A nice clique decomposition of can be computed in time.
First, in time, compute a list of all triangles in . Then, label all vertices as free. Let denote the graph . Process the list from head to tail; that is, consider each triangle in the list. If some vertex of the triangle is not labeled as free, then skip this triangle and continue with the next one. If all vertices in this triangle are labeled as free, then compute a maximal clique in containing this triangle. This can be done in time . Add the maximal clique to the decomposition as described above, remove all vertices of the maximal clique from , and remove the free label from all vertices of the maximal clique. Overall this step takes time, since we encounter at most triangles whose vertices are all labeled free. Once all triangles in the list are processed, compute a set of edge cliques in time by computing a maximal matching in . Finally, add all remaining vertices as vertex cliques in time. ∎
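The simple three-phase procedure described before the lemma (maximal cliques grown from triangles, then a maximal matching, then singletons) can be sketched as follows. This is a minimal illustration and does not implement the faster triangle-list variant of the proof; the function names and the adjacency-dictionary input are assumptions.

```python
def find_triangle(adj, R):
    """Return some triangle inside the vertex set R, or None."""
    for u in R:
        for v in adj[u] & R:
            common = adj[u] & adj[v] & R
            if common:
                return u, v, next(iter(common))
    return None

def nice_clique_decomposition(adj, vertices=None):
    """Greedy decomposition: large cliques first, then edge cliques,
    then vertex cliques, each maximal among the vertices still remaining."""
    R = set(adj) if vertices is None else set(vertices)
    large, edge_cliques = [], []
    # Phase 1: while a triangle remains, grow it into a maximal clique.
    while True:
        tri = find_triangle(adj, R)
        if tri is None:
            break
        clique = set(tri)
        for v in R - clique:
            if clique <= adj[v]:  # v is adjacent to every clique member
                clique.add(v)
        large.append(clique)
        R -= clique
    # Phase 2: the rest is triangle-free; take a maximal matching.
    for u in list(R):
        if u not in R:
            continue
        partner = next(iter(adj[u] & R), None)
        if partner is not None:
            edge_cliques.append({u, partner})
            R -= {u, partner}
    # Phase 3: the remaining vertices are pairwise non-adjacent.
    vertex_cliques = [{v} for v in R]
    return large + edge_cliques + vertex_cliques
```

A single pass suffices to make each clique maximal: a vertex rejected during the pass misses some clique member, and the clique only grows afterwards.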
Let be an instance of Monopolar Recognition. We initialize , , and we compute a nice clique decomposition of . We will then apply reduction rules to simplify the instance . During this process, we may identify vertices in to be added to or . At any point in the process, we will maintain a partition of such that (1) and for any valid monopolar partition of , and (2) is a nice clique decomposition of . We call such a partition a normalized partition of .
3.2 Basic Reduction Rules
We now describe our basic set of reduction rules. After the application of a reduction rule, a normalized partition may change as the result of moving vertices from to , and we will need to compute a nice clique decomposition of the resulting (new) set . However, a vertex that has been moved to (resp. ) will remain in (resp. ). When a reduction rule is applied, we assume that no reduction rule preceding it, with respect to the order in which the rules are listed, is applicable.
The following rule is straightforward:
Reduction Rule 3.3.
Let be a normalized partition of . If is not a cluster graph with at most clusters, or is not an independent set, then reject the instance .
The following rule is correct because, for every monopolar partition of , and is an independent set.
Reduction Rule 3.4.
Let be a normalized partition of . If there is a vertex that is adjacent to then set .
The following rule is correct, since for every monopolar partition of :
Reduction Rule 3.5.
Let be a normalized partition of . If there is a vertex that is either (1) adjacent to two clusters in , or (2) adjacent to a cluster in but not to all the vertices , then set .
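The first three rules can be sketched as an exhaustive loop. The sketch assumes the following reading of the rules, reconstructed from the correctness remarks: the rejection rule checks the two sets of already determined vertices, a vertex adjacent to the determined independent-set side is forced to the cluster side, and a vertex adjacent to two determined clusters, or to only part of one, is forced to the independent-set side. The names `A_hat`, `B_hat`, `k`, and the function names are hypothetical.

```python
def components(adj, S):
    """Connected components of the subgraph induced by the vertex set S."""
    S, comps, seen = set(S), [], set()
    for s in S:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            for w in adj[stack.pop()] & S:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps

def apply_basic_rules(adj, A_hat, B_hat, k):
    """Apply the rejection rule and the two forcing rules exhaustively.
    Returns the updated (A_hat, B_hat), or None if the instance is rejected."""
    A_hat, B_hat = set(A_hat), set(B_hat)
    while True:
        cl = components(adj, A_hat)
        # Rejection rule: A_hat must induce at most k cliques,
        # and B_hat must be an independent set.
        cliques_ok = all(len(adj[v] & c) == len(c) - 1 for c in cl for v in c)
        independent = not any(adj[v] & B_hat for v in B_hat)
        if not cliques_ok or len(cl) > k or not independent:
            return None
        undecided = set(adj) - A_hat - B_hat
        # A neighbor of B_hat cannot join the independent side.
        v = next((v for v in undecided if adj[v] & B_hat), None)
        if v is not None:
            A_hat.add(v)
            continue
        # A vertex touching two clusters, or only part of one,
        # cannot join the cluster side.
        def cannot_join_clusters(v):
            touched = [c for c in cl if adj[v] & c]
            return len(touched) >= 2 or (
                len(touched) == 1 and not touched[0] <= adj[v])
        v = next((v for v in undecided if cannot_join_clusters(v)), None)
        if v is None:
            return A_hat, B_hat
        B_hat.add(v)
```

The loop respects the convention that a rule is applied only when no earlier rule is applicable, since every iteration restarts from the rejection check.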
The proof of the following reduction rule is straightforward, after recalling that the vertex cliques induce an independent set in (Fact 3.1), and observing that no two vertices of an independent set can belong to the same cluster in a cluster graph:
Reduction Rule 3.6.
Let be a normalized partition of . If there is a vertex with more than neighbors that are vertex cliques, then set .
The next two reduction rules restrict the number and type of edges incident to large cliques.
Reduction Rule 3.7.
Let be a normalized partition of . If there exists a vertex and a large clique such that , then set .
Since , has at least two neighbors and at least one nonneighbor . If a vertex is in , for any valid monopolar partition of , then since is an independent set, it follows that . In particular, is in , at least one of , say , is in , and is in . But this implies that forms an induced $P_3$ in , contradicting that is a cluster graph. ∎
Reduction Rule 3.8.
Let be a normalized partition of , and let , , be two cliques such that is a large clique and is either a large clique or an edge clique. If there are at least two edges between and then one of the following reductions, considered in the listed order, is applicable:
There are two edges and , where and , such that and . Let be such that (note that exists because ). Set .
. Set .
We first prove that either case (1) or case (2) applies. Suppose that case (1) does not apply, and we show that case (2) does.
By maximality of in (property (ii) in the definition of a nice clique decomposition), no vertex in can be adjacent to all vertices in . It follows from this fact and from the inapplicability of Reduction Rule 3.7 that each vertex in has at most one neighbor in . Since case (1) does not apply, the vertices in that have a neighbor in must all have the same neighbor, which proves that case (2) applies.
Now suppose that case (1) applies; we show the correctness of the reduction rule in this case. Let be any valid monopolar partition of . Since at most one of can be in , at least one of , say , is in . Suppose, to get a contradiction, that . Then both and must be in . By maximality of in , cannot be adjacent to all vertices in . Since Reduction Rule 3.7 is not applicable, must be the only neighbor of in . But then is an induced $P_3$ in , contradicting that is a cluster graph.
Suppose that case (2) applies, and suppose, to get a contradiction, that in some valid monopolar partition of . Since there are at least two edges between and , has at least two neighbors . Again, observe that at least one of , say , must be in . Since , at least one vertex in , say , must be in . Since is the only neighbor of in by the premise of case (2), it follows that is an induced $P_3$ in , contradicting that is a cluster graph. ∎
We can now bound the number of large cliques and edge cliques in yes-instances.
Reduction Rule 3.9.
Let be an instance of Monopolar Recognition, and let be a normalized partition of . If in either the number of large cliques is more than , or the number of large cliques plus the number of edge cliques is more than , then reject the instance .
Let be any monopolar partition of . Since a large clique has size at least 3, at least vertices from must belong to the same cluster in . By Reduction Rule 3.8, the number of edges between any large clique and any other large or edge clique is at most 1. It follows from the aforementioned statements that two vertices from two different large cliques, or from a large clique and an edge clique, must belong to different clusters in . Consequently, if the number of large cliques in is more than , then for any monopolar partition of , the number of clusters in is more than , and hence is a no-instance of Monopolar Recognition.
Suppose now that the number of large cliques in is , and that the number of edge cliques is . From above, for any monopolar partition , no vertex from an edge clique can belong to a cluster in containing a vertex from a large clique. Let and , , be any two edge cliques. Since is an independent set, at least one vertex from each edge clique must be in . By property (iii) of a nice decomposition, no cluster in can contain three vertices from three different edge cliques in . It follows from the aforementioned two statements that the number of clusters in that contain vertices from edge cliques in is at least . Now the set of clusters in containing vertices from large cliques is disjoint from that containing vertices from edge cliques, and hence the number of clusters in is at least . If the number of large cliques plus the number of edge cliques is more than , then , and hence . This means that for any monopolar partition of , the number of clusters in is more than . It follows that is a no-instance of Monopolar Recognition. ∎
Next, we sanitize the connections between already determined clusters in and the remaining cliques in the normalized partition.
Reduction Rule 3.10.
Let be a normalized partition of , let be a cluster in , and let , , be a large clique. If is such that: (1) is the only vertex in that is adjacent to , or (2) is the only vertex in that is not adjacent to , then set .
To prove the correctness of the reduction rule when case (1) holds, suppose that is the only vertex in that is adjacent to . Let be any monopolar partition of . Let be any vertex in that is adjacent to . Since is a large clique, there exists a vertex , with , such that . Since is the only vertex in that is adjacent to , is not adjacent to . Now if were in , then since and hence , would be an induced $P_3$ in , contradicting that is a cluster graph. It follows that for any monopolar partition of .
To prove the correctness of the reduction rule when case (2) holds, suppose that is the only vertex in that is not adjacent to . Let be any monopolar partition of . Since is a large clique, there exists a vertex , with , such that . Since is the only vertex in that is not adjacent to , is adjacent to some vertex . Now if were in , then since and hence , would be an induced $P_3$ in , contradicting that is a cluster graph. It follows that for any monopolar partition of . ∎
Suppose that none of the above reduction rules applies to the instance . Then, the following lemma holds:
Let be a normalized partition of , let be a cluster in , and let , , be a large clique such that is adjacent to . If admits a monopolar partition, then induces a clique in .
Suppose, to get a contradiction, that does not induce a clique, and hence, there exists a vertex such that is not adjacent to some vertex in . Since and are adjacent, there exist vertices and such that and are adjacent. Since Reduction Rule 3.5 is not applicable, is not adjacent to any vertex in , and is adjacent to every vertex in . Since cases (1) and (2) of Reduction Rule 3.10 are not applicable, there exist vertices and in such that is adjacent to and is not adjacent to . Since Reduction Rule 3.5 is not applicable, is adjacent to every vertex in . Now for any monopolar partition of , since is an independent set, at least one vertex is in , and at least one vertex of is in . But then is an induced $P_3$ in , contradicting that is a cluster graph. ∎
The above structure allows us to simplify the instance by shrinking already determined clusters in .
Reduction Rule 3.12.
Let be an instance of Monopolar Recognition, and let the tuple be a normalized partition of . If either (1) contains more than vertices or (2) there exists a cluster in that is not a singleton, then reduce the instance to an instance with constructed as follows. Let , where , , and ; and . That is, is constructed from by introducing new vertices, replacing each cluster in (if any) by a single vertex whose neighborhood is the neighborhood of in plus the new vertices, and keeping the same.
To prove the correctness of the reduction rule, we need to show that is a yes-instance of Monopolar Recognition if and only if is. First, observe that by Reduction Rule 3.4, no vertex in is adjacent to any vertex in .
If , then the reduction rule consists of removing the vertices in from , and replacing them with isolated vertices . Since and no vertex in is adjacent to any vertex in , the vertices in are isolated vertices in . Therefore, the reduction rule in this case consists of replacing the isolated vertices in with isolated vertices that can be safely added to , for any valid monopolar partition of . Hence, the reduction rule is obviously correct in this case.
Assume now that . It is easy to see that if is a yes-instance of Monopolar Recognition then so is . This can be seen as follows. If is a valid monopolar partition of , then the above reduction rules guarantee that , and hence each cluster of must be a subset of a single cluster in . If we (i) remove the vertices in and add vertices to that induce an independent set, and (ii) replace each cluster in by a single vertex connected to the new vertices in and to the vertices of the cluster that belongs to , we still get a valid monopolar partition of .
To prove the converse, suppose that is a yes-instance of Monopolar Recognition, and let be a valid monopolar partition of . Since is a valid monopolar partition of , and every vertex , is a cluster in , is adjacent to the independent vertices , we have for every cluster in , and . Let . Since (1) induces an independent set, (2) every vertex , is a cluster in , is in , and (3) no vertex in is adjacent to any vertex in , it follows that is an independent set. Let be the set of vertices obtained from by replacing each vertex by the vertices in the cluster in . We claim that is a cluster graph with at most clusters. Suppose that a vertex is replaced in by the vertices in cluster in ; assume that belongs to cluster in . Each vertex in , other than , must be a vertex in . Let be chosen arbitrarily. Since and belong to the same cluster , by definition of , must be adjacent to in . By Reduction Rule 3.5, must be adjacent to all vertices in . Since was an arbitrarily chosen vertex in , induces a cluster in . It remains to show that no two clusters in are adjacent. Suppose, to get a contradiction, that this is not the case. Since induces a cluster graph, there must exist two vertices and in , that belong to clusters and in , respectively, such that cluster is adjacent to cluster . Since is a cluster graph, this implies that either: (1) is adjacent to , (2) is adjacent to , or (3) is adjacent to . This leads to a contradiction in each of the three cases above: (1) would contradict that is a cluster graph (Reduction Rule 3.3), (2) would imply that , and hence , is adjacent to in , and (3) would imply that , and hence , is adjacent to in . It follows from the above that the constructed partition is a valid monopolar partition for . Finally, the number of clusters in is the same as that in , which is at most . ∎
If Reduction Rule 3.12 is applied, then after its application, we set to and to . Note that in any valid monopolar partition of the graph resulting from the application of Reduction Rule 3.12, each vertex in must be in , being adjacent to the independent set vertices , whereas the vertices can be safely assumed to be in since their only neighbors are in .
3.3 Modelling the Pushing Process by a Bipartite Graph
We have now arrived at a stage where we have bounded the number of large and edge cliques, and the size of and . It remains to bound the size of the large cliques and the number of vertex cliques to obtain a polynomial-size problem kernel. The challenge here is that we need to identify vertices such that putting them in or will eventually, after a series of pushes, lead either to the creation of too many clusters in , or to the addition of two adjacent vertices in . To model the avalanche of pushes to or , we introduce the following auxiliary graph.
Definition 3.13.
For a normalized partition of , we define the auxiliary bipartite graph as follows. The vertex set of is , where is the set of all vertices in the large cliques in , and is the set of all vertices in the vertex cliques in . The edge set of is ; that is, consists of precisely the edges in that are between and .
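The construction of the auxiliary bipartite graph can be sketched directly from the definition. The sketch assumes that the decomposition is given as a list of vertex sets, that large cliques are those with at least three vertices and vertex cliques those with exactly one, and that the graph is an adjacency dictionary; the function name `auxiliary_graph` is hypothetical.

```python
def auxiliary_graph(adj, cliques):
    """Build the bipartite graph between large-clique vertices and
    vertex cliques, keeping only the edges of G that run between them."""
    large = set().union(*[c for c in cliques if len(c) >= 3])
    single = {v for c in cliques if len(c) == 1 for v in c}
    H = {}
    for v in large:
        H[v] = adj[v] & single   # neighbors among the vertex cliques
    for v in single:
        H[v] = adj[v] & large    # neighbors among large-clique vertices
    return H, large, single
```

Edges inside large cliques and edges incident to edge cliques are dropped on purpose, so the result is bipartite by construction.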
Recall that is an independent set in by Fact 3.1. For a vertex , we write for the neighbors of in . We have the following lemma:
Let be a normalized partition of and consider the auxiliary graph . Then the maximum degree of , , is at most .
For every vertex , we have because Reduction Rule 3.6 is inapplicable. By property (ii) of a nice decomposition and the inapplicability of Reduction Rule 3.7, every vertex clique that is adjacent to a large clique is adjacent to exactly one vertex in . Since by Reduction Rule 3.9 the number of large cliques is at most , every vertex in , which is a vertex clique by definition of , has at most neighbors in . Therefore, for every vertex , we have . ∎
Using the following lemma, we now observe that the auxiliary graph captures some of the avalanches emanating from vertices in large or vertex cliques. Namely, pushing a vertex in a large clique to (or in a vertex clique to ) will also require pushing each vertex reachable (in the auxiliary graph) from from to or vice versa.
For two vertices , write for the length of a shortest path between and in . For a vertex and , define . Write for the set of even integers in , and for the set of odd integers in .
Let be a normalized partition of , let be the associated auxiliary graph where , and let be any valid monopolar partition of .
For any vertex : If then .
For any vertex : If then .
For any vertex : If then for , and for .
For any vertex : If then for , and for .
(i): This trivially follows because is an independent set.
(ii): Suppose that is in , and let . Then because is bipartite, and hence, by definition, belongs to a large clique for some . Suppose, to get a contradiction, that . Since is a large clique, and hence , there exists a vertex in such that . By property (ii) of the nice decomposition and the inapplicability of Reduction Rule 3.7, is not a neighbor of in . But this implies that is an induced $P_3$ in , contradicting that is a cluster graph. It follows that .
(iii): This follows by repeated alternating applications of (i) and (ii) above.
(iv): This follows by repeated alternating applications of (ii) and (i) above. ∎
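The avalanche described by parts (iii) and (iv) can be sketched as a breadth-first search in the auxiliary graph that alternates sides with the distance from the starting vertex. The sketch assumes the reading suggested by the proof: the trigger is either a large-clique vertex placed on the independent-set side (labeled `"B"` here) or a vertex clique placed on the cluster side (labeled `"A"`). The labels, the function name `avalanche`, and the dictionary representation of the auxiliary graph are assumptions.

```python
from collections import deque

def avalanche(H, large, v):
    """Placements forced by the trigger at v: vertices at even distance
    from v share v's side, vertices at odd distance get the other side."""
    start = "B" if v in large else "A"   # the trigger case for v's type
    flip = {"A": "B", "B": "A"}
    side = {v: start}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in H[u]:
            if w not in side:
                side[w] = flip[side[u]]  # one more step, so the side alternates
                queue.append(w)
    return side
```

Under this reading, every large-clique vertex reached by an avalanche lands on the independent-set side and every reached vertex clique lands on the cluster side, which is why long avalanches either create many clusters or place adjacent vertices together on the independent-set side.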
The above lemma about the avalanches captured by the auxiliary graph allows us to identify vertices whose push to one side of the partition would lead to avalanches that, in turn, would lead to too many clusters in or to two adjacent vertices in . We can hence fix them in the corresponding part.
Reduction Rule 3.16.
Let be a normalized partition of , and let be the associated auxiliary graph.
For any vertex