Tree decompositions and path decompositions are fundamental objects in graph theory. For algorithmic purposes, it would be highly useful to be able to compute such decompositions of minimum width, that is, achieving the treewidth and the pathwidth of the graph, respectively. However, both these problems are NP-hard, and remain so even when restricted to very specific graph classes [1, 3, 15, 16, 18, 19, 20].
For treewidth, a relatively good approximation can be found in polynomial time thanks to a classic algorithm of Feige, Hajiaghayi, and Lee. Their algorithm computes a tree decomposition of an input graph $G$ whose width is within a ratio of $O(\sqrt{\log\operatorname{tw}(G)})$ of the treewidth $\operatorname{tw}(G)$ of $G$. For pathwidth, however, not much seems to be known. The treewidth algorithm of Feige et al. can be modified to output a path decomposition with width within a ratio of $O(\sqrt{\log\operatorname{pw}(G)}\cdot\log n)$ of the pathwidth $\operatorname{pw}(G)$ of $G$, using the fact that $\operatorname{pw}(G)=O(\operatorname{tw}(G)\log n)$ when $G$ has $n$ vertices. This seems to be the best known approximation algorithm for pathwidth.
In this paper we describe a polynomial-time algorithm approximating pathwidth to within a ratio of $O(\operatorname{tw}(G)\sqrt{\log\operatorname{tw}(G)})$, thus replacing the $\log n$ factor in the previous approximation ratio with a $\operatorname{tw}(G)$ factor. To our knowledge, this is the first approximation algorithm achieving an approximation factor of $f(\operatorname{pw}(G))$ or $f(\operatorname{tw}(G))$ for some function $f$ with no dependence on $n$.
Our approach builds on the following key insight: every graph with large pathwidth has large treewidth or contains a subdivision of a large complete binary tree.
Theorem 1.1. Every graph with treewidth $t-1$ has pathwidth at most $th+1$ or contains a subdivision of a complete binary tree of height $h+1$.
The bound $th+1$ is best possible up to a multiplicative constant (see Section 5). We note that Theorem 1.1 was originally motivated by the following result of Kawarabayashi and Rossman [17] about treedepth, which is an upper bound on pathwidth: every graph with treedepth $\Omega(k^5\log^2 k)$ has treewidth at least $k$, or contains a subdivision of a complete binary tree of height $k$, or contains a path of order $2^k$. The bound $\Omega(k^5\log^2 k)$ was recently lowered to $\Omega(k^3)$ by Czerwiński, Nadara, and Pilipczuk [10], who also devised an $O(\operatorname{tw}(G)\log^{3/2}\operatorname{tw}(G))$-approximation algorithm for treedepth. Kawarabayashi and Rossman [17] conjectured that the third outcome of their theorem, the path of order $2^k$, could be avoided if one considered pathwidth instead of treedepth: they conjectured the existence of a universal constant $c$ such that every graph with pathwidth $\Omega(k^c)$ has treewidth at least $k$ or contains a subdivision of a complete binary tree of height $k$. Theorem 1.1 implies their conjecture with $c=2$, which is best possible (see Section 5). Both Theorem 1.1 and the treedepth results [17, 10] are a continuation of a line of research on excluded minor characterizations of graphs with small values of their corresponding width parameters (pathwidth/treedepth/treewidth), which was started by the seminal Grid Minor Theorem and its improved polynomial versions [8, 9].
Every subdivision of a complete binary tree of height $h$ has pathwidth $\lceil h/2\rceil$. Hence, such a subgraph can be used to certify that the pathwidth of a given graph is large. The following key concept provides a stronger certificate of large pathwidth, more suitable for our purposes. Let $(\mathcal{T}_h)_{h\geq 0}$ be a sequence of classes of graphs defined inductively as follows: $\mathcal{T}_0$ is the class of all connected graphs, and $\mathcal{T}_{h+1}$ is the class of connected graphs $G$ that contain three pairwise disjoint sets of vertices $V_1$, $V_2$, and $V_3$ such that $G[V_1],G[V_2],G[V_3]\in\mathcal{T}_h$ and any two of $V_1$, $V_2$, and $V_3$ can be connected in $G$ by a path avoiding the third one. Every graph in $\mathcal{T}_h$ has pathwidth at least $h$ (see Lemma 2.1) and contains a subdivision of a complete binary tree of height $h$ (see Lemma 2.2).
Theorem 1.1 has a short and simple proof (see Section 3). It proceeds by showing that every graph with treewidth $t-1$ has pathwidth at most $th+1$ or belongs to $\mathcal{T}_{h+1}$. The stronger assertion allows us to apply induction on $h$. Unfortunately, this proof is not algorithmic.
Theorem 1.2. For every connected graph $G$ with treewidth at most $t-1$, there is an integer $h\geq 0$ such that $G\in\mathcal{T}_h$ and $G$ has pathwidth at most $th+1$. Moreover, there is a polynomial-time algorithm to compute such an integer $h$, a path decomposition of $G$ of width at most $th+1$, and a subdivision of a complete binary tree of height $h$ in $G$ given a tree decomposition of $G$ of width at most $t-1$.
Since every graph in $\mathcal{T}_h$ has pathwidth at least $h$, combining Theorem 1.2 with the aforementioned approximation algorithm for treewidth of Feige et al., we obtain the following approximation algorithm for pathwidth.
Theorem 1.3. There is a polynomial-time algorithm which, given a graph $G$ of treewidth $t$ and pathwidth $p$, computes a path decomposition of $G$ of width $O(tp\sqrt{\log t})$. Moreover, if a tree decomposition of $G$ of width $t'$ is also given in the input, the resulting path decomposition has width at most $(t'+1)p+1$.
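The width bound for a given tree decomposition can be traced in one line. The following is a sketch under stated assumptions, not a replacement for the proof: it assumes that Theorem 1.2 is applied with $t=t'+1$ and returns an integer $h$ with $G\in\mathcal{T}_h$ together with a path decomposition of width at most $th+1$, and that membership in $\mathcal{T}_h$ forces $\operatorname{pw}(G)\geq h$ (connected components of a disconnected input are assumed to be handled separately).

```latex
\[
  h \;\le\; \operatorname{pw}(G) = p
  \quad\Longrightarrow\quad
  th+1 \;=\; (t'+1)h+1 \;\le\; (t'+1)p+1 .
\]
\[
  t' = O\bigl(t\sqrt{\log t}\bigr)
  \quad\Longrightarrow\quad
  (t'+1)p+1 \;=\; O\bigl(tp\sqrt{\log t}\bigr).
\]
```

The second line corresponds to taking the tree decomposition from the treewidth approximation of Feige et al. instead of receiving it in the input.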
We remark that if we consider graphs $G$ coming from a fixed class of graphs with bounded treewidth, then we can first use an algorithm of Bodlaender to compute an optimal tree decomposition of $G$ in linear time, and then use the above algorithm to approximate pathwidth to within a ratio of roughly $\operatorname{tw}(G)+1$. We note the following two precursors of this result in the literature (with slightly better approximation ratios): Bodlaender and Fomin gave a $2$-approximation algorithm for computing the pathwidth of outerplanar graphs (a subclass of graphs of treewidth at most $2$), and Fomin and Thilikos gave a $3$-approximation algorithm for computing the pathwidth of Halin graphs (a subclass of graphs of treewidth at most $3$).
We conclude this introduction with a remark about parameterized algorithms, even though our focus in this paper is approximation algorithms with running time polynomial in the size of the input graph. Bodlaender designed a linear-time FPT algorithm for computing pathwidth when parameterized by the pathwidth. That is, for an $n$-vertex input graph $G$, his algorithm computes the pathwidth and an optimal path decomposition of $G$ in time $f(\operatorname{pw}(G))\cdot n$ for some computable function $f$. Bodlaender and Kloks considered the problem of computing the pathwidth when the input graph has small treewidth. They devised an XP algorithm for computing pathwidth when parameterized by the treewidth: given an $n$-vertex graph $G$, the algorithm computes $\operatorname{pw}(G)$ and an optimal path decomposition of $G$ in time $n^{f(\operatorname{tw}(G))}$ for some computable function $f$. It is an old open problem whether pathwidth is fixed-parameter tractable when parameterized by the treewidth, that is, whether there exists an algorithm to compute the pathwidth of an $n$-vertex input graph $G$ in time $f(\operatorname{tw}(G))\cdot n^{O(1)}$ for some computable function $f$. This question was first raised by Dean. Fomin and Thilikos pointed out that even obtaining an approximation of pathwidth when parameterized by treewidth is open. Our approximation algorithm is a solution to the latter problem (in a strong sense, with polynomial dependence of the running time on the parameter) and can be seen as a step in the direction of Dean's question.
2 Preliminaries and Tools
2.1 Basic Definitions
Graphs considered in this paper are finite, simple, and undirected. We use standard graph-theoretic terminology and notation. We allow a graph to have no vertices; by convention, such a graph is not connected and has no connected components. The vertex set of a graph $G$ is denoted by $V(G)$. A vertex $v$ of a graph $G$ is considered a neighbor of a set $X\subseteq V(G)$ if $v\notin X$ and $v$ is connected by an edge to some vertex in $X$. The neighborhood (thus defined) of a set $X$ in $G$ is denoted by $N_G(X)$.
A tree decomposition of a graph $G$ is a pair $(T,\{B_x\}_{x\in V(T)})$ where $T$ is a tree and $\{B_x\}_{x\in V(T)}$ is a family of subsets of $V(G)$ called bags, satisfying the following two conditions:
for each vertex $v$ of $G$, the set of nodes $\{x\in V(T)\colon v\in B_x\}$ induces a non-empty subtree of $T$;
for each edge $uv$ of $G$, there is a node $x$ in $T$ such that both $u$ and $v$ belong to $B_x$.
The width of a tree decomposition $(T,\{B_x\}_{x\in V(T)})$ is $\max_{x\in V(T)}|B_x|-1$. The treewidth of a graph $G$ is the minimum width of a tree decomposition of $G$. A path decomposition and pathwidth are defined analogously with the extra requirement that the tree $T$ is a path. The treewidth and the pathwidth of a graph $G$ are denoted by $\operatorname{tw}(G)$ and $\operatorname{pw}(G)$, respectively. We refer the reader to  for background on tree decompositions.
A complete binary tree of height $h$ is a rooted tree in which every non-leaf node has two children and every path from the root to a leaf has $h$ edges. Such a tree has $2^{h+1}-1$ nodes. A complete ternary tree of height $h$ is defined analogously but with the requirement that every non-leaf node has three children. A subdivision of a tree $T$ is a tree obtained from $T$ by replacing each edge $uv$ with some path connecting $u$ and $v$ whose internal nodes are new nodes private to that path.
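The two conditions of a tree decomposition and the width formula translate directly into a small checker. The Python sketch below is illustrative only: the function names `is_tree_decomposition` and `decomposition_width` and the dictionary-based representation of bags are assumptions made here, not notation from the paper, and the input `tree_edges` is assumed (not verified) to form a tree on the bag keys.

```python
from collections import defaultdict


def is_tree_decomposition(vertices, edges, tree_edges, bags):
    """Check the two tree-decomposition conditions.

    vertices   -- iterable of graph vertices
    edges      -- iterable of pairs (u, v) of graph vertices
    tree_edges -- iterable of pairs (x, y) of tree nodes (assumed to form a tree)
    bags       -- dict mapping each tree node to a set of graph vertices
    """
    # Edge condition: every graph edge lies inside some bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False

    tree_adj = defaultdict(set)
    for x, y in tree_edges:
        tree_adj[x].add(y)
        tree_adj[y].add(x)

    # Vertex condition: the nodes whose bags contain v induce a
    # non-empty connected subtree.
    for v in vertices:
        holders = {x for x, bag in bags.items() if v in bag}
        if not holders:
            return False
        start = next(iter(holders))
        seen = {start}
        stack = [start]
        while stack:
            x = stack.pop()
            for y in tree_adj[x]:
                if y in holders and y not in seen:
                    seen.add(y)
                    stack.append(y)
        if seen != holders:
            return False
    return True


def decomposition_width(bags):
    """Width = (maximum bag size) - 1."""
    return max(len(bag) for bag in bags.values()) - 1
```

For a path decomposition, the same checker applies with `tree_edges` forming a path.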
2.2 Witnesses for Large Pathwidth
Recall that $(\mathcal{T}_h)_{h\geq 0}$ is the sequence of classes of graphs defined inductively as follows: $\mathcal{T}_0$ is the class of all connected graphs, and $\mathcal{T}_{h+1}$ is the class of connected graphs $G$ that contain three pairwise disjoint sets of vertices $V_1$, $V_2$, and $V_3$ such that $G[V_1],G[V_2],G[V_3]\in\mathcal{T}_h$ and any two of $V_1$, $V_2$, and $V_3$ can be connected in $G$ by a path avoiding the third one.
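A $\mathcal{T}_h$-witness for a graph $G\in\mathcal{T}_h$ is a complete ternary tree of height $h$ of subsets of $V(G)$ defined inductively following the definition of $\mathcal{T}_h$. The $\mathcal{T}_0$-witness for a connected graph $G$ is the tree with the single node $V(G)$, denoted by $\langle V(G)\rangle$. A $\mathcal{T}_{h+1}$-witness for a graph $G\in\mathcal{T}_{h+1}$ is a tree with root $V(G)$ and with three subtrees $W_1,W_2,W_3$ of the root that are $\mathcal{T}_h$-witnesses of $G[V_1],G[V_2],G[V_3]$ for some sets $V_1,V_2,V_3$ as in the definition of $\mathcal{T}_{h+1}$; it is denoted by $\langle V(G);W_1,W_2,W_3\rangle$.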
It clearly follows from these definitions that every graph in $\mathcal{T}_h$ has at least $3^h$ vertices and every $\mathcal{T}_h$-witness of an $n$-vertex graph has $O(n)$ nodes. The next two lemmas explain the connection of $\mathcal{T}_h$ to pathwidth and to subdivisions of complete binary trees.
Lemma 2.1. If $G\in\mathcal{T}_h$, then $\operatorname{pw}(G)\geq h$.
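Proof. The proof goes by induction on $h$. The case $h=0$ is trivial. Now, assume that $h\geq 1$ and the lemma holds for $h-1$. Since $G\in\mathcal{T}_h$, there are sets $V_1,V_2,V_3\subseteq V(G)$ interconnected as in the definition of $\mathcal{T}_h$, such that $G[V_i]\in\mathcal{T}_{h-1}$ and thus $\operatorname{pw}(G[V_i])\geq h-1$ for $i=1,2,3$. Let $P$ be a path decomposition of $G$. With bags restricted to $V_i$, it becomes a path decomposition of $G[V_i]$. It follows that for $i=1,2,3$, there is a bag $B_i$ in $P$ such that $|B_i\cap V_i|\geq h$. Assume without loss of generality that $B_1,B_2,B_3$ occur in this order in $P$. Since $G[V_1]$ and $G[V_3]$ are connected, there is a path that connects $B_1\cap V_1$ and $B_3\cap V_3$ in $G$ avoiding $V_2$. This path must have a vertex in $B_2$, so $B_2\setminus V_2\neq\emptyset$ and thus $|B_2|\geq h+1$. This proves that $\operatorname{pw}(G)\geq h$. ∎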
The proof of Lemma 2.1 generalizes the well-known proof of the fact that (a subdivision of) a complete binary tree of height has pathwidth at least . Actually, it is straightforward to show that such a tree belongs to .
Lemma 2.2. If $G\in\mathcal{T}_h$, then $G$ contains a subdivision of a complete binary tree of height $h$ as a subgraph. Moreover, it can be computed in polynomial time from a $\mathcal{T}_h$-witness for $G$.
Proof. We prove, by induction on $h$, that for every graph $G\in\mathcal{T}_h$ and every vertex $v\in V(G)$, the graph $G$ contains the following structure: a subdivision $S$ of a complete binary tree of height $h$ with some root $r$ and a path $P$ from $v$ to $r$ such that $V(P)\cap V(S)=\{r\}$. This is trivial for $h=0$. For the induction step, assume that $h\geq 1$ and the statement holds for $h-1$. Let $G\in\mathcal{T}_h$ and $v\in V(G)$. Let $V_1,V_2,V_3$ be as in the definition of $\mathcal{T}_h$. Assume without loss of generality that $v\in V_3$ or $v$ can be connected with $V_3$ by a path in $G$ avoiding $V_1\cup V_2$. For $i=1,2$, since $G[V_3]$ is connected and $G$ has a path connecting $V_3$ with $V_i$ and avoiding $V_{3-i}$, there is also a path in $G$ from $v$ to some vertex $v_i\in V_i$ avoiding $V_{3-i}$. These paths can be chosen so that they first follow a common path $P$ from $v$ to some vertex $r$ in $G$ and then they split into a path $Q_1$ from $r$ to $v_1$ and a path $Q_2$ from $r$ to $v_2$ so that $r$ is the only common vertex of any two of $P,Q_1,Q_2$. For $i=1,2$, the induction hypothesis provides an appropriate structure in $G[V_i]$: a subdivision $S_i$ of a complete binary tree of height $h-1$ with root $r_i$ and a path $P_i$ from $v_i$ to $r_i$ such that $V(P_i)\cap V(S_i)=\{r_i\}$. Connecting $r$ with $r_1$ and $r_2$ by the combined paths $Q_1P_1$ and $Q_2P_2$, respectively, yields a subdivision $S$ of a complete binary tree of height $h$ with root $r$ in $G$. The construction guarantees that $V(P)\cap V(S)=\{r\}$.
Clearly, given a $\mathcal{T}_h$-witness for $G$, the induction step described above can be performed in polynomial time, and therefore the full recursive procedure of computing a subdivision of a complete binary tree of height $h$ in $G$ works in polynomial time. ∎
2.3 Combining Path Decompositions
The following lemma will be used several times in the paper to combine path decompositions.
Lemma 2.3. Let $G$ be a graph and $(T,\{B_x\}_{x\in V(T)})$ be a tree decomposition of $G$ of width $t-1$.
1. If $x\in V(T)$ and every connected component of $G-B_x$ has pathwidth at most $\ell$, then there is a path decomposition of $G$ of width at most $\ell+t$ which contains $B_x$ in every bag.
2. If $Q$ is the path connecting $x$ and $y$ in $T$ and every connected component of $G-\bigcup_{z\in V(Q)}B_z$ has pathwidth at most $\ell$, then there is a path decomposition of $G$ of width at most $\ell+t$ which contains $B_x$ in the first bag and $B_y$ in the last bag.
In either case, there is a polynomial-time algorithm to construct such a path decomposition of $G$ from the path decompositions of the respective components of width at most $\ell$.
Proof. In case 1, the path decomposition of $G$ is obtained by concatenating the path decompositions of the connected components of $G-B_x$ (which have width at most $\ell$) and adding $B_x$ to every bag. Now, consider case 2. For every node $z$ of $Q$, let $T_z$ be the subtree of $T$ induced on the nodes $u$ such that the path from $u$ to $z$ in $T$ contains no other nodes of $Q$, and let $V_z=\bigcup_{u\in V(T_z)}B_u$. Apply case 1 to the graph $G[V_z]$, the tree decomposition $(T_z,\{B_u\}_{u\in V(T_z)})$ of $G[V_z]$, and the node $z$ to obtain a path decomposition of $G[V_z]$ of width at most $\ell+t$ containing $B_z$ in every bag. Then, concatenate the path decompositions thus obtained for all nodes $z$ of $Q$ (in the order they occur on $Q$) to obtain a requested path decomposition of $G$. ∎
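Case 1 of this proof is simple enough to sketch in code. In the hedged Python illustration below, a path decomposition is assumed to be a list of sets (the bags in path order); the names `add_separator_to_components` and `path_width` and this representation are choices made here for illustration and do not come from the paper. The resulting width is at most the largest component width plus the size of the separator bag.

```python
def add_separator_to_components(separator, component_decompositions):
    """Concatenate the path decompositions of the components and add
    the separator bag to every bag (case 1 of the combining lemma).

    separator                -- iterable of vertices (the bag removed from the graph)
    component_decompositions -- list of path decompositions, one per connected
                                component, each a list of sets
    """
    separator = set(separator)
    combined = []
    for decomposition in component_decompositions:
        for bag in decomposition:
            combined.append(set(bag) | separator)
    if not combined:
        # No components left: the separator alone covers the graph.
        combined.append(set(separator))
    return combined


def path_width(decomposition):
    """Width = (maximum bag size) - 1."""
    return max(len(bag) for bag in decomposition) - 1
```

For example, on the path a-b-c-d-e with separator {c}, the components {a, b} and {d, e} each have a one-bag decomposition of width 1, and the combined decomposition has two bags of size 3, hence width 2.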
3 Proof of Theorem 1.1
For all integers $t\geq 1$ and $h\geq 0$, every connected graph with treewidth at most $t-1$ has pathwidth at most $th+1$ or belongs to $\mathcal{T}_{h+1}$.
Proof. The proof goes by induction on $h$. The statement is true for $h=0$: if a connected graph $G$ has a cycle or a vertex of degree at least $3$, then $G\in\mathcal{T}_1$, and otherwise $G$ is a path, so $\operatorname{pw}(G)\leq 1$. For the rest of the proof, assume that $h\geq 1$ and the statement is true for $h-1$.
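The base case rests on a simple dichotomy for connected graphs: either there is a cycle or a vertex of degree at least 3, or the graph is a path. A minimal Python sketch of this test follows. The function name `has_cycle_or_branch_vertex` is hypothetical, and the input is assumed to be a connected simple graph given by a vertex list and an edge list without repeated edges.

```python
def has_cycle_or_branch_vertex(vertices, edges):
    """Return True iff a connected simple graph has a cycle or a vertex
    of degree at least 3; otherwise the graph is a path.

    Assumes (does not check) that the graph is connected and that
    `edges` has no loops and no repeated edges.
    """
    degree = {v: 0 for v in vertices}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    if any(d >= 3 for d in degree.values()):
        return True
    # A connected graph is a tree iff it has exactly |V| - 1 edges,
    # so |E| >= |V| means that it contains a cycle.
    return len(edges) >= len(vertices)
```

When the function returns False, the graph is a path (possibly a single vertex), which has a path decomposition of width at most 1.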
Let be a connected graph and be a tree decomposition of of width at most . Thus for every node of . Every edge of splits into two subtrees: containing and containing . For every oriented edge of , let be the set of vertices of contained only in bags of (i.e., vertices belonging to no bag of ), and let . As long as has an edge such that is disconnected, we modify by replicating the subtree into one distinct copy (connected to ) for each connected component of with bags restricted to . This makes satisfy the following:
If has a node such that for every neighbor of in , then the induction hypothesis gives for every such , and Lemma 2.3 1 applied with yields . For the rest of the proof, assume that no such node exists.
Let be the set of oriented edges of such that . By the assumption above, at least one oriented edge going out of every node of belongs to . Therefore, , so has at least one edge such that . Since supergraphs of graphs in also belong to , implies for every neighbor of other than . Therefore, the edges of with form a subtree in , while the other edges in “point towards ” in . Let be the set of leaves of . Since has at least one edge, .
Suppose . Choose any distinct . For , let , where is the unique neighbor of in . It follows that and, by the property (), the subgraphs , , and are connected, which implies that any two of , , and can be connected by a path avoiding the third one. Thus .
4 Proof of Theorem 1.2
By Lemma 2.2, every graph $G$ in $\mathcal{T}_h$ contains a subdivision of a complete binary tree of height $h$, which can be computed in polynomial time from a $\mathcal{T}_h$-witness of $G$. Therefore, Theorem 1.2 is a direct corollary to the following statement, the proof of which is presented in this section.
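Theorem 4.1. There is a polynomial-time algorithm which, given a connected graph $G$ and a tree decomposition of $G$ of width at most $t-1$, computes
a number $h\geq 0$ such that $G\in\mathcal{T}_h$ and $\operatorname{pw}(G)\leq th+1$,
a $\mathcal{T}_h$-witness for $G$,
a path decomposition of $G$ of width at most $th+1$.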
4.1 Normalized Tree Decompositions
Let be a graph with a fixed rooted tree decomposition of width at most , which we call the initial tree decomposition of . For a node of , let be the subtree of consisting of and all nodes of lying below , and let be the set of vertices of contained only in bags of (i.e., in no bags of ). We show how to use to define a normalized tree decomposition of , which will have some additional desired properties.
For an induced subgraph of , let denote the family of connected components of (which is empty when is the empty graph). Let
For a connected graph , the set will be the node set of the normalized tree decomposition of , which we are to define shortly. We will use Greek letters , , etc. to denote members of (nodes of the normalized tree decomposition of ). Here are some easy consequences of these definitions and the assumption that is a tree decomposition of .
The following holds for every graph and for defined as above.
If is connected, then .
If , then , or , or .
If and , then no edge of connects and .
Now, assume that is a connected graph. By Lemma 4.2, the members of are organized in a tree with root and with the following properties for any :
if is a descendant of (i.e., lies on the path from the root to ), then ;
if neither of and is a descendant of the other, then and are disjoint and non-adjacent in .
For each , let be the set of vertices for which is the minimum member of such that , and let . Recall that .
The following holds for every connected graph .
is a partition of into non-empty sets.
is a tree decomposition of .
for every .
It is clear from the definition that . Let . Since is connected and the vertex sets of the children of in are pairwise disjoint and non-adjacent, at least one vertex of is not a vertex of any child of and thus belongs to . This shows 1.
For the proof of 2, let and . Then . If for some other , then must be a neighbor of in , which implies that is a descendant of in . In that case, is also a neighbor of (and thus ) for every internal node on the path from to in . This shows that the nodes of such that form a non-empty connected subtree of . It remains to show that any two adjacent vertices and of belong to a common bag . Let be a minimal member of containing at least one of and ; say, it contains . Then . If , then , by minimality of , so is a neighbor of in . In either case, .
For the proof of 3, let , and let be the lowest node of such that . Every vertex belongs to ; otherwise it would belong to for some child of in , and the connected component of containing would be a proper induced subgraph of , contradicting . Now, let be a neighbor of in . Thus is a neighbor of some vertex while . It follows that while , which implies that belongs to some bag of as well as some bag of (as is an edge of ), so it belongs to . This shows that , so . ∎
We call the normalized tree decomposition of . By Lemma 4.3, it has at most nodes and its width is at most . It depends on the choice of the initial tree decomposition for . In the algorithm, the graphs considered above are induced subgraphs of a common input graph given with an input tree decomposition , and the initial tree decomposition of considered above in the definition of is the input tree decomposition of with appropriately restricted bags (some of which may become empty):
The fact that the normalized tree decompositions for all induced subgraphs of considered in the algorithm come from a common input tree decomposition of has the following easy consequences, which will be used in the complexity analysis of the algorithm in Subsection 4.3.
The following holds for every induced subgraph of the input graph .
If and , then .
If , then , or , or .
If , then .
If , , and , then .
Let and be defined like and but for the tree decomposition of . For every induced subgraph of , we have
If and , then the fact that is a connected component of for some node of implies that it is also a connected component of , which in turn implies . This shows 1.
If and , then property 1 with yields . This implies 2 by Lemma 4.2 2, and it also implies the “” inclusion in 3 (with and interchanged). For the converse inclusion in 3, let and . Thus . Let and be a node of such that is a connected component of . Since , there is a node in (implying ) such that is a connected component of . It follows that is a connected component of , so .
Finally, if , , and , then is a connected component of for some node of , whence it follows that is a connected component of and thus . This shows 4. ∎
4.2 Main Procedure
The core of the algorithm is a recursive procedure , where is a connected graph with and is an upper bound request. It computes the following data:
a number such that ,
a -witness of ,
only when : a path decomposition of of width at most .
The algorithm uses memoization to compute these data only once for each pair . A run of produces the outcome requested in Theorem 4.1.
The purpose of the upper bound request is optimization—we tell the procedure that if it can provide a -witness for , then we no longer need any path decomposition for . This allows the procedure to save some computation, perhaps preventing many unnecessary recursive calls. Our complexity analysis of the algorithm will heavily rely on this optimization.
Below, we present the procedure for a fixed connected graph and a fixed upper bound request . The procedure assumes access to the normalized tree decomposition of of width at most , obtained from some initial tree decomposition of in the way described in Subsection 4.1. In the next subsection, we show that if the normalized tree decompositions for all graphs considered in the algorithm come from a common input tree decomposition of an input graph as described in Subsection 4.1, then a full run of makes recursive calls to for only polynomially many distinct pairs .
If , then we just set and . Assume henceforth that .
Suppose has only one node, that is, . Since is the bag of that node, . If has a cycle or a vertex of degree at least , then it has three vertices such that any two of them can be connected by a path in avoiding the third one. In that case, we set and , and (if ) we let be the path decomposition of consisting of the single bag , which has width at most . If has no cycle or vertex of degree at least , then it is a path. In that case, we set and , and we let be any path decomposition of of width .
Assume henceforth that has more than one node. For each node , we run to compute , , and when . We call these recursive calls to primary. If any of them leads to , we just set and , and we terminate the procedure. Assume henceforth that for every .
Let be the maximum value of for . Thus . We will consider several further cases, each leading to one of the following two outcomes:
We set . In that case, we let be for any node such that , and we only need to provide an appropriate path decomposition .
Let be the subtree of consisting of the root and those nodes for which . Let be the set of minimal nodes in . Suppose , say, . Let be the path from the root to in , and let . Every connected component of is a node in ; in particular, and , so is a path decomposition of of width at most . We set and apply Lemma 2.3 2 to obtain a path decomposition of of width at most .
Assume henceforth that . Let . The definition of and properties of the normalized tree decomposition of imply the following.
The sets with are pairwise disjoint and non-adjacent in . For each , all nodes of such that are descendants of in .
For each , the neighborhood of in is .
Let any path in with all internal vertices in be called a -path. Consider an auxiliary graph with vertex set where is an edge if and only if there is a -path connecting and . The graph is connected, because so is . Suppose that has a cycle or a vertex of degree at least . Then there are such that any two of them can be connected by a path in avoiding the third one. This and connectedness of the induced subgraphs imply that any two of the sets can be connected by a path in avoiding the third one. We set and .
Assume henceforth that has no cycle or vertex of degree at least , that is, is a path with . We call this the key case of the procedure . For convenience of notation, let . Every vertex in is connected by a -path to one or two consecutive sets among . Define subsets of as follows:
is the set of vertices in connected by a -path to but not to ;
for , is the set of vertices in connected by a -path to and ;
is the set of vertices in connected by a -path to but not to .
For , let , let be the set of vertices in connected by a -path to , and let . In particular, , so and . The sets () and () form a partition of , and the sets () and () cover . These definitions imply the following.
For , the following sets are subsets of : , the neighbors of in , , and the neighbors of in . Any two non-consecutive sets in the sequence are disjoint and non-adjacent.
For , let be the path from to in , and let .
Let be the family of connected components of for and of connected components of for . Suppose that for each , we have a path decomposition of of width at most . Then we apply