Treewidth of display graphs: bounds, brambles and applications

Phylogenetic trees and networks are leaf-labelled graphs used to model evolution. Display graphs are created by identifying common leaf labels in two or more phylogenetic trees or networks. The treewidth of such graphs is bounded as a function of many common dissimilarity measures between phylogenetic trees and this has been leveraged in fixed parameter tractability results. Here we further elucidate the properties of display graphs and their interaction with treewidth. We show that it is NP-hard to recognize display graphs, but that display graphs of bounded treewidth can be recognized in linear time. Next we show that if a phylogenetic network displays (i.e. topologically embeds) a phylogenetic tree, the treewidth of their display graph is bounded by a function of the treewidth of the original network (and also by various other parameters). In fact, using a bramble argument we show that this treewidth bound is sharp up to an additive term of 1. We leverage this bound to give an FPT algorithm, parameterized by treewidth, for determining whether a network displays a tree, which is an intensively-studied problem in the field. We conclude with a discussion on the future use of display graphs and treewidth in phylogenetics.


1 Introduction

A phylogenetic tree on a set of species (or, more abstractly, taxa) $X$ is a tree whose leaves are bijectively labelled by $X$. The central idea of such structures is that internal nodes represent hypothetical ancestors of $X$ [38]. In this way, the tree can be viewed as a summary of how $X$ evolved over time. Here we focus on unrooted, binary trees: internal nodes all have degree 3, and there is no direction on the edges of the tree. This is not an onerous restriction, since many phylogenetic inference methods construct unrooted, binary trees. We refer the reader to [41, 18] for further background on phylogenetics.

In this article we study display graphs. Simply put, a display graph is obtained from two or more phylogenetic trees by identifying leaves with the same label [12, 42, 34]. Display graphs have attracted interest in recent years because, if two or more phylogenetic trees are (in some formal sense) “similar”, the treewidth of their display graph is bounded by a function of various parameters. Examples of such parameters are the number of trees that form the display graph [12] and the Tree Bisection and Reconnect (TBR) distance between two trees [34, 1].

Treewidth is a well-known graph parameter which measures, at least in an algorithmic sense, how far an undirected graph is from being a tree: many NP-hard problems can be solved in polynomial or even linear time on graphs of bounded treewidth [5, 8, 9]. Display graphs thus form a bridge from phylogenetics into algorithmic graph theory. In particular, the bounds on the treewidth of display graphs have been exploited to give fixed parameter tractable algorithms for a number of NP-hard dissimilarity measures on phylogenetic trees [12, 34, 3, 19]. (See [15] for background on fixed parameter tractability). Display graphs have also turned out to be useful for speeding up the computation of certain “easy” parameters on phylogenetic trees [16], and the treewidth of the display graph itself has also been considered as a proxy for phylogenetic dissimilarity [33, 24].

The purpose of this article is to further investigate, and algorithmically exploit, properties of the display graphs formed not only by trees, but also by trees and networks. To the best of our knowledge this is the first time tree-network display graphs have been considered. In the first part of the article, we list some basic properties of display graphs, and then address the problem of recognizing them, a problem posed in [33]. Specifically: given a cubic graph $G$, do there exist two unrooted binary phylogenetic trees $T_1, T_2$ on the same set of taxa such that $G$ is the display graph of $T_1$ and $T_2$ (after suppression of degree-2 nodes)? We prove that the problem is NP-hard, by providing an equivalence with the NP-hard TreeArboricity problem [13]. On the positive side, we prove that if $G$ has bounded treewidth then this question can be answered in linear time. For this purpose we use Courcelle’s Theorem [14, 2]. This well-known meta-theorem states, essentially, that graph properties which can be expressed as a bounded-length fragment of Monadic Second Order Logic (MSOL) can be decided in linear time on graphs of bounded treewidth. We provide such an expression for recognizing display graphs.

In the second, longer part of the article, we turn our attention to display graphs formed by merging an unrooted binary phylogenetic tree $T$ with an unrooted binary phylogenetic network $N$, both on the same set of taxa $X$. The latter is simply an undirected graph where internal nodes have degree 3 and leaves, as usual, are bijectively labelled by $X$. Unlike trees, networks do not need to be acyclic. We emphasize that unrooted phylogenetic networks (as defined here and in e.g. [23, 44, 21, 40]) should be viewed as undirected analogues of rooted phylogenetic networks, which correspond to directed graphs [29]. This is to distinguish them from split networks which are phylogenetic data-visualisation tools and which have a very different phylogenetic interpretation; these are sometimes also referred to as “unrooted” networks [36].

Display graphs involving networks are relevant because of the growing number of optimization problems, traditionally posed on rooted trees and networks, which are now being mapped to the unrooted setting (see e.g. [31, 44, 27, 21]). We prove that, if $N$ displays $T$ (i.e. $N$ contains a topological embedding of $T$), the treewidth of their display graph is at most $2k+1$, where $k$ is the treewidth of the network $N$. We also give alternative upper bounds for the treewidth of the display graph of $N$ and $T$ expressed in terms of a parameter more familiar to the phylogenetics community. Specifically, we give (tight) bounds in terms of the level of the original network [23] (which automatically implies bounds in terms of the weaker parameter reticulation number). Briefly, the level of a network $N$ is simply the maximum, ranging over all biconnected components of $N$, of the number of edges in the biconnected component minus the number of edges that a spanning tree for that component has. Following [34] we use these upper bounds to give a compact MSOL-based fixed-parameter tractable algorithm for the NP-hard problem of determining whether an unrooted network $N$ displays $T$, under various parameterizations. This problem, particularly in the rooted setting, continues to attract significant interest in the phylogenetics literature (see [26, 44, 45] for relevant references). The parameterization in terms of treewidth is potentially interesting since, as we point out, the treewidth of $N$ can be significantly lower than the level or reticulation number of $N$.

The question arises whether the $2k+1$ bound can be strengthened. We show that, up to the additive term, this bound is essentially sharp. We do this by providing an infinite family of networks $N$ with corresponding trees $T$ such that $T$ is displayed by $N$ and the treewidth of the display graph is at least twice the treewidth of $N$. To derive the lower bound on treewidth we crucially use brambles [39].

In the final part of the article we reflect on the potential future use of display graphs and treewidth in phylogenetics, and list a number of open problems.

2 Preliminaries

An unrooted binary phylogenetic tree $T$ on a set of leaf labels (known as taxa) $X$ is an undirected tree where all internal vertices have degree three and the leaves are bijectively labeled by $X$. When it is understood from the context we will often drop the prefix “unrooted binary phylogenetic” for brevity. Similarly, an unrooted binary phylogenetic network $N$ on a set of leaf labels $X$ is a simple, connected, undirected graph that has $|X|$ degree-1 vertices that are bijectively labeled by $X$ and in which every other vertex has degree 3. See Figure 1 for a simple example of a tree $T$ and a network $N$.

The reticulation number $r(N)$ of a network $N = (V,E)$ is defined as $r(N) := |E| - (|V| - 1)$, i.e., the number of edges we need to delete from $N$ in order to obtain a tree that spans $V$. A network with $r(N) = 0$ is simply an unrooted phylogenetic tree. Note that in graph theory the value $|E| - (|V| - 1)$ of a connected graph is sometimes called the cyclomatic number of the graph [17].

For a given network $N$ we define its level, denoted $\ell(N)$, as the maximum reticulation number ranging over all biconnected components of $N$. To be consistent with the phylogenetics literature we say that $N$ is a “level-$k$ network” if $\ell(N) \leq k$ (which means that they are “almost $k$-trees” [7]). A level-0 phylogenetic network is simply a phylogenetic tree. Many NP-hard problems in phylogenetics that involve phylogenetic networks as input or output can be solved in polynomial time if the network has bounded level (or bounded reticulation number) [32, 20, 10].
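The two parameters just defined can be computed directly from an edge list. The following is a minimal sketch, not the authors' code: the function names and the example network (two triangles joined by a cut edge, with one leaf on each remaining degree-2 vertex) are assumptions made for illustration. Biconnected components are found with the standard depth-first search that keeps a stack of edges.

```python
from collections import defaultdict


def reticulation_number(edges):
    """|E| - (|V| - 1) for a connected graph given as a list of edges."""
    vertices = {v for edge in edges for v in edge}
    return len(edges) - (len(vertices) - 1)


def biconnected_components(edges):
    """Biconnected components of a simple graph, each returned as an edge list."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    depth, low, stack, components = {}, {}, [], []

    def dfs(u, parent, d):
        depth[u] = low[u] = d
        for w in adj[u]:
            if w == parent:
                continue
            if w not in depth:  # tree edge
                stack.append((u, w))
                dfs(w, u, d + 1)
                low[u] = min(low[u], low[w])
                if low[w] >= depth[u]:  # u separates the part below w
                    component = []
                    while True:
                        edge = stack.pop()
                        component.append(edge)
                        if edge == (u, w):
                            break
                    components.append(component)
            elif depth[w] < depth[u]:  # back edge to an ancestor
                stack.append((u, w))
                low[u] = min(low[u], depth[w])

    for start in list(adj):
        if start not in depth:
            dfs(start, None, 0)
    return components


def level(edges):
    """Maximum reticulation number over all biconnected components."""
    return max(reticulation_number(c) for c in biconnected_components(edges))


# Hypothetical example network on taxa a, b, c, d.
network = [
    ("x1", "x2"), ("x2", "x3"), ("x3", "x1"),  # first triangle
    ("y1", "y2"), ("y2", "y3"), ("y3", "y1"),  # second triangle
    ("x3", "y3"),                              # cut edge joining them
    ("a", "x1"), ("b", "x2"), ("c", "y1"), ("d", "y2"),  # leaves
]

print(reticulation_number(network), level(network))
```

In this example the reticulation number counts both cycles, while the level only sees one cycle per biconnected component, which is why the level can be strictly smaller.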

We now formally define the main object of study in this article, namely the display graph:

Definition 1

Let $T_1, T_2$ be two trees, both on the same set of leaf labels $X$. The display graph of $T_1, T_2$, denoted by $D(T_1, T_2)$, is formed by taking the disjoint union of these two trees and then identifying vertices with the same leaf label, i.e., $D(T_1, T_2) = (V(T_1) \cup V(T_2), E(T_1) \cup E(T_2))$.

Although the more general definition of display graph encountered in the literature allows the display graph to be formed by more than two trees, not necessarily on the same set of taxa (see e.g. [12]), here we will focus exclusively on the above, more restricted definition which is enough for our purposes. We note that, by construction, a display graph is always biconnected.

Note that a display graph is a labeled graph: the set $X$ bijectively labels the degree-2 nodes in the graph. In some parts of the article the labels and the degree-2 vertices are not important (because, modulo some trivial exceptions, degree-2 vertices do not impact upon the treewidth of a graph), and in such cases we work with suppressed display graphs. Such a graph is obtained by erasing the labels and repeatedly suppressing degree-2 nodes (i.e. if $\{u,v\}$ and $\{v,w\}$ are edges and $v$ has degree 2, deleting $v$ and its two incident edges and introducing the edge $\{u,w\}$). A suppressed display graph is always cubic (when $|X| \geq 3$). The act of suppressing degree-2 nodes can potentially create multi-edges. It is easy to see that this happens if and only if the two trees contain one or more common cherries. A cherry is a size-2 subset of taxa that have a common parent, and a cherry is common to two trees if it exists in both of them.
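The construction of a display graph and the suppression step can be sketched in a few lines. This is an illustrative sketch under stated assumptions: trees are given as edge lists, internal vertex names are made distinct by tagging them with the index of their tree, and the helper names are hypothetical. The example uses the two quartet trees ab|cd and ac|bd, which share no cherry.

```python
from collections import Counter


def display_graph(tree1, tree2, taxa):
    """Union of two trees (edge lists) after identifying equally labelled leaves."""
    def relabel(tree, tag):
        # Taxa keep their name (so they are identified); internal vertices are tagged.
        name = lambda v: v if v in taxa else (tag, v)
        return [(name(u), name(v)) for u, v in tree]
    return relabel(tree1, 1) + relabel(tree2, 2)


def suppress_degree_two(edges):
    """Repeatedly replace a path p - v - q through a degree-2 vertex v by the edge p - q.

    Assumes no loops arise, which holds for the small example below.
    Parallel edges, if created, are kept as repeated entries.
    """
    edges = list(edges)
    while True:
        degree = Counter(v for edge in edges for v in edge)
        v = next((x for x in degree if degree[x] == 2), None)
        if v is None:
            return edges
        incident = [e for e in edges if v in e]
        ends = [p if q == v else q for p, q in incident]
        edges = [e for e in edges if v not in e] + [(ends[0], ends[1])]


taxa = {"a", "b", "c", "d"}
tree_ab_cd = [("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"), ("d", "v")]
tree_ac_bd = [("a", "u"), ("c", "u"), ("u", "v"), ("b", "v"), ("d", "v")]

display = display_graph(tree_ab_cd, tree_ac_bd, taxa)
suppressed = suppress_degree_two(display)
print(len(display), len(suppressed))
```

For these two trees the suppressed display graph has four vertices and six distinct edges, i.e. it is the complete graph on four vertices, which is cubic and simple as expected when no cherry is shared.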

The definition of a display graph formed by a tree $T$ and a network $N$, both on $X$, is completely analogous to the definition for two trees, and is denoted as $D(N,T)$.

Let $N$ be a phylogenetic network and $T$ a phylogenetic tree, both on a common taxon set $X$. Then we say that $N$ displays $T$ (or $T$ is displayed by $N$) if there exists a subtree $N'$ of $N$ that is a subdivision of $T$, i.e., $T$ can be obtained by a series of edge contractions on a subgraph $N'$ of $N$. We say that $N'$ is an image of $T$. We observe that every vertex of $T$ is mapped to a vertex of $N'$, and that edges of $T$ map to paths in $N'$ (perhaps consisting of only a single edge), leading us to the following observation (see also [12]):

Observation 1

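If an unrooted binary phylogenetic network $N$ displays an unrooted binary phylogenetic tree $T$, both on the same set of leaf labels $X$, then there exists a subtree $N'$ of $N$ and a surjective function $f$ from $V(N')$ to $V(T)$ such that:

  1. $f(\ell) = \ell$, $\forall \ell \in X$,

  2. the subsets of $V(N')$ induced by $f^{-1}$ are mutually disjoint, and each such subset induces a connected subtree of $N'$, i.e., $\forall v \in V(T)$, the set $\{u \in V(N') : f(u) = v\}$ forms a connected component in $N'$, and

  3. $\forall \{u,v\} \in E(T)$, $\exists \{\alpha,\beta\} \in E(N')$ : $f(\alpha) = u$ and $f(\beta) = v$.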

This observation will be crucial when we study the treewidth of $D(N,T)$ as a function of several parameters (including the treewidth) of $N$.

We now move on to define the concept of the treewidth of an undirected graph:

Definition 2

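Given an undirected graph $G = (V,E)$, a tree decomposition of $G$ is a pair $(\mathcal{B}, \mathbb{T})$ where $\mathcal{B} = \{B_1, \dots, B_q\}$, $B_i \subseteq V$, is a multiset of bags and $\mathbb{T}$ is a tree whose $q$ nodes are in bijection with $\mathcal{B}$, satisfying the following three properties:

  • (tw1) $\bigcup_{i=1}^{q} B_i = V$;

  • (tw2) $\forall e = \{u,v\} \in E$, $\exists B_i \in \mathcal{B}$ such that $\{u,v\} \subseteq B_i$;

  • (tw3) running intersection property: $\forall v \in V$, all the bags $B_i$ that contain $v$ form a connected subtree of $\mathbb{T}$.

The width of $(\mathcal{B}, \mathbb{T})$ is equal to $\max_{i=1}^{q} |B_i| - 1$. The treewidth of $G$, denoted by $tw(G)$, is the smallest width among all possible tree decompositions of $G$. A tree decomposition achieving the smallest possible width for a given graph is called optimal.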
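The three properties can be checked mechanically for a candidate decomposition. The sketch below is illustrative only: the representation (a dictionary from bag identifiers to vertex sets, plus a list of edges between bag identifiers) and the function names are assumptions, and the code takes for granted that the bag edges really form a tree.

```python
def is_tree_decomposition(graph_edges, bags, bag_tree_edges):
    """Check (tw1)-(tw3). bags maps a bag id to a set of graph vertices."""
    vertices = {v for edge in graph_edges for v in edge}

    # (tw1) every vertex appears in some bag
    if not vertices <= set().union(*bags.values()):
        return False

    # (tw2) both endpoints of every edge appear together in some bag
    for u, v in graph_edges:
        if not any({u, v} <= bag for bag in bags.values()):
            return False

    # (tw3) the bags containing a vertex induce a connected subtree
    for v in vertices:
        holding = {b for b, bag in bags.items() if v in bag}
        start = next(iter(holding))
        seen, todo = {start}, [start]
        while todo:
            current = todo.pop()
            for x, y in bag_tree_edges:
                for p, q in ((x, y), (y, x)):
                    if p == current and q in holding and q not in seen:
                        seen.add(q)
                        todo.append(q)
        if seen != holding:
            return False
    return True


def width(bags):
    """Size of the largest bag minus one."""
    return max(len(bag) for bag in bags.values()) - 1


# A cycle on four vertices and a decomposition of width 2.
cycle = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
good_bags = {1: {"a", "b", "c"}, 2: {"a", "c", "d"}}
good_tree = [(1, 2)]

# One bag per edge, arranged on a path: (tw3) fails for vertex a.
bad_bags = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}, 4: {"d", "a"}}
bad_tree = [(1, 2), (2, 3), (3, 4)]

print(is_tree_decomposition(cycle, good_bags, good_tree), width(good_bags))
print(is_tree_decomposition(cycle, bad_bags, bad_tree))
```

The second candidate covers every vertex and every edge but violates the running intersection property, which illustrates why a cycle cannot be decomposed with bags of size two.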

If an undirected graph $H$ can be obtained from a graph $G$ by deleting vertices and edges and contracting edges, then $H$ is a minor of $G$. It is well known that, if $H$ is a minor of a graph $G$, then $tw(H) \leq tw(G)$ [17].

In [33] it was shown that the treewidth of the display graph of two trees can be, in the worst case, linear in the number of the vertices in the trees. In this article we will explore the relation of the treewidth of a display graph formed by a phylogenetic network and a tree displayed by that network, and the treewidth (or other parameters) of the network itself.

Finally, we define the bramble number of a graph, a parameter closely related to treewidth that is very useful when proving lower bounds on treewidth. Given a graph $G = (V,E)$ and two subgraphs $S_1, S_2$ of it, we say that $S_1$ and $S_2$ touch if $V(S_1) \cap V(S_2) \neq \emptyset$, or some edge of $G$ has one endpoint in $S_1$ and the other in $S_2$. A bramble $\mathcal{B}$ of $G$ is a set of connected subgraphs of $G$ that pairwise touch. A (sub)set $S \subseteq V$ is a hitting set of a bramble $\mathcal{B}$ if $S$ intersects every element of $\mathcal{B}$. The order of $\mathcal{B}$ is the minimum size of such a hitting set, and the bramble number of $G$, denoted by $br(G)$, is the maximum order over all brambles of $G$. The usefulness of brambles comes from the following result, due to Seymour & Thomas, relating the treewidth of a graph to its bramble number:

Theorem 2.1 ([39])

For any graph $G$ we have that $tw(G) = br(G) - 1$.
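A small brute-force check shows how a bramble certifies a treewidth lower bound. This is a sketch for tiny graphs only; the function names and the adjacency-dictionary representation are assumptions. Bramble elements are given as vertex sets of connected subgraphs, and the order is found by trying hitting sets of increasing size.

```python
from itertools import combinations


def touch(first, second, adj):
    """Two vertex sets touch if they share a vertex or an edge joins them."""
    return bool(first & second) or any(adj[u] & second for u in first)


def is_connected(vertex_set, adj):
    start = next(iter(vertex_set))
    seen, todo = {start}, [start]
    while todo:
        for w in adj[todo.pop()] & vertex_set:
            if w not in seen:
                seen.add(w)
                todo.append(w)
    return seen == vertex_set


def is_bramble(sets, adj):
    """All elements connected and pairwise touching."""
    return all(is_connected(s, adj) for s in sets) and all(
        touch(s, t, adj) for s, t in combinations(sets, 2)
    )


def bramble_order(sets, vertices):
    """Minimum size of a vertex set intersecting every bramble element."""
    for size in range(len(vertices) + 1):
        for candidate in combinations(sorted(vertices), size):
            if all(set(candidate) & s for s in sets):
                return size


# Cycle a - b - c - d - a.
adj = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
bramble = [{"a"}, {"b"}, {"c", "d"}]

print(is_bramble(bramble, adj), bramble_order(bramble, set(adj)))
```

The three elements are pairwise disjoint, so any hitting set needs three vertices. By the theorem above this bramble of order 3 shows that the 4-cycle has treewidth at least 2.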

3 Recognizing display graphs of pairs of trees

We consider the DisplayGraph decision problem, posed in [33]:

Input: A biconnected, cubic, simple graph $G = (V,E)$.

Goal: Find two unrooted binary trees $T_1, T_2$, on the same set of taxa $X$, such that the suppressed display graph of these two trees is isomorphic to $G$, if they exist.

Note that in this formulation we can assume without any loss of generality that $T_1$ and $T_2$ do not have common cherries.

Here we will argue that the DisplayGraph problem is NP-hard by providing an equivalence between the DisplayGraph problem and the NP-hard TreeArboricity problem [13] which is defined as follows:

Input: A simple, undirected graph $G = (V,E)$.

Goal: Find the smallest positive integer $k$ such that there exists a partition $\{V_1, \dots, V_k\}$ of $V$ such that each part of the partition induces a tree, i.e., $G[V_i]$ is a tree for $1 \leq i \leq k$ (such a partition is called a tree partition). This is the Tree Arboricity of $G$, also denoted as $ta(G)$.

We emphasize that unlike some closely related variants of the problem (for example VertexArboricity [37]), it is not permitted that a $V_i$ induces a forest consisting of two or more components.

Chang et al. [13] discuss the decision version of the TreeArboricity problem with $k = 2$ (i.e. is $ta(G) = 2$?). The following lemma binds their problem to ours.

Lemma 1

Given a simple, connected, cubic graph $G$ as input to the TreeArboricity decision problem, $G$ is a “yes” instance for the TreeArboricity problem with $k = 2$ if and only if $G$ is a suppressed display graph of two binary phylogenetic trees $T_1, T_2$ on a common set of taxa $X$.

Proof

Given such $T_1, T_2$, the partition of the vertex set $V$ into two sets is simply $\{V(T_1) \setminus X, V(T_2) \setminus X\}$. We exclude the taxa $X$ since, when we form the display graph $D(T_1, T_2)$, these become degree-2 vertices which are subsequently suppressed. On the other hand, given a bipartition $\{V_1, V_2\}$ of $V$ in which each part induces a tree, we can form two phylogenetic trees $T_1, T_2$ on a common set of taxa $X$ whose suppressed display graph is isomorphic to $G$ as follows. First of all, by definition, $G[V_1]$ and $G[V_2]$ are trees. Since $G$ is connected and cubic, every leaf vertex $u$ in one part, say $V_1$, has exactly 2 neighbours $v_1, v_2$ in $V_2$. Subdivide each of the edges $\{u, v_i\}$ with a new vertex (i.e., for $i \in \{1,2\}$, replace edge $\{u, v_i\}$ with the two edges $\{u, x_i\}, \{x_i, v_i\}$, where $x_i$ is a newly introduced vertex, and include $x_i$ in $X$, which is initially empty). The subdivision points of these “crossing” edges (having one endpoint in each part) are the taxa of the new trees. Repeat the process on the remaining leaf vertices of $G[V_1]$ and $G[V_2]$. The same argument also takes care of the remaining degree-2 vertices in each of $G[V_1]$ and $G[V_2]$. To complete the proof, we need to show that both constructed trees receive the same number of taxa, so that they are binary phylogenetic trees on a common taxon set. This follows because $G$ is cubic and connected and a “yes” instance of the TreeArboricity problem. Specifically, each edge not entirely in $G[V_1]$ or $G[V_2]$ must have one endpoint in each part. Thus, if we define for every vertex $v \in V_i$ its “missing” degree as $m(v) = 3 - d_i(v)$ (where $d_i(v)$ refers to the degree of $v$ in $G[V_i]$), then we see that $\sum_{v \in V_1} m(v) = \sum_{v \in V_2} m(v)$, since both sums count the crossing edges, i.e., both constructed trees are binary and, by construction, on the same set of taxa $X$.
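For very small graphs the equivalence in Lemma 1 can be explored by exhaustive search. The sketch below is not the authors' method and runs in exponential time; the function names are hypothetical. It looks for a bipartition of the vertex set in which both parts induce a tree, and is run on the complete graph on four vertices.

```python
from itertools import combinations


def induces_tree(part, edges):
    """A vertex set induces a tree iff it is connected with |part| - 1 induced edges."""
    inside = [(u, v) for u, v in edges if u in part and v in part]
    if not part or len(inside) != len(part) - 1:
        return False
    adj = {v: set() for v in part}
    for u, v in inside:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(part))
    seen, todo = {start}, [start]
    while todo:
        for w in adj[todo.pop()]:
            if w not in seen:
                seen.add(w)
                todo.append(w)
    return seen == part


def tree_bipartition(vertices, edges):
    """Return a partition (A, B) with both parts inducing trees, or None."""
    ordered = sorted(vertices)
    first, rest = ordered[0], ordered[1:]
    for size in range(len(rest)):  # keeps the second part non-empty
        for extra in combinations(rest, size):
            part_a = {first, *extra}
            part_b = set(ordered) - part_a
            if induces_tree(part_a, edges) and induces_tree(part_b, edges):
                return part_a, part_b
    return None


k4 = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
part_a, part_b = tree_bipartition({1, 2, 3, 4}, k4)
crossing = [e for e in k4 if (e[0] in part_a) != (e[1] in part_a)]
print(part_a, part_b, len(crossing))
```

Here each part induces a single edge, and the four crossing edges become the four taxa after subdivision, so the two recovered trees are quartet trees.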

Theorem 3.1

DisplayGraph is NP-complete.

Proof

The DisplayGraph problem is easily seen to be in NP: a certificate can be the two trees $T_1, T_2$ that form the graph $G$. We only need to check that $D(T_1, T_2)$, after suppressing degree-2 vertices, is isomorphic to $G$, something that can be done in polynomial time since the graph isomorphism problem is solvable in polynomial time for graphs of bounded degree [35, 25]. For hardness, [13] prove that the decision version of the TreeArboricity problem with $k = 2$ is NP-complete when restricted to simple, cubic, 3-connected planar graphs. Thus, let $G$ be a simple, cubic, 3-connected planar graph that is input to the TreeArboricity problem. A 3-connected graph is in particular biconnected, so $G$ is a valid input to the DisplayGraph problem. The result follows from the equivalence established in Lemma 1.

3.1 The fixed parameter tractability of recognizing display graphs of bounded treewidth

Let $G = (V,E)$ be a simple, biconnected cubic graph. We will use Courcelle’s Theorem to test whether $G$ is a suppressed display graph. This will show that the question can be settled in time $O(f(t) \cdot |V|)$, where $f$ is a function that depends only on the treewidth $t$ of $G$. Specifically, when $G$ has bounded treewidth this will yield a linear time algorithm. The constant-length MSOL formulation simply tests whether $ta(G) = 2$. (Clearly, $ta(G) \neq 1$ because $G$ is not acyclic). The MSOL formulation (and an introduction to MSOL proofs) is given in the appendix.

Theorem 3.2

Suppressed display graphs can be recognized in linear time on graphs of bounded treewidth.

Proof

This is a consequence of the correctness of the MSOL formulation described in Appendix A.2 and the equivalence stated in Lemma 1.

4 Display graphs formed from trees and networks

In this section we will consider the display graph $D(N,T)$ formed by an unrooted binary phylogenetic network $N$ and an unrooted binary phylogenetic tree $T$, both on the same set of taxa $X$. We will show upper and lower bounds on the treewidth of $D(N,T)$ in terms of the treewidth of $N$ and the level of $N$ (and thus also the reticulation number of $N$). We will also show how these upper bounds can be leveraged algorithmically to give FPT results for deciding whether a given network $N$ displays a given tree $T$.

4.1 Treewidth upper bounds

We first relate the treewidth of the display graph $D(N,T)$ with the treewidth of the network $N$.

Lemma 2

Let $N$ be an unrooted binary phylogenetic network and $T$ an unrooted binary phylogenetic tree, both on $X$, where $|X| \geq 3$. If $N$ displays $T$ then $tw(D(N,T)) \leq 2\,tw(N) + 1$.

Proof

Since $N$ displays $T$, we fix a subgraph $N'$ of $N$ that is a subdivision of $T$ and a surjective function $f$ from $V(N')$ to $V(T)$ as defined in Observation 1 (in Section 2). Informally, $f$ maps taxa to taxa and degree-3 vertices of $N'$ to the corresponding vertex of $T$. Each degree-2 vertex of $N'$ lies on a path corresponding to an edge $\{u,v\}$ of $T$; such vertices are mapped to $u$ or $v$, depending on how exactly the surjection was constructed.

Now, consider any tree decomposition $\mathcal{T}$ of $N$. Let $k$ be the width of the tree decomposition, i.e., the largest bag in the tree decomposition has size $k+1$. We will construct a new tree decomposition $\mathcal{T}'$ for $D(N,T)$ as follows. For each vertex $v \in V(N')$ we add $f(v)$ to every bag that contains $v$. To show that $\mathcal{T}'$ is a valid tree decomposition for $D(N,T)$ we will show that it satisfies all the treewidth conditions. Condition (tw1) holds because $f$ is a surjection.

For property (tw2) we need to show that for every edge $e = \{u,v\} \in E(T)$, there exists some bag $B \in \mathcal{T}'$ with $\{u,v\} \subseteq B$. For this we use the third property of $f$ described in our observation: $\forall \{u,v\} \in E(T)$, $\exists \{\alpha,\beta\} \in E(N')$ : $f(\alpha) = u$ and $f(\beta) = v$. For each $e = \{u,v\} \in E(T)$, let $\{\alpha,\beta\} \in E(N')$ be the edge which is mapped through $f$ to $e$. Since $\{\alpha,\beta\} \in E(N)$, there must be a bag $B$ that contains both of $\alpha, \beta$. Since $f(\alpha) = u$ and $f(\beta) = v$, both of $u, v$ will be added into $B$. For the last property (tw3) we need to show that the bags of $\mathcal{T}'$ to which $u \in V(T)$ has been added form a connected subtree. For this, we use property (2) of the function $f$: $\forall u \in V(T)$, the set $\{v \in V(N') : f(v) = u\}$ forms a connected subtree in $N'$. Hence, the set of bags that contain at least one element from $\{v \in V(N') : f(v) = u\}$ forms a connected subtree in the tree decomposition. These are the bags to which $u$ is added, ensuring that (tw3) indeed holds for $\mathcal{T}'$.

We now calculate the width of $\mathcal{T}'$. Observe that the size of each bag can at most double. This can happen when every vertex in the bag is in $V(N')$ and $f(u) \neq f(v)$ for every two vertices $u, v$ in the bag. This causes the largest bag after this operation to have size at most $2(k+1)$. That is, the width of the new decomposition is at most $2k+1$.
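The bag-lifting step of this proof is short enough to sketch directly. The example below is a hypothetical instance chosen for illustration: the network is a 4-cycle u1, u2, u3, u4 with leaves a, b, c, d attached in that order, the tree is the quartet ab|cd with internal vertices p and q, and the dictionary `f` is one possible surjection of the kind described in Observation 1 (obtained by deleting the edge between u4 and u1). All names are assumptions.

```python
def lift_decomposition(bags, f):
    """Add f(v) to every bag containing v, for each v in the domain of f."""
    return {
        bag_id: set(bag) | {f[v] for v in bag if v in f}
        for bag_id, bag in bags.items()
    }


def width(bags):
    return max(len(bag) for bag in bags.values()) - 1


# Width-2 tree decomposition of the network (bag tree: B2 - B1, leaf bags attached).
network_bags = {
    "B1": {"u1", "u2", "u3"},
    "B2": {"u1", "u3", "u4"},
    "La": {"a", "u1"},
    "Lb": {"b", "u2"},
    "Lc": {"c", "u3"},
    "Ld": {"d", "u4"},
}

# Assumed surjection from the image of the tree in the network onto the tree.
f = {"a": "a", "b": "b", "c": "c", "d": "d",
     "u1": "p", "u2": "p", "u3": "q", "u4": "q"}

tree_edges = [("a", "p"), ("b", "p"), ("p", "q"), ("c", "q"), ("d", "q")]

lifted = lift_decomposition(network_bags, f)
covered = all(any({u, v} <= bag for bag in lifted.values()) for u, v in tree_edges)
print(width(network_bags), width(lifted), covered)
```

In this instance the width grows from 2 to 4, below the bound of 5 given by the lemma, and every tree edge ends up inside some bag as required by (tw2). Bags holding taxa grow by less than a factor of two because taxa are mapped to themselves.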

We now give a bound on the treewidth of the display graph $D(N,T)$ in terms of the level of $N$. We remind the reader that a network $N$ is a level-$k$ network if the reticulation number of each biconnected component is at most $k$.

Lemma 3

Let $N$ be an unrooted binary phylogenetic network and $T$ an unrooted binary phylogenetic tree, both on $X$, such that $|X| \geq 3$ and $N$ displays $T$. Then $tw(D(N,T)) \leq k + 2$ where $k$ is the level of $N$.

Proof

Due to the fact that $N$ displays $T$, there is a subgraph $N'$ of $N$ that is a subdivision of $T$. If $N'$ is a spanning tree of $N$, then keep it as is. Otherwise, extend $N'$ to a spanning tree of $N$ by greedily adding edges to $N'$ until all vertices of $N$ are spanned. At this point, $N'$ contains exactly $|V(N)| - 1$ edges and consists of a subdivision of $T$ from which possibly some unlabelled pendant subtrees (i.e. pendant subtrees without taxa) are hanging.

We argue that $D(N',T)$ has treewidth 2, as follows. First, note that $D(T,T)$ has treewidth 2, because $T$ is trivially compatible with $T$ (and $|X| \geq 3$) [12]. Now, $D(T,T)$ can be obtained from $D(N',T)$ by repeatedly deleting unlabelled vertices of degree 1 and suppressing unlabelled degree 2 vertices. These operations cannot increase or decrease the treewidth [33]. Hence, $D(N',T)$ has treewidth 2.

For the purposes of the present proof we need a tree decomposition of of width 2 with a very particular structure which we now construct explicitly. For each vertex we create a singleton bag . For each edge we insert the bag between the two singleton bags and . Now, recall that each vertex has a unique image . For each vertex , add to the singleton bag . For each edge , consider the vertices and in . We distinguish two cases:

Case 1. If , remove the bag that lies between bags and and replace it with the pair of bags .

Case 2. If , then edge corresponds to a path in where and none of are images of vertices from . In the tree decomposition, this corresponds to the chain of bags . In this case, we add to the bag , add both and to bag , and add just to all the remaining bags in the chain.

We denote the tree decomposition by . It is immediate to verify, by construction, that the above tree decomposition is indeed a valid tree decomposition, i.e., it satisfies all the three properties (tw1)-(tw3).

Crucially, the topology of is a subdivision of : each vertex corresponds to a unique bag of , and each edge in corresponds to a unique chain of bags in . We leverage this property as follows.

Let $B$ be a non-trivial biconnected component of $N$. (By non-trivial we mean a biconnected component containing more than 2 vertices. We do this to exclude cut edges, which are formally also biconnected components). Let $r(B)$ be its reticulation number. Then we have that $r(B) \leq k$. Combined with the fact that $N'$ is a spanning tree of $N$, it follows that we can obtain $N$ from $N'$ by adding at most $k$ missing edges to $B$ (and repeating this for other non-trivial biconnected components). Let $F$ be the at most $k$ edges missing from $B$ and let $C$ be a (not necessarily minimum) minimal vertex cover of the edges in $F$; clearly $|C| \leq k$ since in the worst case we can select one distinct vertex per edge. Due to the topological structure of the tree decomposition, the vertices and edges in $B$ map unambiguously into bags and chains of bags in the decomposition. We add all the vertices in $C$ to all these bags. We repeat this for each non-trivial biconnected component of $N$. Due to the fact that $N$ has maximum degree 3, the non-trivial biconnected components of $N$ are vertex-disjoint, and hence the corresponding sets of bags in the decomposition are all disjoint. This means that, after all the non-trivial biconnected components have been processed, each bag will contain at most $k + 3$ vertices.

It remains to show that this is indeed a valid tree decomposition for $D(N,T)$. The vertex set of $D(N,T)$ is the same as that of $D(N',T)$ so (tw1) is clearly satisfied. For each edge $\{u,v\} \in F$, both $u$ and $v$ are inside $B$, so some bag (in the part of the decomposition corresponding to $B$) contained $u$ and some bag contained $v$. Given that $u \in C$ or $v \in C$, adding all the vertices in $C$ to all the bags (corresponding to $B$) ensures that some bag contains both $u$ and $v$. Hence, (tw2) is satisfied. Regarding (tw3), observe that each vertex $c \in C$ lies inside $B$, so some bag (in the part of the decomposition corresponding to $B$) already contained $c$. Moreover, all the bags corresponding to $B$ induce a connected subtree of bags. Hence, adding $c$ to all these bags cannot destroy the running intersection property for $c$. Hence, (tw3) holds.

The following observation helps to contextualize Lemmas 2 and 3.

Observation 2

Let $N$ be an unrooted binary phylogenetic network. Then $tw(N) \leq \ell(N) + 1 \leq r(N) + 1$.

Proof

$\ell(N) \leq r(N)$ follows by definition. To see that $tw(N) \leq \ell(N) + 1$, it is well-known that the treewidth of a graph is equal to the maximum treewidth ranging over all biconnected components in the graph [7]. A spanning tree for each biconnected component can be obtained by deleting at most $\ell(N)$ edges, by definition. A tree has treewidth 1, and adding one edge to a graph can increase its treewidth by at most 1 [7]. Hence, each biconnected component has treewidth at most $1 + \ell(N)$. (Alternatively, by observing that level-$k$ networks are almost $k$-trees, [7, Theorem 74] can be leveraged).

The following corollary is therefore immediate.

Corollary 1

Let $N$ be an unrooted binary phylogenetic network and $T$ an unrooted binary phylogenetic tree, both on $X$, where $|X| \geq 3$. If $N$ displays $T$ then $tw(D(N,T)) \leq r(N) + 2$.

Combining the above results yields the following:

Theorem 4.1

Let $N$ be an unrooted binary phylogenetic network and $T$ be an unrooted binary phylogenetic tree, both on $X$. Then, if $N$ displays $T$, $tw(D(N,T)) \leq \min\{2\,tw(N) + 1,\ \ell(N) + 2,\ r(N) + 2\}$.

Note that, from the perspective of $\ell(N)$ and $r(N)$, the bounds $\ell(N) + 2$ and $r(N) + 2$ are sharp, since if $r(N) = \ell(N) = 0$ then $N = T$ and $D(N,T)$ has treewidth 2 [12]. Curiously, the treewidth bound $2\,tw(N) + 1$ gives 3 for this same instance: an additive error of 1. In Section 4.3 we will further analyse the sharpness of this bound.

We remark that $tw(N)$ can be arbitrarily small compared to $\ell(N)$ (and $r(N)$). For example, the display graph of two copies of the same tree on $n$ taxa has treewidth 2. Re-introducing taxa to turn the degree-2 vertices into degree-3 vertices, we obtain a treewidth 2 phylogenetic network $N$ with $4n - 4$ vertices and $5n - 6$ edges, so $\ell(N) = r(N) = n - 1 \to \infty$ as $n \to \infty$. However, for $N$ with low $\ell(N)$ the bound $\ell(N) + 2$ will potentially be stronger than $2\,tw(N) + 1$.

The above bounds raise a number of interesting points about the phylogenetic interpretation of treewidth. First, consider the case where a binary network $N$ does not display a given binary phylogenetic tree $T$. As we can see in Figure 1, there is a network $N$ and a tree $T$ such that $N$ does not display $T$ and yet the treewidth of their display graph is equal to the treewidth of $N$ which (as can be easily verified) is equal to three. Hence “does not display” does not necessarily cause an increase in the treewidth. On the other hand, our results from [33] show that for two incompatible unrooted binary phylogenetic trees (trivially: neither of which displays the other, and both of which have treewidth 1) the treewidth of the display graph can be as large as linear in the size of the trees. The increase in treewidth in this situation is asymptotically maximal. So the relationship between “does not display” and treewidth is rather complex. Contrast this with the bounded growth in treewidth articulated in Theorem 4.1. Such bounded growth opens the door to algorithmic applications.

Figure 1: The network $N$ does not display the tree $T$ but the treewidth of their display graph is equal to the treewidth of $N$, which is equal to 3. (Note also that, if the positions of two of the taxa in $T$ are suitably swapped, then $N$ does display $T$ but both the network and the new display graph will still have treewidth 3).

4.2 An algorithmic application

We give an example of how the upper bounds from the previous section can be leveraged algorithmically. The Unrooted Tree Compatibility problem (UTC) is simply the NP-hard problem of determining whether an unrooted binary phylogenetic network $N$ on $X$ displays an unrooted binary phylogenetic tree $T$, also on $X$. In [44] a linear kernel is described for the UTC problem and, separately, a bounded-search branching algorithm. Summarizing, these yield FPT algorithms parameterized by $r(N)$, i.e. algorithms that can solve UTC in time at most $f(r(N)) \cdot \text{poly}(|N|)$ for some function $f$ that depends only on $r(N)$. We emphasize that these results are more involved than the trivial FPT algorithm for the rooted version of the problem.

Here we give an FPT proof using Courcelle’s Theorem. We prove that the problem is FPT when parameterized by $tw(N)$. This result has not appeared in the literature before and is potentially interesting given that $tw(N)$ can be much smaller than $r(N)$. FPT results in terms of $\ell(N)$ and $r(N)$ follow as a corollary of this, due to Observation 2.

Theorem 4.2

Given an unrooted binary phylogenetic network $N$ and an unrooted binary phylogenetic tree $T$, both on $X$, we can determine in time $O(f(t) \cdot n)$ whether $N$ displays $T$, where $t$ is $tw(N)$ and $n = |V(N)|$.

Proof

We run Bodlaender’s linear-time FPT algorithm [6] to compute a tree decomposition of $D(N,T)$ and return NO if the treewidth is larger than $2t + 1$. (The same algorithm can be used to first compute $tw(N)$, if it is not known.) This is correct by Lemma 2. Otherwise, we have a bound on the treewidth of $D(N,T)$ in terms of $t$. Subsequently, we construct the constant-length MSOL sentence described in Appendix A.1 and apply the Arnborg et al. [2] variant of Courcelle’s Theorem [14], from which the result follows. (Note that $D(N,T)$ has $O(n)$ vertices and $O(n)$ edges). The result can be made constructive if desired, i.e. in the event of a YES answer the actual set of edge cuts in $N$ (to obtain an image of $T$) can be obtained.

Corollary 2

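Given an unrooted binary network $N$ and an unrooted binary tree $T$, both on $X$, we can determine in time $O(f(k) \cdot n)$ whether $N$ displays $T$, where $k$ is the level $\ell(N)$ (or the reticulation number $r(N)$) and $n = |V(N)|$.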

Proof

Immediate from Theorem 4.2 and Observation 2.

4.3 Treewidth lower bounds

In this subsection, we show that the upper bound $2\,tw(N) + 1$ is almost optimal, in the sense that there exists a family of display graphs $D(N,T)$ such that $N$ displays $T$ and $tw(D(N,T)) \geq 2\,tw(N)$. (Note that, irrespective of whether $N$ displays $T$, $tw(D(N,T)) \geq tw(N)$ always holds because $N$ is a minor of $D(N,T)$; see Figure 1 for examples when $tw(D(N,T)) = tw(N)$.)

Fix some integer and an integer such that . We will give a construction for a network $N$ and tree $T$ on a set of leaves, such that , , and $N$ displays $T$. For the sake of convenience we will assume that is even, though the construction can easily be modified to handle cases where is odd.

The intuition behind the construction is as follows. The network $N$ will have roughly the same structure as an grid (with rows and columns), with leaves attached to the horizontal edges. An grid has treewidth , and so $N$ also has treewidth . The tree $T$ is a long caterpillar that weaves back and forth across the rows of the grid (see Figure 4). Thus $T$ is displayed by $N$. However, the display graph $D(N,T)$ has (very roughly) the structure of a grid, and as such can be shown to have treewidth at least . Recall that a caterpillar is a tree in which every degree-1 vertex is at distance 1 from a central path.

We now proceed with the formal construction.

Vertices of and taxa:

Let the taxon set . For each , will contain a leaf labelled with . The internal vertices of are for each , and for each . (Note that some of these vertices will be deleted or suppressed at the end of the construction, in order to turn into a phylogenetic network with no unlabelled leaves.)

Edges:

The edges of are as follows. For each , let be an edge in . In addition let , , , be “horizontal” edges in . For each , let be a “vertical” edge in .

Finally, we delete all unlabelled degree- vertices (namely and ), and then suppress all degree- vertices (namely and for all , as well as and for all , and the vertices and ). Note that this causes to be adjacent to for , and also to be adjacent to for . See Figure 2 for an example when .
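The two clean-up operations used in this step, deleting unlabelled leaves and suppressing degree-2 vertices, can be sketched as follows. This is an illustrative sketch under stated assumptions: graphs are adjacency dictionaries (vertex to set of neighbours) that are modified in place, the function names are hypothetical, and a suppression that would create a parallel edge is rejected because the dictionary representation cannot store multi-edges.

```python
def delete_unlabelled_leaves(adj, labelled):
    """Delete, in a single pass, every degree-1 vertex that carries no label."""
    doomed = [v for v in adj if len(adj[v]) == 1 and v not in labelled]
    for v in doomed:
        for u in adj.pop(v):
            adj[u].discard(v)


def suppress_degree_two(adj):
    """Suppress every degree-2 vertex v: remove v and join its two neighbours."""
    candidates = [v for v in adj if len(adj[v]) == 2]
    for v in candidates:
        u, w = adj[v]
        if w in adj[u]:
            # Assumption of this sketch: the graph must stay simple.
            raise ValueError(f"suppressing {v!r} would create a parallel edge")
        del adj[v]
        adj[u].discard(v)
        adj[w].discard(v)
        adj[u].add(w)
        adj[w].add(u)
```

Suppressing a vertex does not change the degree of its neighbours, so the list of candidates computed at the start remains valid throughout the loop.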

Figure 2: The network when and .

The tree :

We next construct the tree as follows. For each , will contain a leaf labelled with . The internal vertices of are for each . For each , there is an edge . For each and there is an edge . Furthermore, for odd there is an edge , and for even there is an edge . Finally, suppress the degree- vertices and (or and when is odd). See Figure 3 for an example when .

Figure 3: The tree when and .

Figure 4: The network for , with the tree drawn in bold.

Lemma 4

is displayed by .

Proof

Let be the network derived from by deleting edges of the form , as well as edges of the form for even and for odd, and the edges , . Observe that is a subtree of , and that furthermore is a subdivision of , which can be seen by mapping internal vertices of to . See Figure 4.

This completes the construction of and . The display graph is shown in Figure 5. For convenience, we keep the same names for internal vertices of and but it will always be clear from the context which structure we are referring to. Note that after suppressing the vertices , vertices and are adjacent in .
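Forming a display graph from a network and a tree on the same taxon set amounts to taking their disjoint union and identifying leaves that carry the same label. The sketch below illustrates this under several assumptions that are not taken from the text: both inputs are adjacency dictionaries, each leaf is named by its taxon, internal vertex names may clash between the two inputs (so they are tagged with `"N"` or `"T"`), and identified leaves are kept as degree-2 vertices rather than suppressed. The function name `display_graph` is hypothetical.

```python
def display_graph(network, tree, taxa):
    """Merge `network` and `tree` by identifying leaves with the same taxon label."""

    def rename(side, v):
        # Taxa are shared between both inputs; all other vertices stay private.
        return ("taxon", v) if v in taxa else (side, v)

    merged = {}
    for side, graph in (("N", network), ("T", tree)):
        for v, neighbours in graph.items():
            merged.setdefault(rename(side, v), set()).update(
                rename(side, w) for w in neighbours
            )
    return merged
```

Tagging internal vertices by their origin mirrors the convention above of reusing the same names in both structures while keeping them distinguishable in the merged graph.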

Lemma 5

The treewidth of , , is equal to .

Proof

To prove that , we give a tree decomposition of . We first ignore the nodes because those can be added to any tree decomposition of the remaining graph by adding the bags and connecting them to any bag containing for all .

We will now give the tree decomposition of the remaining graph. It is in fact a path decomposition, that is, a tree decomposition in which the underlying tree of the decomposition is a path graph.

Start with the bag

which contains exactly nodes. We now sequentially add one node and delete another to get the path decomposition of the remaining graph. Denote the step of adding node and then deleting node by the tuple . Note that adding node results in a bag with nodes, while deleting node results in another bag with nodes. Then the following steps bring us to the bag :

Now we use a similar sequence of steps to go from the bag to the next :

Finally, do the following sequence of additions and deletions to the bags starting from :

Hence we get a path decomposition of minus the nodes and their incident edges. This can be seen by inspecting when nodes are added and deleted. Nodes in the initial bag only get deleted, nodes in the final bag only get added, and all other nodes are first added and then deleted; therefore the running intersection property holds. It is also clear that each node is in at least one bag, so it remains to check that each edge is represented in a bag. We consider each type of edge separately, and find a bag where the edge is represented.

  • The edges , and are in the initial bag;

  • The edges for are in the intermediate bag for the addition/deletion in the first part of the sequence;

  • for each and is in the intermediate bag for the addition/deletion ;

  • for each and is in the intermediate bag for the addition/deletion ;

  • for each and is in the intermediate bag for the addition/deletion ;

  • for each is in the intermediate bag for the addition/deletion ;

  • for each is in the intermediate bag for the addition/deletion ;

  • for each and is in the intermediate bag for the addition/deletion (this is clear once we note that is added in the addition/deletion step or two steps before );

  • The edges for are in the intermediate bag for the addition/deletion in the last part of the sequence;

  • The edges , and are in the final bag.

Hence the proposed sequence of bags is indeed a tree decomposition, and the treewidth of is at most .
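The add/delete bookkeeping used in this argument can be replayed and verified automatically on small instances. The following sketch is illustrative only: it expands a list of (add, delete) tuples into a sequence of bags and checks the three conditions examined above (vertex coverage, edge coverage, running intersection). It is demonstrated on a plain grid swept column by column, not on the network of the construction, and all function names are hypothetical.

```python
def bags_from_steps(initial_bag, steps):
    """Expand (add, delete) steps into the bag sequence of a path decomposition."""
    current = set(initial_bag)
    bags = [frozenset(current)]
    for add, delete in steps:
        current.add(add)
        bags.append(frozenset(current))  # intermediate bag, one node larger
        current.remove(delete)
        bags.append(frozenset(current))
    return bags


def is_path_decomposition(bags, adj):
    """Check vertex coverage, edge coverage and the running intersection property."""
    if set().union(*bags) != set(adj):
        return False
    for v in adj:
        for w in adj[v]:
            if not any(v in bag and w in bag for bag in bags):
                return False
    for v in adj:
        # On a path, running intersection means v occupies consecutive bags.
        positions = [i for i, bag in enumerate(bags) if v in bag]
        if positions[-1] - positions[0] + 1 != len(positions):
            return False
    return True


def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags) - 1


def grid(rows, cols):
    """Adjacency dictionary of the rows x cols grid graph."""
    adj = {(i, j): set() for i in range(rows) for j in range(cols)}
    for i, j in adj:
        for di, dj in ((1, 0), (0, 1)):
            if (i + di, j + dj) in adj:
                adj[(i, j)].add((i + di, j + dj))
                adj[(i + di, j + dj)].add((i, j))
    return adj


def column_sweep(rows, cols):
    """Start with the first column, then replace it vertex by vertex with the next."""
    initial = [(i, 0) for i in range(rows)]
    steps = [((i, j), (i, j - 1)) for j in range(1, cols) for i in range(rows)]
    return bags_from_steps(initial, steps)
```

In the column sweep every intermediate bag holds one full column plus one extra vertex, so the resulting width equals the number of rows.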

For the lower bound, observe that the grid is a minor of . This grid has treewidth , so . Combining the upper and lower bound, we conclude that the treewidth of is exactly .

Figure 5: The display graph .

In order to show that , we use the concept of brambles. We will construct a bramble in of order . This implies that . The bramble contains the subgraphs induced by on the following sets:

  • For each and , the set

  • For each and , the set

  • The set

  • The set

We note that some of these sets contain vertices such as that were deleted or suppressed in the construction of . Such vertices should be ignored for the purposes of defining an induced subgraph. Intuitively, one may think of the graph as being split up into “rows” and “columns”, with a “column” being made up of the vertices for some fixed and all values of . A “row” either consists of all for a fixed , or all for a fixed . The set consists of all vertices in the last column, and the set consists of all vertices in the top row (except for those already in ). The sets and combine all vertices from a given row and column (except those vertices already in ). Note that is vertex-disjoint from all the other sets; this will be crucial for the lower bound on the order of .
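For small graphs, the defining properties of a bramble and its order can be checked by brute force. The sketch below is a generic checker, not a reproduction of the bramble defined above: it assumes an adjacency-dictionary graph and a list of vertex sets, verifies that every set induces a connected subgraph and that every two sets touch, and computes the order as the minimum size of a hitting set. The function names are hypothetical, and the hitting-set search is exponential in the number of vertices.

```python
from itertools import combinations


def is_connected(adj, vertex_set):
    """Return True if `vertex_set` induces a connected subgraph of `adj`."""
    vertex_set = set(vertex_set)
    start = next(iter(vertex_set))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v] & vertex_set:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertex_set


def touch(adj, a, b):
    """Two sets touch if they share a vertex or are joined by an edge."""
    return bool(a & b) or any(adj[v] & b for v in a)


def is_bramble(adj, sets):
    """Check connectivity of each set and pairwise touching."""
    return all(is_connected(adj, s) for s in sets) and all(
        touch(adj, a, b) for a, b in combinations(sets, 2)
    )


def bramble_order(adj, sets):
    """Smallest size of a vertex set intersecting every bramble element."""
    vertices = list(adj)
    for k in range(len(vertices) + 1):
        for candidate in combinations(vertices, k):
            hitting = set(candidate)
            if all(hitting & s for s in sets):
                return k
```

As a small usage example, on the 4-cycle with vertices 0, 1, 2, 3 the sets {0}, {1} and {2, 3} form a bramble; they are pairwise disjoint, so any hitting set needs three vertices, and the standard bound (treewidth is at least the bramble order minus one) gives treewidth at least 2.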

Lemma 6

is a bramble in .

Proof

Observe that all the sets induce a connected subgraph of . (In particular, the “columns” are connected because of the edges ; also note that for the sets and are connected by the edge .) It remains to show that for each pair of sets in the sets either share a vertex or are joined by an edge with one vertex in each set.

To see that the sets and touch, observe that contains and contains , and these vertices are connected by an edge. To see that touches the other sets, observe that all other sets contain either the vertex or for some . As both of these vertices are adjacent to , it follows that touches each of these sets.

To see that