One of the main goals in phylogenetics is to develop methods for constructing evolutionary trees, the tree-of-life being a prime example of such a tree. Mathematically speaking, for a set of species, a phylogenetic -tree is a (graph theoretical) tree with leaf set and no degree-2 vertices; it is binary if every internal vertex has degree three. A popular approach to constructing such trees, called the supertree method, is to build them up from smaller trees. The smallest possible trees that can be used in this approach are quartet trees, that is, binary phylogenetic trees having 4 leaves (see e.g. Figure 1 for the quartet tree with leaf-set ). Thus it is natural to ask the following question: How should we decide whether or not it is possible to simultaneously display all of the quartet trees in a given collection of quartet trees by some phylogenetic tree?
In case the collection consists of a quartet tree for every possible subset of of size 4 (which we denote by ), this problem has an elegant solution that was originally presented by Colonius and Schulze in 1981 (see also  for related results). We present full details in Theorem 2.2 below, but essentially their result states that, given a collection of quartet trees , one for each element in , there exists a (necessarily unique) binary phylogenetic -tree displaying every quartet tree in the collection if and only if when the quartet trees and are contained in then so is the quartet tree . Rules such as plus implies are known as inference rules, and they have been extensively studied in the phylogenetics literature (see e.g. [17, Chapter 6.7]).
Although phylogenetic trees are extremely useful for representing evolutionary histories, in certain circumstances they can be inadequate. For example, when two viruses recombine to form a new virus (e.g. swine flu), this is not best represented by a tree as it involves species combining together to form a new one rather than splitting apart. In such cases, phylogenetic networks provide a more accurate alternative to trees and there has been much recent work on such structures (see e.g. [19, Chapter 10] for a recent review).
In this paper, we will consider properties of a particular type of phylogenetic network called a level-1 network. For a set of species, this is a connected graph with leaf-set and such that every maximal subgraph with no cut-edge is either a vertex or a cycle (see Section 2 for more details). Our main results will apply to binary level-1 networks, where we also assume that every vertex has degree 1 or 3. We present an example of such a network in Figure 1. Note that a phylogenetic -tree is a special example of a level-1 network with leaf-set . As with phylogenetic -trees, it is possible to construct level-1 networks from quartets. However, it has been pointed out that there are problems with understanding such networks in terms of inference rules (see e.g. [11, p.2540]).
Here, we circumvent these problems by considering a certain type of subnetwork of a level-1 network called a quarnet instead of using quartet trees. A quarnet is a 4-leaved, binary, level-1 network (see e.g. Figure 1); they are displayed by binary level-1 networks in a similar way to quartets (see Section 3 for details). As we shall see, quarnets naturally lead to inference rules for level-1 networks which can be thought of as a combination of quartet inference and inference rules for building circular orderings of a set. Moreover, in our main result we show that, just as with phylogenetic trees, the quarnet inference rules that we introduce can be used to characterize when a collection of quarnets, one for each element in , can be displayed by a binary level-1 network with leaf-set .
We now summarize the contents of the rest of the paper. In the next section we present some preliminaries concerning phylogenetic trees and level-1 networks, as well as their relationship with quartets. Then, in Section 3, we prove an analogous theorem to the quartet results of Colonius and Schulze for level-1 networks (Theorem 3.2). In Section 4, we use Theorem 3.2 to provide a characterization for when a set of quartets, one for each element of , can be displayed by a binary level-1 network (Theorem 4.1). In Section 5, we then define the closure of a set of quarnets. This can be thought of as the collection of quarnets that is obtained by applying inference rules to a given collection of quarnets until no further quarnets are generated. We show that this has similar properties to the so-called semi-dyadic closure of a set of quartets (see Theorem 5.2). We conclude with a brief discussion of some possible further directions.
In this section, we review some definitions as well as results concerning the connection between phylogenetic trees and quartets. From now on, we assume that is a finite set with .
An unrooted phylogenetic network (on ) (or network (on ) for short) is a connected graph with , every vertex has either degree 1 or degree at least 3, and the set of degree-1 vertices is . The elements in are the leaves of . We also denote the leaf-set of by . The network is called binary if every vertex in has degree 1 or 3. An interior vertex of is a vertex that is not a leaf. A cherry in is a pair of leaves that are adjacent to the same vertex. Two phylogenetic networks and on are isomorphic if there exists a graph-theoretical isomorphism between and whose restriction to is the identity map.
Note that a phylogenetic (-) tree is a network which is also a tree. For any three vertices in such a tree , their median, denoted by , is the unique vertex in that is contained in every path between , and .
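The median can be computed directly from its definition: in a tree, the three pairwise paths between three vertices meet in exactly one vertex. The following is a minimal sketch, assuming the tree is stored as an adjacency dictionary; the function names `tree_path` and `median` are illustrative and do not come from the text.

```python
from collections import deque


def tree_path(adj, s, t):
    """Return the unique path from s to t in a tree given as an adjacency dict."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            break
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)
    path, node = [], t
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def median(adj, a, b, c):
    """Return the unique vertex lying on all three pairwise paths between a, b, c."""
    common = (set(tree_path(adj, a, b))
              & set(tree_path(adj, a, c))
              & set(tree_path(adj, b, c)))
    (m,) = common  # exactly one vertex when the input is a tree
    return m


# Quartet tree with cherries {a, b} and {c, d}; u and v are the interior vertices.
quartet_tree = {
    "a": ["u"], "b": ["u"], "c": ["v"], "d": ["v"],
    "u": ["a", "b", "v"], "v": ["c", "d", "u"],
}
```

In this example the median of a, b, c is the interior vertex u, and the median of a, c, d is v.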
A cut-vertex of a network is a vertex whose removal disconnects the network, and a cut-edge of a network is an edge whose removal disconnects the network. A cut-edge is trivial if one of the connected components induced by removing the cut-edge is a vertex (which must necessarily be a leaf). A network is simple if all of the cut-edges are trivial (so for instance, note that phylogenetic trees with more than three leaves are not simple networks). A network is level-1 if every maximal subgraph in that has no cut-edge is either a vertex or a cycle. Note that we shall say that a network on , where , is of cycle-type if it contains a unique cycle of length , and the number of vertices in is (so in particular, a network is of cycle-type if it is simple, binary, level-1 and is not a phylogenetic tree).
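The level-1 condition can be tested mechanically: delete all cut-edges and check that every remaining connected component is a single vertex or a cycle. The sketch below assumes a simple graph (no loops or parallel edges) given as a vertex list and a list of edge pairs; the cut-edge search is deliberately naive, and the function names are illustrative.

```python
def components(vertices, edges):
    """Connected components of the graph (vertices, edges), as a list of sets."""
    adj = {v: set() for v in vertices}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    seen, comps = set(), []
    for v in vertices:
        if v in seen:
            continue
        comp, stack = {v}, [v]
        while stack:
            u = stack.pop()
            for w in adj[u] - comp:
                comp.add(w)
                stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps


def cut_edges(vertices, edges):
    """Edges whose removal increases the number of connected components."""
    base = len(components(vertices, edges))
    return [e for e in edges
            if len(components(vertices, [f for f in edges if f != e])) > base]


def is_level_1(vertices, edges):
    """True if every maximal subgraph without a cut-edge is a vertex or a cycle."""
    bridges = set(cut_edges(vertices, edges))
    rest = [e for e in edges if e not in bridges]
    for comp in components(vertices, rest):
        if len(comp) == 1:
            continue
        degree = {v: 0 for v in comp}
        for u, w in rest:
            if u in comp:  # then w lies in the same component
                degree[u] += 1
                degree[w] += 1
        # a connected simple graph is a cycle exactly when all degrees are 2
        if any(d != 2 for d in degree.values()):
            return False
    return True


# A 4-cycle p, q, r, s with one pendant leaf on each cycle vertex.
V = list("abcdpqrs")
E = [("p", "q"), ("q", "r"), ("r", "s"), ("s", "p"),
     ("a", "p"), ("b", "q"), ("c", "r"), ("d", "s")]
```

For this network the only cut-edges are the four pendant edges, so it is simple and level-1. Two cycles sharing an edge would fail the test, since the shared endpoints have degree three inside the component.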
In what follows it will be useful to consider a certain type of operation on a level-1 network, which we define as follows. For a level-1 network on , let be an interior vertex of that is not contained in any cycle in . Furthermore, let , where , be a circular ordering of the set of vertices in that are adjacent to . Then we obtain a new network on from by removing vertex and all edges incident with it and inserting new vertices and new edges and for all (see Fig. 2). Here we use the convention that is identified with 1. We say that is obtained from by a blow-up operation on (using the given circular ordering of its neighbours). Note that is a level-1 network with one more cycle than . Note that blow-up operations on the same vertex but with different circular orderings of its neighbours may lead to non-isomorphic networks. We illustrate a blow-up operation in Fig. 2.
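The blow-up operation can be sketched as follows. The reading used here is an assumption based on the description above: the vertex is replaced by a cycle with one new vertex per neighbour, each new vertex is joined to the corresponding neighbour, and consecutive new vertices (in the given circular ordering) are joined to each other. The network is stored as a dictionary mapping each vertex to the set of its neighbours, and the naming of the new vertices as pairs `(v, i)` is purely illustrative.

```python
def blow_up(adj, v, ordering):
    """Return a copy of the network in which v is replaced by a cycle.

    `ordering` lists the neighbours of v in the chosen circular order.
    """
    assert len(ordering) == len(adj[v]) and set(ordering) == adj[v]
    k = len(ordering)
    new = {u: set(nb) for u, nb in adj.items() if u != v}
    cycle = [(v, i) for i in range(k)]  # hypothetical names for the new vertices
    for i, w in enumerate(ordering):
        new[w].discard(v)
        new[w].add(cycle[i])
        # each new vertex sees its old neighbour and its two cycle neighbours
        new[cycle[i]] = {w, cycle[(i - 1) % k], cycle[(i + 1) % k]}
    return new


# Star tree on four leaves with centre v.
star = {"v": {"a", "b", "c", "d"},
        "a": {"v"}, "b": {"v"}, "c": {"v"}, "d": {"v"}}
network = blow_up(star, "v", ["a", "b", "c", "d"])
```

Blowing up the centre of the star tree with the ordering a, b, c, d yields a network with a single 4-cycle and four pendant leaves. Using the ordering a, c, b, d instead places the leaves differently around the cycle, which illustrates why different orderings may give non-isomorphic networks.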
2.2. Quartets, Trees and Networks
We now briefly recall some notation and results concerning quartet systems (for more details see [5, Chapter 3]).
Although quartets are often considered as being 4-leaved trees, here it is more convenient to consider a quartet to be a partition of a subset of of size 4 into two subsets of size 2. The set is called the support of . If for distinct, we denote by . The set of all quartets on is denoted by , and any non-empty subset is called a quartet system (on ). Given a quartet system on and a subset , let be the number of quartets in whose support is . For simplicity, we write as . If for every subset , then is said to be dense.
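The partition view of a quartet translates naturally into nested frozen sets, which makes the notation independent of the order in which the four elements are written. The sketch below is illustrative only: the function names are not from the text, and "dense" is read here as "every 4-element subset is the support of at least one quartet", which is an assumption because the defining condition is not legible above.

```python
from itertools import combinations


def quartet(a, b, c, d):
    """The quartet ab|cd as a partition of {a, b, c, d} into two pairs."""
    return frozenset({frozenset({a, b}), frozenset({c, d})})


def support(q):
    """The 4-element set underlying the quartet q."""
    return frozenset().union(*q)


def all_quartets(X):
    """All quartets on X: three for each 4-element subset."""
    out = set()
    for a, b, c, d in combinations(sorted(X), 4):
        out |= {quartet(a, b, c, d), quartet(a, c, b, d), quartet(a, d, b, c)}
    return out


def is_dense(Q, X):
    """Assumed reading: every 4-subset of X supports at least one quartet of Q."""
    supports = {support(q) for q in Q}
    return all(frozenset(A) in supports for A in combinations(X, 4))
```

With this representation, `quartet("a", "b", "c", "d")` and `quartet("d", "c", "b", "a")` are the same object, and a 5-element set carries 15 quartets in total.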
Following the terminology in , a quartet system is:
thin if no pair of quartets in have the same support;
saturated if for all with , the system contains at least one quartet in ;
transitive if for all , if holds, then is also contained in .
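The three properties can be checked by brute force on small quartet systems. The exact five-element conditions are not legible in the text above, so the forms used in this sketch are assumptions taken from the standard formulation: saturated means that ab|cd in the system forces ax|cd or ab|cx for every further element x, and transitive means that ab|cd and ab|ce together force ab|de. All function names are illustrative.

```python
def quartet(a, b, c, d):
    """The quartet ab|cd as a partition of {a, b, c, d} into two pairs."""
    return frozenset({frozenset({a, b}), frozenset({c, d})})


def support(q):
    return frozenset().union(*q)


def labelings(q):
    """Yield every tuple (a, b, c, d) such that q equals ab|cd."""
    p, r = tuple(q)
    for first, second in ((p, r), (r, p)):
        for a in first:
            (b,) = first - {a}
            for c in second:
                (d,) = second - {c}
                yield a, b, c, d


def is_thin(Q):
    """No two quartets share a support."""
    supports = [support(q) for q in Q]
    return len(supports) == len(set(supports))


def is_saturated(Q, X):
    """Assumed form: ab|cd in Q implies ax|cd in Q or ab|cx in Q."""
    for q in Q:
        for x in set(X) - support(q):
            for a, b, c, d in labelings(q):
                if quartet(a, x, c, d) not in Q and quartet(a, b, c, x) not in Q:
                    return False
    return True


def is_transitive(Q, X):
    """Assumed form: ab|cd and ab|ce in Q imply ab|de in Q."""
    for q in Q:
        for e in set(X) - support(q):
            for a, b, c, d in labelings(q):
                if quartet(a, b, c, e) in Q and quartet(a, b, d, e) not in Q:
                    return False
    return True


# Quartets of the caterpillar tree with cherries {a, b} and {d, e} and c in the middle.
caterpillar = {quartet(*"abcd"), quartet(*"abce"), quartet(*"abde"),
               quartet(*"acde"), quartet(*"bcde")}
```

The caterpillar system is thin and saturated, and under the assumed definitions it is also transitive, as one would expect from the relationship between these concepts.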
These concepts are related as follows:
Lemma 2.1. Suppose that is a quartet system on . If is saturated and thin, then is transitive.
Proof. We use a similar argument to that used in [2, Lemma 1]. Suppose with . We need to show .
Since is saturated and is contained in , we have . Using a similar argument, in implies that . Therefore, we must have as otherwise , a contradiction to the assumption that is thin. ∎
A quartet on is displayed by a phylogenetic -tree if the path between and in is vertex disjoint from the path between and in . The quartet system displayed by is denoted by .
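The path-disjointness condition is easy to test on a concrete tree. The following sketch assumes the tree is given as an adjacency dictionary and enumerates the displayed quartets of a small example; the function names and the tuple encoding of quartets are illustrative choices.

```python
from collections import deque
from itertools import combinations


def tree_path(adj, s, t):
    """Return the unique path from s to t in a tree given as an adjacency dict."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            break
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)
    path, node = [], t
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def displays(adj, a, b, c, d):
    """True if the a-b path and the c-d path share no vertex."""
    return set(tree_path(adj, a, b)).isdisjoint(tree_path(adj, c, d))


def displayed_quartets(adj, leaves):
    """All displayed quartets, each encoded as ((a, b), (c, d)) with a smallest."""
    out = set()
    for a, b, c, d in combinations(sorted(leaves), 4):
        for p, q, r, s in ((a, b, c, d), (a, c, b, d), (a, d, b, c)):
            if displays(adj, p, q, r, s):
                out.add(((p, q), (r, s)))
    return out


# Caterpillar tree: cherries {a, b} at u and {d, e} at v, leaf c attached at w.
caterpillar = {
    "a": ["u"], "b": ["u"], "c": ["w"], "d": ["v"], "e": ["v"],
    "u": ["a", "b", "w"], "w": ["u", "c", "v"], "v": ["w", "d", "e"],
}
```

For a binary tree such as this caterpillar, exactly one quartet is displayed for each 4-element subset of the leaves, giving five quartets in total.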
In view of [5, Theorem 3.7] and the last lemma, we have the following slightly stronger characterisation of quartet systems displayed by a phylogenetic tree, which was stated in [2, Proposition 2] using slightly different terminology.
Theorem 2.2. A quartet system is of the form for a (necessarily unique) phylogenetic -tree if and only if is thin and saturated.
We now turn our attention to the relationship between quartets and level-1 networks.
A split of is a bipartition of into two non-empty parts and (note that since is a bipartition, order does not matter, that is, ). Such a split is induced by a network if there exists a cut-edge in whose removal results in two connected components, one with leaf-set and the other with leaf-set . A quartet is exhibited by a network if there exists a split induced by such that and .
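The splits induced by a network, and the quartets it exhibits, can be listed by testing each edge in turn. This is a minimal sketch assuming the network is a simple graph given as a vertex list, an edge list and a leaf set; the names are illustrative, and the two example networks are small hand-built cases rather than the ones shown in the figures.

```python
from itertools import combinations


def side_of(vertices, edges, removed, start):
    """Vertices reachable from `start` once the edge `removed` is deleted."""
    adj = {v: set() for v in vertices}
    for e in edges:
        if e != removed:
            u, w = e
            adj[u].add(w)
            adj[w].add(u)
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w)
            stack.append(w)
    return seen


def induced_splits(vertices, edges, leaves):
    """Splits A|B of the leaf set that arise from deleting a cut-edge."""
    splits = set()
    for e in edges:
        u, w = e
        side = side_of(vertices, edges, e, u)
        if w not in side:  # e is a cut-edge
            A = frozenset(side & set(leaves))
            B = frozenset(set(leaves) - A)
            if A and B:
                splits.add(frozenset({A, B}))
    return splits


def exhibited_quartets(vertices, edges, leaves):
    """Quartets ab|cd with a, b on one side and c, d on the other side of a split."""
    out = set()
    for split in induced_splits(vertices, edges, leaves):
        A, B = tuple(split)
        for p in combinations(sorted(A), 2):
            for r in combinations(sorted(B), 2):
                out.add(frozenset({frozenset(p), frozenset(r)}))
    return out


# Cherry {a, b} at u, joined to a triangle p, q, r carrying the leaves c and d.
V1 = list("abcdupqr")
E1 = [("a", "u"), ("b", "u"), ("u", "p"), ("p", "q"),
      ("q", "r"), ("r", "p"), ("q", "c"), ("r", "d")]
# A 4-cycle with one pendant leaf per cycle vertex (no non-trivial cut-edge).
V2 = list("abcdpqrs")
E2 = [("p", "q"), ("q", "r"), ("r", "s"), ("s", "p"),
      ("a", "p"), ("b", "q"), ("c", "r"), ("d", "s")]
```

The first network exhibits exactly the quartet ab|cd, coming from the cut-edge between u and p. The second has only trivial cut-edges and therefore exhibits no quartet at all.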
Note that if a quartet is exhibited by , then it is displayed by , that is, contains two disjoint paths, one from to , and the other from to . However, the converse is not true. For example, quartet is displayed by the network in Fig. 4(iv), but is not exhibited by this network. Given a network , we let denote the set of quartets exhibited by , and let be the set of quartets displayed by . In light of the last remark, clearly we have .
In this section, we shall show that an analogue of Theorem 2.2 holds for quarnets and level-1 networks. We begin by formally defining the concept of a quarnet and how quarnets can be obtained from level-1 networks.
Given a binary, level-1 phylogenetic network on and a subset , we let denote the network induced on by , which is obtained from by deleting all edges that are not contained in some path between a pair of elements in , removing all isolated vertices, and then repeatedly applying the following two operations until neither of them is applicable: (i) suppressing degree-2 vertices, and (ii) suppressing parallel edges. Note that is a binary, level-1 phylogenetic network on .
A trinet is a binary, level-1 phylogenetic network on three leaves. Note that there are two types of trinets: one is of cycle type; the other does not contain a cycle and is of tree type (see Fig. 3 for an illustration). Similarly, a quarnet or qnet is a binary, level-1 phylogenetic network with four leaves. The leaf-set of a qnet is called its support. As illustrated in Fig. 4, there are four types of qnets: Type I qnets contain no cycles; Type II qnets contain one cycle and one non-trivial cut-edge; Type III qnets contain two cycles; and Type IV qnets contain no non-trivial cut-edge. A qnet system on is a collection of qnets all of whose supports are contained in . We shall say that a qnet with support is displayed by a network on if is isomorphic to . Moreover, we let be the qnet system displayed by , that is,
We now turn to characterizing when a qnet system is displayed by a level-1 network. To do this, we introduce some additional concepts concerning qnet systems.
First, a qnet system on is consistent (on subsets of of size three) if for all subsets , is isomorphic to , for each pair of qnets in with . In addition, a qnet system on is minimally dense if for all , there exists precisely one qnet in with support .
Now, we say that a qnet system on is cyclically-transitive or cyclative if for all subsets with , the system also contains . Note that this is closely related to the cyclic-ordering inference rule given in [1, Proposition 1]. In addition, we say that a qnet system on is saturated, if for all subsets , the following hold:
(S1) If contains , then , or , or , or is contained in .
(S2) If contains , then , or , or , or is contained in .
(S3) If contains , then , or , or , or is contained in .
We next show how these concepts are related. To prove the following result, given a qnet system , we shall consider the quartet system consisting of those quartets that are exhibited by some qnet in , which we shall denote by .
Lemma 3.1. Suppose that is a qnet system on .
(i) If is minimally dense, then is thin.
(ii) If is saturated, then is saturated.
Proof. For (i), as is minimally dense, for each subset of with size four, there exists precisely one qnet in whose support is . Hence there exists at most one quartet in with support .
To prove (ii), consider a quartet in and an arbitrary element in that is distinct from . Let be a qnet in such that is the quartet exhibited by . Then is Type I, II or III. Assume first that is Type I, then . Since is saturated, by (S1),
and so one of the quartets and is contained in , as required. If is of Type II or III, then similar arguments using (S2) and (S3), respectively, show that or is contained in . ∎
We now characterize when a minimally dense set of qnets is displayed by a level-1 network.
Theorem 3.2. Let be a minimally dense qnet system on with . Then for some (necessarily unique) binary, level-1 network on if and only if is consistent, cyclative and saturated.
Proof. Clearly, if holds for a binary, level-1 network , then is consistent, cyclative and saturated.
We now show that the converse holds. Suppose that is a minimally dense qnet system on that is consistent, cyclative and saturated. Consider the quartet system . By Lemma 3.1, is thin and saturated. Therefore, by Theorem 2.2, there exists a unique phylogenetic tree with .
For each interior vertex in , let denote the partition of induced by deleting from so that, in particular, the number of parts in is equal to the degree of . Note that, for all , if and , the path in between and must contain , and if , the path between and does not contain .
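The partition of the leaf set obtained by deleting an interior vertex can be computed with one traversal per neighbour. This sketch assumes the tree is stored as an adjacency dictionary; the function name `leaf_partition` is illustrative.

```python
def leaf_partition(adj, v, leaves):
    """Partition of the leaf set induced by deleting the interior vertex v."""
    leaves = set(leaves)
    parts = set()
    for w in adj[v]:
        # explore the component of the tree minus v that contains the neighbour w
        seen, stack = {v, w}, [w]
        while stack:
            u = stack.pop()
            for z in adj[u]:
                if z not in seen:
                    seen.add(z)
                    stack.append(z)
        parts.add(frozenset(seen & leaves))
    return parts


# Caterpillar tree: cherries {a, b} at u and {d, e} at v, leaf c attached at w.
caterpillar = {
    "a": ["u"], "b": ["u"], "c": ["w"], "d": ["v"], "e": ["v"],
    "u": ["a", "b", "w"], "w": ["u", "c", "v"], "v": ["w", "d", "e"],
}
```

Deleting the middle vertex w of the caterpillar gives the three parts {a, b}, {c} and {d, e}, one for each edge incident with w, so the number of parts equals the degree of the vertex.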
We next partition the set of interior vertices of . Let be the set of degree-3 vertices in with the property that there exist three elements, one from each distinct part of , so that there exists a qnet in whose restriction to these three elements is of cycle type. Let be the set of degree-3 vertices in not contained in . Lastly, let be the set of interior vertices in with degree at least 4.
Claim 3.3. A degree-3 vertex in is contained in if and only if, for each subset of of size three that contains precisely one element from each part of , the restriction is of cycle type for every qnet in with .
Proof. Since is minimally dense, the “if” direction follows directly from the definition of .
Conversely, let be such that , , are all contained in distinct parts of and there exists a qnet in such that is of cycle type. Now let with all contained in distinct parts of and let be an arbitrary qnet in with . We shall show that is of cycle type by considering the size of the intersection .
First assume that , that is, . Then, as is consistent, is of cycle type since it is isomorphic to .
Second assume that . By swapping the indices, we may further assume that , , and . In other words, we have . Consider and let be the qnet in with . Since are both contained in , the quartet is contained in . As is of cycle type, this implies that is either or . In both cases is of cycle type, and hence is also of cycle type in view of the consistency of .
Next assume that . By swapping the indices, we may further assume that, for , elements and are contained in the same part of but . Consider the sets and , and put and . Then we have for . Repeatedly applying the argument used when the size of the intersection is two, it follows that is of cycle type, as required.
Lastly, the case can be established using a similar argument to that when the size of the intersection is zero. This completes the proof of the claim. ∎
Although we will not use this fact later, note that it follows from Claim 3.3 that a vertex in is contained in if and only if, for each subset of of size three whose elements are contained in distinct elements of , the restriction is of tree type for every qnet in with .
Claim 3.4. Suppose . Let be contained in distinct parts of , respectively. Then the qnet in with support is of Type IV. Moreover, if is , then, for all , , and , the qnet with support is .
Proof. Suppose is not of Type IV. Then contains precisely one quartet, denoted by , and . This implies that . However, is not contained in because the path between any pair of distinct elements in contains ; a contradiction. Thus is of Type IV.
Now, suppose . Then we may further assume without loss of generality that , , , and . Hence . Note that the argument in the last paragraph implies that is of Type IV. If is not isomorphic to , then is isomorphic to either or . In the first subcase, since is cyclative and , the qnet is contained in . This implies that the quartet is not contained in , a contradiction since are contained in while are contained in . The second subcase follows in a similar way.
Lastly, if , then note that there exists a list of 4-element subsets for some such that, for , we have and the two elements in are contained in the same part of . Claim 3.4 follows by repeatedly applying the argument in the last paragraph to the list. ∎
Using the last claim, we next establish the following claim.
Claim 3.5. For each vertex , there exists a unique circular ordering of the parts of such that, for each tuple with , the qnet in with support is isomorphic to .
Proof. In light of Claim 3.4 we can define a quaternary relation on the parts of by setting , for all distinct parts , if and only if, for all , , and , the qnet with support is .
Now, for all distinct , we show that
(BD-1): implies and ;
(BD-2): either , or , or (exclusively);
(BD-3): and implies .
Indeed, let , , , , . Then (BD-1) holds since is isomorphic to and to . Next, (BD-2) follows immediately since is minimally dense. To see (BD-3) holds, note that since and imply that and are contained in , using the fact that is cyclative implies that is in , and hence holds. Using (BD-1) it follows that , as required.
Now let , and for each vertex , fix a circular ordering of its neighbourhood induced by the ordering of in Claim 3.5 if , or the necessarily unique circular ordering (clockwise and anticlockwise are treated as the same) of if (and hence ). Let be the level-1 network obtained from by blowing up each vertex in using the given circular ordering of . We next show that . To this end, fix four arbitrary elements in and let be the qnet in with support . We need to show that . There are four cases depending upon whether is Type I, II, III, or IV.
First suppose is of Type I. Without loss of generality, we may assume that . Let . If , then are contained in three distinct parts in the partition of on . By Claim 3.3 and Claim 3.4, it follows that with is of cycle type, a contradiction. Thus and so there exists a cut-vertex in whose removal induces three connected components, containing , and respectively. Similarly, the median is contained in . Hence there exists a cut-vertex in whose removal induces three connected components, containing , and respectively. Let be the qnet in whose support is . Thus, by inspecting all possible qnets on , it follows that is isomorphic to , and hence .
Second, suppose that is of Type II. Without loss of generality, we may assume that . Let be the qnet in whose support is . Let be the median of in . Then, by an argument similar to the one used in the last paragraph, it follows that there exists a cut-vertex in (and hence also a cut-vertex in ) whose removal results in three connected components, containing , and respectively. On the other hand, let be the median of in . Then are contained in three distinct parts of . Since is of cycle type, by Claim 3.4 it follows that , which implies that is also of cycle type. Thus, by inspecting all possible qnets on , it follows that is isomorphic to , and hence .
Next, suppose that is of Type III. Without loss of generality, we may assume that . Let be the qnet in whose support is . Let be the median of in and be the median of in . Since the quartet is contained in , we know that and are distinct. Hence, there exists a cut-edge whose deletion puts and in one component and and in the other connected component. By an argument similar to that used for analysing when is of Type II, it follows that and are both of cycle type. Hence, by inspecting all possible qnets on , the qnet is isomorphic to , and hence .
Lastly, suppose that is of Type IV. Without loss of generality, we may assume that . Let be the qnet in whose support is . Hence, there exists no quartet in whose support is . Therefore, . Denoting this median by , it follows that is necessarily contained in , and hence contains vertices. Now let be the unique circular ordering of vertices induced by the circular ordering of in Claim 3.5. Without loss of generality, we may assume that . Then there exists such that . By the construction of (which locally is the blow-up at with respect to the circular ordering), it follows that is isomorphic to , and hence .
This shows that . Since and are both minimally dense, we have . Finally, the uniqueness statement concerning is a direct consequence of the uniqueness of and the unique way in which is constructed from . ∎
4. A characterization of level-1 quartet systems
We now use Theorem 3.2 to characterize when a quartet system is equal to the set of quartets displayed by a binary level-1 network. This characterization is given as Theorem 4.1. Let be a quartet system on . A quartet in is distinguished if is the only quartet in with support equal to the leaf-set of . Moreover, a network is called 3-cycle free if it does not contain any cycle consisting of three vertices.
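Distinguished quartets can be found by counting how many quartets of the system share each support. The sketch below reuses the partition encoding of quartets as nested frozen sets, which is an illustrative choice and not notation from the text.

```python
from collections import Counter


def quartet(a, b, c, d):
    """The quartet ab|cd as a partition of {a, b, c, d} into two pairs."""
    return frozenset({frozenset({a, b}), frozenset({c, d})})


def support(q):
    return frozenset().union(*q)


def distinguished(Q):
    """Quartets of Q that are the only quartet of Q on their support."""
    counts = Counter(support(q) for q in Q)
    return {q for q in Q if counts[support(q)] == 1}


# ab|cd is alone on {a, b, c, d}; two quartets share the support {a, b, c, e}.
Q = {quartet("a", "b", "c", "d"),
     quartet("a", "b", "c", "e"),
     quartet("a", "c", "b", "e")}
```

Here only ab|cd is distinguished, because the support {a, b, c, e} carries two quartets.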
Theorem 4.1. Let be a dense quartet system on with . Then for some binary level-1 network on if and only if the following three conditions hold:
(D1) For all , we have or .
(D2) If , then , for distinct.
(D3) If is a distinguished quartet in , then, for each where are distinct, either or is a distinguished quartet in .
Moreover, if satisfies (D1)–(D3), then there exists a unique level-1, 3-cycle free network with .
Proof. It is easily checked that, if holds for some binary level-1 network , then (D1)–(D3) hold. Conversely, let be a dense quartet system satisfying (D1)–(D3). Let be the set consisting of the distinguished quartets contained in . We first associate a phylogenetic -tree to . If , then we let denote the phylogenetic -tree which contains precisely one vertex that is not a leaf (i.e. a “star tree”). If , then let be some quartet contained in , . Suppose that there exists some . Then by (D3), either or . It follows that . Moreover, as is clearly thin and by (D3) is saturated, it follows by Theorem 2.2 that there exists a phylogenetic -tree with .
Now we construct a qnet system as follows. Let be the subset of consisting of those with , and . To each we associate a qnet as follows. Swapping the labels of the elements in if necessary, we may assume that is the (necessarily unique) quartet in with leaf-set . Now let and be the median of in and , respectively. Similarly, let and be the median of in and , respectively. Then is the qnet on obtained from by performing a blow-up on each of , where , if and only if the degree of in is at least four.
We also associate a qnet to each as follows. Swapping the labels of the elements in if necessary, we may assume that the quartets in with leaf-set are and . We then define to be the qnet .
Now, let . By construction is minimally dense. Moreover, , and is cyclative in view of (D2).
Next, we shall show that is consistent. Fix a subset and consider its median in . By construction, it suffices to establish the claim that the degree of is three in if and only if, for each , the set is not contained in .
To see that this claim holds first note that if has degree three, then each of the three components of contains precisely one element in . Without loss of generality, we may assume that element is contained in the connected component containing element . But this implies that is a quartet in , and hence . On the other hand, if has degree at least four, then there exists an element such that belong to four different connected components of . Therefore, and are disjoint. This implies that is not contained in , and so it is contained in . This establishes the claim.
Next, we show that is saturated. We shall show that (S2) holds; the fact that satisfies (S1) and (S3) can be established by a similar argument. Let be a set that satisfies the condition in (S2), that is, is contained in . Then is a quartet in . Furthermore, put and , then the degree of is at least four and the degree of is three. Now, fix an element . If and are in the same connected component resulting from deleting from , then is a quartet in . Since the median of in has degree three, by construction either or (but not both) is contained in . Otherwise, is a quartet in . Since the median of in has degree greater than three, by construction we can conclude that either or is contained in (but not both). This completes the verification of (S2).
It follows that is minimally dense, cyclative, consistent and saturated. By Theorem 3.2, there exists a unique binary level-1 network on such that . By construction, it also follows that . The uniqueness statement in the theorem follows from the uniqueness of and the fact that for two binary level-1 networks and if and only if and on differ only by 3-cycles (see e.g. [11, Lemma 2]). ∎
5. Quarnet inference rules and closure
For a quartet system on , we write precisely if every phylogenetic -tree that displays also displays . The statement is known as a quartet inference rule. A well-known example of such a rule is
which leads to the concept of the semi-dyadic closure of the set , that is, the minimal set of quartets that contains and has the property that if , then .
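A closure of this kind can be computed as a fixed point: apply the rule to the current set, add whatever is new, and stop when nothing changes. The displayed rule is not legible in the text above, so the rule used in this sketch, namely that ab|cd and ac|de together yield ab|ce, ab|de and bc|de, is an assumption based on the usual statement of the semi-dyadic rule. The function names are illustrative, and the enumeration over all ordered 5-tuples is only meant for small examples.

```python
from itertools import permutations


def quartet(a, b, c, d):
    """The quartet ab|cd as a partition of {a, b, c, d} into two pairs."""
    return frozenset({frozenset({a, b}), frozenset({c, d})})


def leaves_of(Q):
    """All elements occurring in some quartet of Q."""
    return set().union(*(pair for q in Q for pair in q))


def apply_rule(Q):
    """One pass of the assumed rule: ab|cd, ac|de yield ab|ce, ab|de, bc|de."""
    inferred = set()
    for a, b, c, d, e in permutations(leaves_of(Q), 5):
        if quartet(a, b, c, d) in Q and quartet(a, c, d, e) in Q:
            inferred |= {quartet(a, b, c, e),
                         quartet(a, b, d, e),
                         quartet(b, c, d, e)}
    return inferred


def closure(Q, rule=apply_rule):
    """Smallest superset of Q that is closed under the given rule."""
    closed = set(Q)
    while True:
        new = rule(closed) - closed
        if not new:
            return closed
        closed |= new


start = {quartet(*"abcd"), quartet(*"acde")}
closed = closure(start)
```

Starting from ab|cd and ac|de, the closure under the assumed rule consists of the five quartets of the caterpillar tree with cherries {a, b} and {d, e}. Passing the rule as a parameter keeps the fixed-point loop reusable for other inference rules.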
In this section, we define analogous concepts for qnets and show that they have similar properties to those enjoyed by phylogenetic trees. If is a qnet system, we write for some qnet if every binary level-1 network that displays also displays . Now, let denote symbols in . For example, is equivalent to when and . We introduce three qnet inference rules on :
for all ;
and and for all ;
We illustrate two of these rules in Figure 5.
Using Theorem 3.2, it is straightforward to show that the above three rules are well defined. That is, given three qnets , and such that holds for one of the above three rules, then every binary level-1 network that displays must display .
For a qnet system , we define the set to be the minimal qnet system (under set-inclusion) that contains such that if holds under (CL1)-(CL3), then holds. We call the closure of .
The following key proposition is analogous to that for semi-dyadic closure for quartet systems (cf.  and [14, Proposition 2.1]). It follows from the fact that the closure of a qnet system can clearly be obtained from by repeatedly applying the qnet rules (CL1)–(CL3) until the sequence of sets so obtained stabilizes. Note that this process must clearly terminate in polynomial time.
Proposition 5.1. Let be a qnet system and let be a binary, level-1 network. Then displays if and only if displays .
We now show that behaves in a similar way to the semi-dyadic closure of a quartet system (cf. [17, Exercise 19, p. 143]).
Theorem 5.2. Suppose that is a minimally dense, consistent set of qnets on with . Then the following statements are equivalent:
holds for a (necessarily unique) binary, level-1 network on ;
For every 3-element subset of , the subset is displayed by some binary level-1 network on .
Proof. The facts that (i) implies (ii) and (i) implies (iii) are straightforward. We complete the proof by showing that (ii) implies (i) and (iii) implies (i).
For the proof of (ii) implies (i), suppose that