Orientation of Fitch Graphs and Detection of Horizontal Gene Transfer in Gene Trees
Horizontal gene transfer events partition a gene tree T, and thus its leaf set, into subsets of genes whose evolutionary history is described by speciation and duplication events alone. Indirect phylogenetic methods can be used to infer such partitions 𝒫 from sequence similarity or evolutionary distances without any a priori knowledge of the underlying tree T. In this contribution, we assume that such a partition 𝒫 of a set of genes X is given and that, independently, an estimate T of the original gene tree on X has been derived. We then ask to what extent T and the xenology information, i.e., 𝒫, can be combined to determine the horizontal transfer edges in T. We show that for each pair of genes x and y in different parts of 𝒫, it can be decided whether a horizontal gene transfer always exists or never exists in T along the path connecting y and the most recent common ancestor of x and y. This problem is equivalent to determining the presence or absence of the directed edge (x,y) in so-called Fitch graphs, a more fine-grained version of the graphs that represent the dependencies between the sets in 𝒫. We then consider the generalization to insufficiently resolved gene trees and show that analogous results can be obtained. We show that the classification of (x,y) can be computed in constant time after linear-time preprocessing. Using simulated gene family histories, we observe empirically that the vast majority of horizontal transfer edges in the gene tree T can be recovered unambiguously.
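To make the Fitch-graph condition concrete, the following minimal Python sketch derives the directed edges (x,y) from a small rooted tree whose transfer edges are already known. It is a naive illustration of the definition only, not the constant-time classification described above, which works from 𝒫 and an estimate of T without knowing the transfer edges. The function names, the parent-map encoding of the tree, and the toy tree are all assumptions introduced for this example.

```python
from itertools import permutations


def ancestors(parent, v):
    """Return v followed by all of its ancestors up to the root."""
    path = [v]
    while parent[v] is not None:
        v = parent[v]
        path.append(v)
    return path


def lca(parent, x, y):
    """Most recent common ancestor of x and y (naive walk to the root)."""
    anc_x = set(ancestors(parent, x))
    for v in ancestors(parent, y):
        if v in anc_x:
            return v
    raise ValueError("x and y are not in the same tree")


def fitch_edges(parent, leaves, transfer_edges):
    """Directed edges (x, y) such that the path from lca(x, y) down to y
    contains at least one transfer edge (given as (parent, child) pairs)."""
    edges = set()
    for x, y in permutations(leaves, 2):
        top = lca(parent, x, y)
        v = y
        while v != top:  # walk from y up to the common ancestor
            if (parent[v], v) in transfer_edges:
                edges.add((x, y))
                break
            v = parent[v]
    return edges


# Hypothetical toy tree: root r has children u and c; u has leaves a and b.
parent = {"r": None, "u": "r", "c": "r", "a": "u", "b": "u"}
leaves = ["a", "b", "c"]
transfer_edges = {("r", "c")}  # assumed: gene c was acquired by transfer

print(sorted(fitch_edges(parent, leaves, transfer_edges)))
# prints [('a', 'c'), ('b', 'c')]
```

In this toy case the transfer edge separates the leaves into the parts {a, b} and {c}. The edges (a,c) and (b,c) are present because the path from r to c carries the transfer, while (c,a) and (c,b) are absent because the paths from r to a and to b carry none.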