Reconstructing Ultrametric Trees from Noisy Experiments

The problem of reconstructing evolutionary trees, or phylogenies, is of great interest in computational biology. A popular model for this problem assumes that we are given the set of leaves (current species) of an unknown binary tree, together with the results of "experiments" on triples of leaves (a, b, c), each of which returns the pair with the deepest least common ancestor. If the tree is assumed to be ultrametric (i.e., all root-leaf paths have the same length), the experiment can equivalently be seen as returning the closest pair of leaves. In this model, efficient algorithms are known for tree reconstruction. In reality, since the data on which these "experiments" are run is itself generated by the stochastic process of evolution, the experiments are noisy. In all reasonable models of evolution, if the branches leading to the leaves in a triple separate from each other at common ancestors that are very close to each other in the tree, the result of the experiment should be close to uniformly random. Motivated by this, we consider a model where the noise on any triple depends only on the three pairwise distances (referred to as distance-based noise). Our results are the following:

1. Suppose the length of every edge in the unknown tree is at least an Õ(1/√n) fraction of the length of a root-leaf path. Then we give an efficient algorithm to reconstruct the topology of the tree for a broad family of distance-based noise models. Further, we show that if the edges are asymptotically shorter, then topology reconstruction is information-theoretically impossible.

2. For a specific distance-based noise model, which we refer to as the homogeneous noise model, we show that the edge weights can also be approximately reconstructed under the same quantitative lower bound on the edge lengths.
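To make the triple-experiment model concrete, the following is a minimal sketch of a noise-free experiment and one possible distance-based noisy variant on a small ultrametric. The exponential weighting, the `sharpness` parameter, and all function names are illustrative assumptions rather than the noise model analyzed in the paper. The only property the sketch is meant to mirror is that the outcome depends on the three pairwise distances alone and approaches a uniform choice as those distances become equal.

```python
import math
import random
from collections import Counter


def closest_pair(dist, a, b, c):
    """Noise-free experiment: return the pair of leaves at the smallest distance.

    In an ultrametric tree this is also the pair with the deepest
    least common ancestor.
    """
    pairs = [(a, b), (a, c), (b, c)]
    return min(pairs, key=lambda p: dist[frozenset(p)])


def pair_weights(distances, sharpness):
    """Hypothetical weighting (an assumption, not the paper's model).

    Closer pairs get larger weight; equal distances give equal weights,
    which corresponds to a uniformly random outcome.
    """
    d_min = min(distances)
    return [math.exp(-sharpness * (d - d_min)) for d in distances]


def noisy_experiment(dist, a, b, c, rng, sharpness=10.0):
    """Noisy experiment whose outcome depends only on the three pairwise distances."""
    pairs = [(a, b), (a, c), (b, c)]
    weights = pair_weights([dist[frozenset(p)] for p in pairs], sharpness)
    return rng.choices(pairs, weights=weights, k=1)[0]


# Example ultrametric on four leaves with topology ((a,b),(c,d)) and
# root-leaf path length 1: distance = 2 * height of the least common ancestor.
dist = {
    frozenset(("a", "b")): 0.4,
    frozenset(("c", "d")): 1.0,
    frozenset(("a", "c")): 2.0,
    frozenset(("a", "d")): 2.0,
    frozenset(("b", "c")): 2.0,
    frozenset(("b", "d")): 2.0,
}

rng = random.Random(0)
votes = Counter(noisy_experiment(dist, "a", "b", "c", rng) for _ in range(200))
majority_pair = votes.most_common(1)[0][0]
```

Repeating the experiment and taking a majority vote, as in the last lines, is only a simple way to inspect the sketch. It is not the reconstruction algorithm from the paper.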


