Species tree estimation using ASTRAL: how many genes are enough?

04/22/2017
by Shubhanshu Shekhar, et al.

Species tree reconstruction from genomic data is increasingly performed using methods that account for sources of gene tree discordance such as incomplete lineage sorting. One popular method for reconstructing species trees from unrooted gene tree topologies is ASTRAL. In this paper, we derive theoretical sample complexity results for the number of genes required by ASTRAL to guarantee reconstruction of the correct species tree with high probability. We also validate those theoretical bounds in a simulation study. Our results indicate that ASTRAL requires O(f^-2 log n) gene trees to reconstruct the species tree correctly with high probability, where n is the number of species and f is the length of the shortest branch in the species tree. Our simulations, which are the first to test ASTRAL explicitly under the anomaly zone, show trends consistent with the theoretical bounds and also provide some practical insights into the conditions under which ASTRAL works well.
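
To give a feel for how the shortest branch length f and the number of species n enter the bound, the following is a minimal Python sketch and not the paper's analysis. It assumes the standard multispecies coalescent result that, for an unrooted quartet whose internal branch has length f in coalescent units, a gene tree matches the species tree topology with probability 1 - (2/3)e^(-f). The function names and the constant c are hypothetical and chosen only for illustration; the constant in the asymptotic bound is not stated in the abstract.

```python
import math


def quartet_probs(f):
    """Unrooted quartet gene-tree topology probabilities under the
    multispecies coalescent, for an internal branch of length f
    (coalescent units). Returns (matching, alternative, alternative)."""
    alt = math.exp(-f) / 3.0
    return 1.0 - 2.0 * alt, alt, alt


def genes_scaling(n, f, c=1.0):
    """Illustrative c * log(n) / f^2 scaling of the number of gene trees.
    c is a hypothetical placeholder constant, not a value from the paper."""
    return math.ceil(c * math.log(n) / f ** 2)


if __name__ == "__main__":
    for f in (0.5, 0.1, 0.05):
        match, alt, _ = quartet_probs(f)
        print(f"f={f}: match={match:.4f}, gap={match - alt:.4f}, "
              f"scaling(n=100)={genes_scaling(100, f)}")
```

As f shrinks, the gap between the matching topology and each alternative, 1 - e^(-f), shrinks roughly like f, which is why the required number of gene trees grows like f^-2, while the dependence on the number of species is only logarithmic.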


