Computing the probability of gene trees concordant with the species tree in the multispecies coalescent

01/18/2020
by Jakub Truszkowski, et al.

The multispecies coalescent process models the genealogical relationships of genes sampled from several species, enabling useful predictions about phenomena such as the discordance between the gene tree and the species phylogeny due to incomplete lineage sorting. Conversely, knowledge of large collections of gene trees can inform us about several aspects of the species phylogeny, such as its topology and ancestral population sizes. A fundamental open problem in this context is how to efficiently compute the probability of a gene tree topology, given the species phylogeny. Although a number of algorithms for this task have been proposed, they either produce approximate results, or, when they are exact, they do not scale to large data sets. In this paper, we present some progress towards exact and efficient computation of the probability of a gene tree topology. We provide a new algorithm that, given a species tree and the number of genes sampled for each species, calculates the probability that the gene tree topology will be concordant with the species tree. Moreover, we provide an algorithm that computes the probability of any specific gene tree topology concordant with the species tree. Both algorithms run in polynomial time and have been implemented in Python. Experiments show that they are able to analyse data sets where thousands of genes are sampled, in a matter of minutes to hours.
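To make the notion of a concordance probability concrete, the sketch below covers only the classical smallest case, not the algorithms presented in the paper: a rooted species tree ((A,B),C) with one gene sampled per species. In that case the gene tree matches the species tree with probability 1 - (2/3)e^(-t), where t is the length of the internal branch in coalescent units. The function names and the Monte Carlo check are illustrative assumptions, not taken from the authors' implementation.

```python
import math
import random


def concordance_probability_three_taxa(t):
    """Closed-form probability that the gene tree topology matches the
    species tree ((A,B),C), with one gene per species and an internal
    branch of length t in coalescent units (classical three-taxon result).
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


def simulate_concordance(t, n_replicates=20000, seed=0):
    """Monte Carlo estimate of the same probability (illustrative check).

    The A and B lineages coalesce at rate 1 along the internal branch.
    If they fail to coalesce before time t, three lineages enter the
    root population and each pair is equally likely to coalesce first.
    """
    rng = random.Random(seed)
    concordant = 0
    for _ in range(n_replicates):
        if rng.expovariate(1.0) < t:
            # A and B coalesced within the internal branch.
            concordant += 1
        elif rng.randrange(3) == 0:
            # Incomplete lineage sorting: one of the three equally likely
            # pairs coalesces first, and only (A,B) is concordant.
            concordant += 1
    return concordant / n_replicates


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0):
        exact = concordance_probability_three_taxa(t)
        estimate = simulate_concordance(t)
        print(f"t={t}: exact={exact:.4f}, simulated={estimate:.4f}")
```

With more species or several genes per species, no such simple closed form is available, and that general setting is the one the abstract addresses.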


research
04/22/2017

Species tree estimation using ASTRAL: how many genes are enough?

Species tree reconstruction from genomic data is increasingly performed ...
research
12/19/2019

Reconstruction of Gene Regulatory Networks using Multiple Datasets

Motivation: Laboratory gene regulatory data for a species are sporadic. ...
research
12/04/2019

A New Paradigm for Identifying Reconciliation-Scenario Altering Mutations Conferring Environmental Adaptation

An important goal in microbial computational genomics is to identify cru...
research
05/08/2022

Assigning Species Information to Corresponding Genes by a Sequence Labeling Framework

The automatic assignment of species information to the corresponding gen...
research
10/13/2022

Visualizing Multispecies Coalescent Trees: Drawing Gene Trees Inside Species Trees

We consider the problem of drawing multiple gene trees inside a single s...
research
07/13/2020

Species tree estimation under joint modeling of coalescence and duplication: sample complexity of quartet methods

We consider species tree estimation under a standard stochastic model of...
