Identifiability of species network topologies from genomic sequences using the logDet distance

08/03/2021
by   Elizabeth S. Allman, et al.

Inference of network-like evolutionary relationships between species from genomic data must address the interwoven signals from both gene flow and incomplete lineage sorting. The heavy computational demands of standard approaches to this problem severely limit the size of datasets that may be analyzed, in both the number of species and the number of genetic loci. Here we provide a theoretical pointer to more efficient methods, by showing that logDet distances computed from genomic-scale sequences retain sufficient information to recover network relationships in the level-1 ultrametric case. This result is obtained under the Network Multispecies Coalescent model combined with a mixture of General Time-Reversible sequence evolution models across individual gene trees, but does not depend on partitioning sequences by genes. Thus, under standard stochastic models, statistically justifiable inference of network relationships from sequences can be accomplished without consideration of individual genes or gene trees.
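
To make the central quantity concrete, the following is a minimal sketch of a pairwise logDet distance computed from two aligned DNA sequences. It assumes one commonly used normalization, d = -(1/4) [ ln|det F| - (1/2) ln(g_a g_b) ], where F is the 4x4 matrix of joint site-pattern frequencies and g_a, g_b are the products of its row and column sums. The exact definition used in the paper may differ. The function names (`joint_frequencies`, `determinant`, `logdet_distance`) are hypothetical and chosen only for illustration.

```python
import math

BASES = "ACGT"  # assumed state order; sites with any other symbol are skipped


def joint_frequencies(seq_a, seq_b):
    """Return the 4x4 matrix F with F[i][j] = fraction of sites showing
    base i in seq_a and base j in seq_b."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    index = {base: i for i, base in enumerate(BASES)}
    counts = [[0.0] * 4 for _ in range(4)]
    n_sites = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in index and y in index:
            counts[index[x]][index[y]] += 1.0
            n_sites += 1
    if n_sites == 0:
        raise ValueError("no comparable sites")
    return [[c / n_sites for c in row] for row in counts]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def logdet_distance(seq_a, seq_b):
    """logDet distance under the normalization stated in the lead-in
    (an assumption, not necessarily the paper's exact definition)."""
    f = joint_frequencies(seq_a, seq_b)
    det_f = determinant(f)
    g_a = math.prod(sum(row) for row in f)  # product of row sums
    g_b = math.prod(sum(f[i][j] for i in range(4)) for j in range(4))
    if det_f == 0.0 or g_a == 0.0 or g_b == 0.0:
        raise ValueError("logDet distance is undefined for these sequences")
    return -0.25 * (math.log(abs(det_f)) - 0.5 * math.log(g_a * g_b))
```

In this sketch the distance is computed from the concatenated alignment as a whole, which mirrors the point that no partitioning of sequences by gene is required. A distance matrix over all pairs of taxa would be obtained by calling `logdet_distance` on each pair.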


