Hybrid Genetic Algorithm and Lasso Test Approach for Inferring Well Supported Phylogenetic Trees based on Subsets of Chloroplastic Core Genes

04/20/2015
by Bassam AlKindy, et al.

The number of completely sequenced chloroplast genomes increases rapidly every day, making it possible to build large-scale phylogenetic trees of plant species. For a subset of closely related plant species defined according to their chloroplasts, the phylogenetic tree inferred from their core genes is not necessarily well supported, owing to the possible presence of "problematic" genes (affected by, e.g., homoplasy, incomplete lineage sorting, or horizontal gene transfer) that may blur the phylogenetic signal. However, a trustworthy phylogenetic tree can still be obtained if the number of problematic genes is low; the problem is then to determine the largest subset of core genes that produces the best-supported tree. To discard problematic genes despite the overwhelming number of possible combinations, we propose a hybrid approach that embeds both genetic algorithms and statistical tests. Given a set of organisms, the result is a multi-stage pipeline for producing well-supported phylogenetic trees. The proposal has been applied to several plant families, with encouraging results.
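To make the search problem concrete, the following is a minimal sketch of a genetic algorithm over gene subsets, where each candidate is a bit mask saying which core genes are kept. It is an illustration under stated assumptions, not the authors' pipeline: the names `toy_support` and `evolve`, the population parameters, and above all the fitness function are hypothetical. A real fitness would build a tree from the selected genes and measure its support, and the statistical (Lasso) test stage is not modelled here.

```python
import random


def toy_support(mask, problematic):
    """Hypothetical stand-in for tree support (NOT the paper's fitness).

    Rewards keeping more genes and penalises each kept gene whose index
    is in `problematic`. An empty selection scores 0.0.
    """
    if not any(mask):
        return 0.0
    kept = sum(mask)
    bad = sum(1 for i, bit in enumerate(mask) if bit and i in problematic)
    return kept - 5.0 * bad


def evolve(n_genes, fitness, pop_size=30, generations=60,
           mutation_rate=0.05, seed=0):
    """Search for a high-fitness gene subset encoded as a list of 0/1 bits."""
    rng = random.Random(seed)  # local generator keeps runs reproducible
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection
        children = [ranked[0][:]]           # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
    return max(population, key=fitness)


if __name__ == "__main__":
    problematic = {2, 7}  # assumed indices of "problematic" genes
    best = evolve(12, lambda m: toy_support(m, problematic))
    print("selected genes:", [i for i, bit in enumerate(best) if bit])
```

The bit-mask encoding reflects why exhaustive search is impractical: with n core genes there are 2^n candidate subsets, so a heuristic such as a genetic algorithm explores only a small, guided sample of them.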

research
06/25/2017

Well-supported phylogenies using largest subsets of core-genes by discrete particle swarm optimization

The number of complete chloroplastic genomes increases day after day, ma...
research
01/18/2020

Computing the probability of gene trees concordant with the species tree in the multispecies coalescent

The multispecies coalescent process models the genealogical relationship...
research
04/22/2017

Species tree estimation using ASTRAL: how many genes are enough?

Species tree reconstruction from genomic data is increasingly performed ...
research
11/17/2017

Quarnet inference rules for level-1 networks

An important problem in phylogenetics is the construction of phylogeneti...
research
12/20/2018

On the variance of internode distance under the multispecies coalescent

We consider the problem of estimating species trees from unrooted gene t...
research
07/13/2020

Species tree estimation under joint modeling of coalescence and duplication: sample complexity of quartet methods

We consider species tree estimation under a standard stochastic model of...
research
06/25/2017

Finding optimal finite biological sequences over finite alphabets: the OptiFin toolbox

In this paper, we present a toolbox for a specific optimization problem ...
