Incorporating compositional heterogeneity into Lie Markov models for phylogenetic inference

07/16/2020
by Naomi E. Hannaford, et al.

Phylogenetics uses alignments of molecular sequence data to learn about evolutionary trees. Substitutions in sequences are modelled through a continuous-time Markov process, characterised by an instantaneous rate matrix, which standard models assume is time-reversible and stationary. These assumptions are biologically questionable and induce a likelihood function which is invariant to a tree's root position. This hampers inference because a tree's biological interpretation depends critically on where it is rooted. Relaxing both assumptions, we introduce a model whose likelihood can distinguish between rooted trees. The model is non-stationary, with step changes in the instantaneous rate matrix at each speciation event. Exploiting recent theoretical work, each rate matrix belongs to a non-reversible family of Lie Markov models. These models are closed under matrix multiplication, so our extension offers the conceptually appealing property that a tree and all its sub-trees could have arisen from the same family of non-stationary models. We adopt a Bayesian approach, describe an MCMC algorithm for posterior inference and provide software. The biological insight that our model can provide is illustrated through an analysis in which non-reversible but stationary, and non-stationary but reversible models cannot identify a plausible root.
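The claim that reversible, stationary models cannot locate the root can be checked numerically on a toy case. The sketch below is not the model from the paper: it assumes a hypothetical four-state "cyclic" rate matrix (rate `forward` for i to i+1, rate `backward` for i to i-1, modulo 4), chosen only because its stationary distribution is uniform for any pair of rates and it is reversible exactly when the two rates are equal. It computes the likelihood of one site on a two-leaf tree while sliding the root along the single edge, keeping the total edge length fixed. All function names are illustrative.

```python
STATES = 4  # e.g. A, C, G, T indexed 0..3 (labels are illustrative)


def mat_mul(A, B):
    """Plain matrix product of two square list-of-lists matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(Q, t, squarings=10, terms=10):
    """Transition matrix P(t) = exp(Q t), via a truncated Taylor series
    on Q t / 2**squarings followed by repeated squaring."""
    n = len(Q)
    scale = t / 2 ** squarings
    A = [[q * scale for q in row] for row in Q]
    P = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in P]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, A)]
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        P = mat_mul(P, P)
    return P


def cyclic_rate_matrix(forward, backward):
    """Toy rate matrix (an assumption of this sketch, not a Lie Markov
    model from the paper). Rows and columns sum to zero, so the uniform
    distribution is stationary; the matrix is symmetric, hence
    reversible, only when forward == backward."""
    Q = [[0.0] * STATES for _ in range(STATES)]
    for i in range(STATES):
        Q[i][(i + 1) % STATES] = forward
        Q[i][(i - 1) % STATES] = backward
        Q[i][i] = -(forward + backward)
    return Q


def pair_likelihood(Q, x, y, t1, t2):
    """Likelihood of observing states x and y at two leaves that sit at
    distances t1 and t2 from a root drawn from the uniform distribution."""
    pi = 1.0 / STATES
    P1, P2 = mat_exp(Q, t1), mat_exp(Q, t2)
    return sum(pi * P1[r][x] * P2[r][y] for r in range(STATES))


if __name__ == "__main__":
    reversible = cyclic_rate_matrix(0.5, 0.5)
    non_reversible = cyclic_rate_matrix(1.0, 0.2)
    # Slide the root along an edge of total length 1.
    for t1 in (0.1, 0.3, 0.5, 0.7, 0.9):
        t2 = 1.0 - t1
        print(t1,
              pair_likelihood(reversible, 0, 1, t1, t2),
              pair_likelihood(non_reversible, 0, 1, t1, t2))
```

In the reversible case the printed likelihood should stay constant as the root moves, whereas in the non-reversible case it changes with the root position. This is only the simplest instance of the effect; the model described in the abstract additionally drops stationarity by letting the rate matrix change at each speciation event.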
