Gradients do grow on trees: a linear-time O(N)-dimensional gradient for statistical phylogenetics

05/29/2019
by Xiang Ji, et al.

Calculation of the log-likelihood stands as the computational bottleneck for many statistical phylogenetic algorithms. Even worse is its gradient evaluation, often used to target regions of high probability. O(N)-dimensional gradient calculations based on the standard pruning algorithm require O(N^2) operations, where N is the number of sampled molecular sequences. With the advent of high-throughput sequencing, recent phylogenetic studies have analyzed hundreds to thousands of sequences, with an apparent trend towards even larger data sets as a result of advancing technology. Such large-scale analyses challenge phylogenetic reconstruction by requiring inference on larger sets of process parameters to model the increasing data heterogeneity. To make this tractable, we present a linear-time algorithm for O(N)-dimensional gradient evaluation and apply it to general continuous-time Markov processes of sequence substitution on a phylogenetic tree without a need to assume either stationarity or reversibility. We apply this approach to learn the branch-specific evolutionary rates of three pathogenic viruses: West Nile virus, Dengue virus and Lassa virus. Our proposed algorithm significantly improves inference efficiency, with a 126- to 234-fold efficiency gain in maximum-likelihood optimization and a 16- to 33-fold gain in computational performance in a Bayesian framework.
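The abstract does not spell out the algorithm, so the following is only a minimal sketch of one well-known way to obtain all branch-length derivatives in time linear in the number of nodes: a post-order (pruning) pass followed by a single pre-order pass. It is not the authors' implementation. Everything specific here is an assumption made for illustration: a single alignment site, the Jukes-Cantor (JC69) substitution model with a uniform root distribution, a binary tree, and the hypothetical names `jc_matrices` and `log_likelihood_and_gradient`.

```python
import math

def jc_matrices(t):
    """JC69 transition matrix P(t) and its derivative dP/dt (assumed model)."""
    e = math.exp(-4.0 * t / 3.0)
    P = [[0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e for j in range(4)]
         for i in range(4)]
    dP = [[-e if i == j else e / 3.0 for j in range(4)] for i in range(4)]
    return P, dP

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

def log_likelihood_and_gradient(children, lengths, tips, root):
    """children: node -> list of children (empty list for tips);
    lengths: non-root node -> length of the branch above it;
    tips: tip -> observed state in 0..3.
    Returns (log-likelihood, {node: d logL / d branch length})."""
    # Pre-order listing: every parent appears before its descendants.
    order, stack = [], [root]
    while stack:
        n = stack.pop()
        order.append(n)
        stack.extend(children[n])

    # Post-order pass: post[n][s] = P(data below n | state s at n);
    # msg[n] is that vector pushed up the branch above n.
    post, msg, mats = {}, {}, {}
    for n in reversed(order):
        if not children[n]:
            post[n] = [1.0 if s == tips[n] else 0.0 for s in range(4)]
        else:
            post[n] = [1.0] * 4
            for c in children[n]:
                post[n] = [a * b for a, b in zip(post[n], msg[c])]
        if n != root:
            mats[n] = jc_matrices(lengths[n])
            msg[n] = matvec(mats[n][0], post[n])
    lik = sum(0.25 * x for x in post[root])

    # Pre-order pass: out[n][s] = P(data outside the subtree of n, state s at n).
    out, grad = {root: [0.25] * 4}, {}
    for n in order:
        for c in children[n]:
            # Everything except the subtree of c, seen from the parent end.
            top = out[n][:]
            for sib in children[n]:
                if sib != c:
                    top = [a * b for a, b in zip(top, msg[sib])]
            P, dP = mats[c]
            d = matvec(dP, post[c])
            grad[c] = sum(a * b for a, b in zip(top, d)) / lik
            out[c] = [sum(top[s] * P[s][k] for s in range(4)) for k in range(4)]
    return math.log(lik), grad

# Hypothetical three-tip example.
children = {"R": ["A", "t3"], "A": ["t1", "t2"], "t1": [], "t2": [], "t3": []}
lengths = {"A": 0.1, "t1": 0.2, "t2": 0.3, "t3": 0.4}
tips = {"t1": 0, "t2": 1, "t3": 0}
log_lik, grad = log_likelihood_and_gradient(children, lengths, tips, "R")
```

Each node is visited once in each pass and, on a binary tree, does a constant amount of work per visit, so all derivatives come out of two traversals. Differentiating each branch separately by rerunning the pruning pass would instead cost one full traversal per branch, which is the source of the quadratic cost mentioned above. The traversal itself does not rely on reversibility; JC69 is used here only to keep the matrices in closed form.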


research
03/08/2023

Many-core algorithms for high-dimensional gradients on phylogenetic trees

The rapid growth in genomic pathogen data spurs the need for efficient i...
research
10/25/2021

Scalable Bayesian divergence time estimation with ratio transformations

Divergence time estimation is crucial to provide temporal signals for da...
research
06/12/2019

Markov-modulated continuous-time Markov chains to identify site- and branch-specific evolutionary variation

Markov models of character substitution on phylogenies form the foundati...
research
06/11/2019

Relaxed random walks at scale

Relaxed random walk (RRW) models of trait evolution introduce branch-spe...
research
07/23/2021

Gain-loss-duplication models on a phylogeny: exact algorithms for computing the likelihood and its gradient

Gene gain-loss-duplication models are commonly based on continuous-time ...
research
03/23/2023

Random-effects substitution models for phylogenetics via scalable gradient approximations

Phylogenetic and discrete-trait evolutionary inference depend heavily on...
