Inferring phenotypic trait evolution on large trees with many incomplete measurements

06/07/2019
by Gabriel Hassler, et al.

Comparative biologists are often interested in inferring covariation between multiple biological traits sampled across numerous related taxa. To properly study these relationships, we must control for the shared evolutionary history of the taxa to avoid spurious inference. Existing control techniques almost universally scale poorly as the number of taxa increases. An additional challenge arises because obtaining a full suite of measurements becomes increasingly difficult as the number of taxa grows. Incomplete measurements typically necessitate data imputation or integration, which further worsens scalability. We propose an inference technique that integrates out missing measurements analytically and scales linearly with the number of taxa by using a post-order traversal algorithm under a multivariate Brownian diffusion (MBD) model to characterize trait evolution. We further exploit this technique to extend the MBD model to account for sampling error or non-heritable residual variance. We apply these methods to examine mammalian life history traits, prokaryotic genomic and phenotypic traits, and HIV infection traits. We find computational efficiency increases that exceed two orders of magnitude over current best practices. While we focus on the utility of this algorithm in phylogenetic comparative methods, our approach generalizes to solve long-standing challenges in computing the likelihood for matrix-normal and multivariate normal distributions with missing data at scale.
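
To make the idea of a post-order traversal that integrates out missing measurements more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the tree encoding (nested dicts with `t`, `y`, `children`), the function names, the Gaussian root prior `N(mu0, V0)`, and the information-form messages `(P, h, g)` are all assumptions made for illustration. Each node sends its parent an unnormalized Gaussian log-likelihood `-0.5 x'Px + h'x + g`; missing tip entries (encoded as `nan`) simply contribute zero precision, so no imputation is needed, and each node is visited once.

```python
import numpy as np

def tip_message(y, t, sigma):
    """Message from a tip to its parent over a branch of length t.
    Entries of y equal to nan are treated as missing and integrated out."""
    p = len(y)
    obs = ~np.isnan(y)
    P, h, g = np.zeros((p, p)), np.zeros(p), 0.0
    if obs.any():
        K = np.linalg.inv(t * sigma[np.ix_(obs, obs)])  # precision of observed block
        yo = y[obs]
        P[np.ix_(obs, obs)] = K
        h[obs] = K @ yo
        g = (-0.5 * yo @ K @ yo + 0.5 * np.linalg.slogdet(K)[1]
             - 0.5 * obs.sum() * np.log(2 * np.pi))
    return P, h, g

def propagate(P, h, g, V):
    """Integrate a node's value out across a Gaussian step with covariance V."""
    Q = np.linalg.inv(V)
    A = np.linalg.inv(P + Q)
    g_up = g + 0.5 * h @ A @ h - 0.5 * np.linalg.slogdet(np.eye(len(h)) + V @ P)[1]
    return Q - Q @ A @ Q, Q @ A @ h, g_up

def combine(children, sigma):
    """Multiply the children's messages (sum in information form)."""
    msgs = [message(c, sigma) for c in children]
    return tuple(sum(m[k] for m in msgs) for k in range(3))

def message(node, sigma):
    """Post-order recursion: children are processed before their parent."""
    if "children" not in node:
        return tip_message(np.asarray(node["y"], dtype=float), node["t"], sigma)
    P, h, g = combine(node["children"], sigma)
    return propagate(P, h, g, node["t"] * sigma)

def log_likelihood(root_children, sigma, mu0, V0):
    """Log-likelihood of all observed entries under a N(mu0, V0) root prior."""
    P, h, g = propagate(*combine(root_children, sigma), V0)
    return -0.5 * mu0 @ P @ mu0 + h @ mu0 + g

# Hypothetical example: three taxa, two traits, two missing measurements.
sigma = np.array([[1.0, 0.3], [0.3, 0.5]])
mu0, V0 = np.array([0.2, -0.1]), 2.0 * np.eye(2)
tree = [
    {"t": 0.5, "children": [{"t": 0.3, "y": [1.0, np.nan]},
                            {"t": 0.4, "y": [0.5, 2.0]}]},
    {"t": 0.9, "y": [np.nan, -1.0]},
]
ll = log_likelihood(tree, sigma, mu0, V0)
```

Under these assumptions the work per node is a few small matrix operations in the number of traits, so the total cost grows linearly with the number of taxa. The result can be checked against a dense multivariate normal density over the observed entries, which is only feasible for small trees.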

research
12/19/2019

Large-scale inference of correlation among mixed-type biological traits with Phylogenetic multivariate probit models

Inferring concerted changes among biological traits along an evolutionar...
research
03/23/2020

Efficient Bayesian Inference of General Gaussian Models on Large Phylogenetic Trees

Phylogenetic comparative methods correct for shared evolutionary history...
research
01/18/2022

Hamiltonian zigzag accelerates large-scale inference for conditional dependencies between complex biological traits

Inferring dependencies between complex biological traits while accountin...
research
06/24/2022

Sparse precision matrix estimation in phenotypic trait evolution models

Phylogenetic trait evolution models allow for the estimation of evolutio...
research
10/20/2020

A Comparative Study of Imputation Methods for Multivariate Ordinal Data

Missing data remains a very common problem in large datasets, including ...
research
02/18/2023

CRP-Tree: A phylogenetic association test for binary traits

An important problem in evolutionary genomics is to investigate whether ...
research
07/02/2021

Principled, practical, flexible, fast: a new approach to phylogenetic factor analysis

Biological phenotypes are products of complex evolutionary processes in ...
