Statistical summaries of unlabelled evolutionary trees and ranked hierarchical clustering trees

06/04/2021
by Samyak Rajanala, et al.

Rooted and ranked binary trees are mathematical objects used to model hierarchical data and evolutionary relationships, with applications in many fields including evolutionary biology and genetic epidemiology. Bayesian phylogenetic inference usually explores the posterior distribution of trees via Markov chain Monte Carlo methods; however, assessing uncertainty and summarizing distributions or samples of such trees remains challenging. While labelled phylogenetic trees have been studied extensively, comparatively little literature exists on unlabelled trees, which are increasingly useful, for example when one seeks to summarize samples of trees obtained with different methods, or from different samples and environments, and wishes to assess the stability and generalizability of these summaries. In our paper, we exploit recently proposed distance metrics on unlabelled ranked binary trees and unlabelled ranked genealogies (equipped with branch lengths) to define the Fréchet mean and variance as summaries of these tree distributions. We provide an efficient combinatorial optimization algorithm for computing the Fréchet mean from a sample of, or a distribution on, unlabelled ranked tree shapes and unlabelled ranked genealogies. We show the applicability of our summary statistics for studying popular tree distributions and for comparing SARS-CoV-2 evolutionary trees across different locations during the COVID-19 epidemic in 2020.
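To make the idea of a Fréchet mean and variance concrete, here is a minimal sketch of the general concept: the mean is the candidate object that minimizes the sum of squared distances to the sample, and the variance is that minimum divided by the sample size. This is a brute-force search over a finite candidate set, not the combinatorial optimization algorithm described in the abstract. The function name `frechet_mean_and_variance`, the squared-distance convention, and the toy integer data are assumptions made for illustration; for trees, `dist` would be one of the tree metrics the abstract refers to.

```python
from typing import Callable, Iterable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def frechet_mean_and_variance(
    candidates: Iterable[T],
    sample: Sequence[T],
    dist: Callable[[T, T], float],
) -> Tuple[T, float]:
    """Brute-force Fréchet mean over a finite candidate set (illustrative only).

    Returns the candidate minimizing the sum of squared distances to the
    sample, together with that minimum divided by the sample size.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")

    best = None
    best_cost = float("inf")
    for c in candidates:
        # Sum of squared distances from this candidate to every sample point.
        cost = sum(dist(c, x) ** 2 for x in sample)
        if cost < best_cost:
            best, best_cost = c, cost

    if best is None:
        raise ValueError("candidates must not be empty")
    return best, best_cost / n


# Toy usage with integers standing in for trees (hypothetical data).
mean, variance = frechet_mean_and_variance(
    candidates=range(8),
    sample=[1, 2, 6],
    dist=lambda a, b: abs(a - b),
)
print(mean, variance)
```

Exhaustive search is only feasible for small candidate sets; the number of ranked tree shapes grows quickly with the number of leaves, which is why a dedicated optimization algorithm is needed in practice.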


