Logistic-tree normal model for microbiome compositions

06/29/2021
by Zhuoqun Wang, et al.

We introduce a probabilistic model, the "logistic-tree normal" (LTN), for microbiome compositional data. The LTN combines two popular classes of models, the logistic-normal (LN) and the Dirichlet-tree (DT), and inherits the key benefits of both. LN models can flexibly characterize rich covariance structure among taxa, but they can be computationally prohibitive in high dimensions (i.e., when the number of taxa is large) because they are not conjugate to the multinomial sampling model. The DT avoids this issue by decomposing the multinomial sampling model into a collection of binomials, one at each split of the phylogenetic tree of the taxa, and adopting a conjugate beta model for each binomial probability; the price is a restrictive covariance structure among the taxa. The LTN model decomposes the multinomial model into binomials as the DT does, but it models the corresponding binomial probabilities jointly with a (multivariate) LN distribution instead of betas. It therefore allows rich covariance structures, as LN models do, while the decomposition of the multinomial likelihood allows conjugacy to be restored through Pólya-Gamma augmentation. Accordingly, Bayesian inference on the LTN model can readily proceed by Gibbs sampling. Moreover, the multivariate Gaussian component of the model allows common techniques for effective inference on high-dimensional data, such as those based on sparsity and low-rank assumptions on the covariance structure, to be readily incorporated. Depending on the goal of the analysis, the LTN model can be used either as a standalone model or embedded into more sophisticated models. We demonstrate its use in estimating taxa covariance and in mixed-effects modeling. Finally, we carry out a case study that uses an LTN-based mixed-effects model to analyze a longitudinal dataset from the DIABIMMUNE project.
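To make the tree decomposition concrete, here is a minimal generative sketch in Python; it is an illustration under stated assumptions, not the authors' implementation. It assumes a small hypothetical balanced tree over four taxa, and the names `SPLITS`, `leaf_probabilities`, and `sample_counts` are invented for this example, as are the mean and covariance values. Log-odds at the internal nodes are drawn jointly from a multivariate normal, mapped to binomial "go left" probabilities, and then used to split the total count down the tree. The sketch covers only the forward (sampling) direction; the Pólya-Gamma Gibbs sampler mentioned in the abstract is not shown.

```python
import numpy as np

# Hypothetical tree over 4 taxa (leaves 0..3), listed parent before child.
# Each internal node is (leaves under left child, leaves under right child).
SPLITS = [
    ((0, 1), (2, 3)),  # root
    ((0,), (1,)),      # left subtree
    ((2,), (3,)),      # right subtree
]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def leaf_probabilities(psi, splits, n_leaves):
    """Multinomial leaf probabilities implied by node-level log-odds psi."""
    theta = sigmoid(psi)  # probability of going left at each internal node
    probs = np.ones(n_leaves)
    for (left, right), t in zip(splits, theta):
        probs[list(left)] *= t
        probs[list(right)] *= 1.0 - t
    return probs


def sample_counts(total, psi, splits, n_leaves, rng):
    """Draw leaf counts by one binomial split at each internal node."""
    theta = sigmoid(psi)
    counts = np.zeros(n_leaves, dtype=int)
    # Count arriving at each internal node, keyed by the leaves beneath it.
    node_total = {frozenset(splits[0][0] + splits[0][1]): total}
    for (left, right), t in zip(splits, theta):
        n = node_total[frozenset(left + right)]
        n_left = rng.binomial(n, t)
        for child, c in ((left, n_left), (right, n - n_left)):
            if len(child) == 1:
                counts[child[0]] = c
            else:
                node_total[frozenset(child)] = c
    return counts


rng = np.random.default_rng(0)

# Illustrative (assumed) mean and covariance of the node log-odds.
mu = np.zeros(3)
Sigma = np.array([
    [1.0, 0.5, 0.2],
    [0.5, 1.0, 0.3],
    [0.2, 0.3, 1.0],
])

psi = rng.multivariate_normal(mu, Sigma)   # joint log-odds, one per split
probs = leaf_probabilities(psi, SPLITS, 4)
counts = sample_counts(1000, psi, SPLITS, 4, rng)

print("leaf probabilities:", probs)
print("sampled counts:", counts)
```

The off-diagonal entries of `Sigma` are what a beta-per-node (DT-style) prior cannot express: they let the log-odds at different splits be correlated. Setting them to zero makes the splits independent, which is closer in spirit to the DT structure, although with logistic-normal rather than beta marginals.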

research
08/26/2023

A flexible Bayesian tool for CoDa mixed models: logistic-normal distribution with Dirichlet covariance

Compositional Data Analysis (CoDa) has gained popularity in recent years...
research
09/11/2021

Microbiome subcommunity learning with logistic-tree normal latent Dirichlet allocation

Mixed-membership (MM) models such as Latent Dirichlet Allocation (LDA) h...
research
03/27/2019

Bayesian Multinomial Logistic Normal Models through Marginally Latent Matrix-T Processes

Bayesian multinomial logistic-normal (MLN) models are popular for the an...
research
01/25/2022

Bayesian Covariance Structure Modeling of Multi-Way Nested Data

A Bayesian multivariate model with a structured covariance matrix for mu...
research
10/07/2021

Estimation of Constrained Mean-Covariance of Normal Distributions

Estimation of the mean vector and covariance matrix is of central import...
research
06/30/2021

Bayesian Spanning Tree: Estimating the Backbone of the Dependence Graph

In multivariate data analysis, it is often important to estimate a graph...
research
08/09/2023

Deficiency bounds for the multivariate inverse hypergeometric distribution

The multivariate inverse hypergeometric (MIH) distribution is an extensi...