Bayesian structural learning of microbiota systems from count metagenomic data

03/18/2022
by Veronica Vinciotti, et al.

Metagenomics, combined with high-resolution sequencing techniques, has enabled researchers to study the genomes of entire microbial communities. Unraveling interactions between these communities is vital to understanding how microbes influence human health and disease. However, learning these interactions from microbiome data is challenging because of the high dimensionality, discreteness, broad dispersion levels, compositionality and excess of zero counts that characterize these data. In this paper, we develop a copula graphical model for structure learning in these settings. In particular, we advocate the use of discrete Weibull regression for linking the marginal distributions to external covariates, which are often available in genomic studies but rarely used for network inference, coupled with a Gaussian copula to model the joint distribution of the counts. An efficient Bayesian procedure for structural learning is implemented in the R package BDgraph; it returns inference on both the marginals and the dependency structure, providing simultaneous differential analysis and graph uncertainty estimates. A simulation study and a real data analysis of microbiome data show the usefulness of the proposed approach for inferring networks from high-dimensional count data in general, and its relevance to microbiota data analyses in particular.
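To make the model's ingredients concrete, the following is a minimal Python sketch of the generative side only: two count variables with discrete Weibull marginals, coupled through a Gaussian copula. It is not the BDgraph implementation and does not perform any Bayesian structure learning. Several details are assumptions made for illustration: the type I discrete Weibull parameterisation with CDF F(y) = 1 - q^((y+1)^beta), a logit link between q and a single covariate x, and all function and parameter names (dw_cdf, dw_quantile, sample_pair, theta, rho).

```python
import math
import random
from statistics import NormalDist


def dw_cdf(y, q, beta):
    """CDF of the (assumed) type I discrete Weibull on y = 0, 1, 2, ..."""
    if y < 0:
        return 0.0
    return 1.0 - q ** ((math.floor(y) + 1) ** beta)


def dw_pmf(y, q, beta):
    """Probability mass at the non-negative integer y."""
    return q ** (y ** beta) - q ** ((y + 1) ** beta)


def dw_quantile(u, q, beta):
    """Smallest count y with F(y) >= u, for 0 <= u < 1."""
    y = math.ceil((math.log(1.0 - u) / math.log(q)) ** (1.0 / beta) - 1.0)
    return max(y, 0)


def logistic(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def sample_pair(x, theta1, theta2, beta1, beta2, rho, rng):
    """Draw one pair of counts given a scalar covariate x.

    theta1 and theta2 are (intercept, slope) pairs on the logit scale of q;
    this link is an assumption of the sketch. rho is the correlation of the
    latent Gaussian copula.
    """
    # Correlated standard normals (latent Gaussian copula variables).
    e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    z1 = e1
    z2 = rho * e1 + math.sqrt(1.0 - rho ** 2) * e2

    # Map to uniforms; clamp away from 1 so log(1 - u) stays finite.
    std_normal = NormalDist()
    u1 = min(std_normal.cdf(z1), 1.0 - 1e-12)
    u2 = min(std_normal.cdf(z2), 1.0 - 1e-12)

    # Covariate-dependent marginals, then inverse-CDF transform to counts.
    q1 = logistic(theta1[0] + theta1[1] * x)
    q2 = logistic(theta2[0] + theta2[1] * x)
    return dw_quantile(u1, q1, beta1), dw_quantile(u2, q2, beta2)


if __name__ == "__main__":
    rng = random.Random(1)
    draws = [sample_pair(0.5, (1.0, 0.8), (0.5, -0.4), 1.1, 0.9, 0.6, rng)
             for _ in range(5)]
    print(draws)
```

In this sketch the dependence between the two counts is carried entirely by rho, while the covariate acts only on the marginals through q. In a graphical model with more than two variables, the latent correlation would be replaced by a correlation matrix whose inverse encodes the network; estimating that sparsity pattern is the structure learning problem the abstract refers to.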

