A zero-inflated Bayesian nonparametric approach for identifying differentially abundant taxa in multigroup microbiome data with covariates

06/21/2022
by Archie Sachdeva, et al.

Scientific studies conducted during the last two decades have established the central role of the microbiome in disease and health. Differential abundance analysis aims to identify microbial taxa associated with two or more sample groups defined by attributes such as disease subtype, geography, or environmental condition. The results, in turn, help clinical practitioners and researchers diagnose disease and develop new treatments more effectively. However, detecting differential abundance is uniquely challenging due to the high dimensionality, collinearity, sparsity, and compositionality of microbiome data. Further, there is a critical need for unified statistical approaches that can directly compare more than two groups and appropriately adjust for covariates. We develop a zero-inflated Bayesian nonparametric (ZIBNP) methodology that meets the multipronged challenges posed by microbiome data and identifies differentially abundant taxa in two or more groups, while also accounting for sample-specific covariates. The proposed hierarchical model flexibly adapts to unique data characteristics, casts the typically high proportion of zeros in a censoring framework, and mitigates high dimensionality and collinearity issues by utilizing the dimension-reducing property of the semiparametric Chinese restaurant process. The approach relates the microbiome sampling depths to inferential precision and conforms to the compositional nature of microbiome data. In simulation studies and in analyses of the CAnine Microbiome during Parasitism (CAMP) data on infected and uninfected dogs and of the Global Gut microbiome data on human subjects belonging to three geographical regions, we compare ZIBNP with established statistical methods for differential abundance analysis in the presence of covariates.
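The dimension-reducing property attributed to the Chinese restaurant process can be illustrated with a minimal sketch. This is a textbook Chinese restaurant process draw, not the ZIBNP model itself: the function name `crp_assignments`, the concentration parameter `alpha`, and the item count are assumptions chosen for illustration. Each item joins an existing cluster with probability proportional to that cluster's size, or opens a new cluster with probability proportional to `alpha`, so many items typically collapse into far fewer clusters.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    Hypothetical helper for illustration only; alpha > 0 is the
    concentration parameter (larger alpha tends to give more clusters).
    """
    rng = random.Random(seed)
    labels = []  # cluster label assigned to each item, in arrival order
    sizes = []   # current number of items in each cluster
    for _ in range(n_items):
        # Existing cluster k has weight sizes[k]; a new cluster has weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # open a new cluster
        else:
            sizes[k] += 1     # join an existing cluster
        labels.append(k)
    return labels, sizes


labels, sizes = crp_assignments(n_items=500, alpha=2.0)
print(len(sizes), "clusters for", len(labels), "items")
```

In this sketch the number of occupied clusters grows roughly logarithmically with the number of items, which is the sense in which clustering reduces the effective dimension; how the paper couples this with covariates and zero inflation is not shown here.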

research
02/23/2019

Bayesian Modeling of Microbiome Data for Differential Abundance Analysis

The advances of next-generation sequencing technology have accelerated s...
research
04/12/2021

A smoothed and probabilistic PARAFAC model with covariates

Analysis and clustering of multivariate time-series data attract growing...
research
11/30/2022

A Pseudo-Value Regression Approach for Differential Network Analysis of Co-Expression Data

The differential network (DN) analysis identifies changes in measures of...
research
04/18/2019

Testing for differential abundance in compositional counts data, with application to microbiome studies

In order to identify which taxa differ in the microbiome community acros...
research
11/03/2017

Bayesian Nonparametric Mixed Effects Models in Microbiome Data Analysis

Detecting associations between microbial composition and sample characte...
research
10/25/2021

RZiMM-scRNA: A regularized zero-inflated mixture model framework for single-cell RNA-seq data

Applications of single-cell RNA sequencing in various biomedical researc...
research
07/29/2016

The Phylogenetic LASSO and the Microbiome

Scientific investigations that incorporate next generation sequencing in...
