Bayesian biclustering for microbial metagenomic sequencing data via multinomial matrix factorization

05/17/2020
by Fangting Zhou, et al.

High-throughput sequencing technology provides unprecedented opportunities to quantitatively explore the human gut microbiome and its relation to diseases. Microbiome data are compositional, sparse, noisy, and heterogeneous, properties that pose serious challenges for statistical modeling. We propose an identifiable Bayesian multinomial matrix factorization model to infer overlapping clusters on both microbes and hosts. The proposed method represents the observed over-dispersed, zero-inflated count matrix as Dirichlet-multinomial mixtures on which latent cluster structures are built hierarchically. Under the Bayesian framework, the number of clusters is determined automatically, and available information from a taxonomic rank tree of microbes is naturally incorporated, which greatly improves the interpretability of our findings. We demonstrate the utility of the proposed approach by comparing it with four alternative methods in simulations. An application to a human gut microbiome dataset involving patients with inflammatory bowel disease reveals interesting clusters containing the bacterial families Bacteroidaceae, Bifidobacteriaceae, Enterobacteriaceae, Fusobacteriaceae, Lachnospiraceae, Ruminococcaceae, Pasteurellaceae, and Porphyromonadaceae, which are known to be related to inflammatory bowel disease and its subtypes according to the biological literature. These bacterial families are potential targets for microbiome-based treatment of inflammatory bowel disease.
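
To make the over-dispersion and sparsity mentioned in the abstract concrete, the following minimal sketch compares plain multinomial counts with Dirichlet-multinomial counts. It is not the authors' model: it has no cluster structure, no explicit zero-inflation component, and no taxonomic tree. The function name `sample_dirichlet_multinomial` and the values of `n_taxa`, `depth`, `n_samples`, and `alpha` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sample_dirichlet_multinomial(rng, alpha, depth, n_samples):
    """Draw count vectors: p ~ Dirichlet(alpha), then x ~ Multinomial(depth, p)."""
    proportions = rng.dirichlet(alpha, size=n_samples)
    return np.array([rng.multinomial(depth, p) for p in proportions])

rng = np.random.default_rng(0)

# Hypothetical settings chosen only for illustration.
n_taxa, depth, n_samples = 20, 1000, 500
alpha = np.full(n_taxa, 0.2)  # small concentration -> uneven compositions

# Dirichlet-multinomial: each host gets its own composition vector.
dm_counts = sample_dirichlet_multinomial(rng, alpha, depth, n_samples)

# Plain multinomial with the same mean composition, for comparison.
mn_counts = rng.multinomial(depth, alpha / alpha.sum(), size=n_samples)

# Fraction of zero entries and average per-taxon variance in each matrix.
dm_zero_fraction = (dm_counts == 0).mean()
mn_zero_fraction = (mn_counts == 0).mean()
dm_var = dm_counts.var(axis=0).mean()
mn_var = mn_counts.var(axis=0).mean()

print("zero fraction (DM vs multinomial):", dm_zero_fraction, mn_zero_fraction)
print("mean per-taxon variance (DM vs multinomial):", dm_var, mn_var)
```

Both matrices share the same expected composition and sequencing depth, so any extra zeros and extra variance in `dm_counts` come only from the host-to-host variation in composition introduced by the Dirichlet layer.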


