Clustering microbiome data using mixtures of logistic normal multinomial models

11/12/2020
by Yuan Fang, et al.

Discrete data, such as counts of microbiome taxa resulting from next-generation sequencing, are routinely encountered in bioinformatics. Taxa count data in microbiome studies are typically high-dimensional and over-dispersed, and because they reveal only relative abundance, they are treated as compositional. Analyzing compositional data presents many challenges because the data are constrained to a simplex. In a logistic normal multinomial model, the relative abundance is mapped from the simplex to a latent variable in real Euclidean space using the additive log-ratio transformation. While the logistic normal multinomial approach offers flexibility in modeling the data, it comes with a heavy computational cost because parameter estimation typically relies on Bayesian techniques. In this paper, we develop a novel mixture of logistic normal multinomial models for clustering microbiome data. Additionally, we use an efficient framework for parameter estimation based on variational Gaussian approximations (VGA). Adopting a variational Gaussian approximation for the posterior of the latent variable substantially reduces the computational overhead. The proposed method is illustrated on simulated and real datasets.
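
To make the additive log-ratio mapping concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It maps a composition with K+1 parts to a point in K-dimensional real space, maps it back, and draws one count vector from a logistic normal multinomial model. The function names (alr, alr_inverse, sample_lnm), the use of the last part as the reference, and the diagonal latent covariance are illustrative assumptions; the abstract does not specify any of them.

```python
import math
import random


def alr(p):
    """Additive log-ratio: map a composition with K+1 parts to R^K.

    The last part is used as the reference (an assumption of this sketch).
    """
    ref = p[-1]
    return [math.log(x / ref) for x in p[:-1]]


def alr_inverse(y):
    """Inverse ALR: map a point in R^K back to the simplex (K+1 parts)."""
    m = max(max(y), 0.0)  # shift by the largest exponent for stability
    expy = [math.exp(v - m) for v in y] + [math.exp(-m)]
    total = sum(expy)
    return [e / total for e in expy]


def sample_lnm(mu, sd, n_reads, rng):
    """Draw one count vector from a logistic normal multinomial model.

    Latent y ~ N(mu, diag(sd^2)); the diagonal covariance is a simplifying
    assumption of this sketch. Counts ~ Multinomial(n_reads, alr_inverse(y)).
    """
    y = [rng.gauss(m, s) for m, s in zip(mu, sd)]
    p = alr_inverse(y)
    counts = [0] * len(p)
    for idx in rng.choices(range(len(p)), weights=p, k=n_reads):
        counts[idx] += 1
    return counts


composition = [0.2, 0.3, 0.5]      # relative abundances of three taxa
latent = alr(composition)          # two real-valued coordinates
recovered = alr_inverse(latent)    # back on the simplex
counts = sample_lnm(mu=latent, sd=[0.5, 0.5], n_reads=1000,
                    rng=random.Random(0))
```

Because the latent variable lives in unconstrained real space, a Gaussian can be placed on it directly, which is what makes a Gaussian approximation to its posterior a natural choice.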

