It's All Relative: New Regression Paradigm for Microbiome Compositional Data

11/11/2020
by Gen Li, et al.

Microbiome data are complex in nature, involving high dimensionality, compositionality, zero inflation, and taxonomic hierarchy. Compositional data reside in a simplex that does not admit the standard Euclidean geometry. Most existing compositional regression methods rely on transformations that are inadequate or even inappropriate for modeling data with excessive zeros and taxonomic structure. We develop a novel relative-shift regression framework that directly uses compositions as predictors. The new framework provides a paradigm shift for compositional regression and offers a superior biological interpretation. New equi-sparsity and taxonomy-guided regularization methods and an efficient smoothing proximal gradient algorithm are developed to facilitate feature aggregation and dimension reduction in regression. As a result, the framework can automatically identify clinically relevant microbes even if they are important at different taxonomic levels. A unified finite-sample prediction error bound is developed for the proposed regularized estimators. We demonstrate the efficacy of the proposed methods in extensive simulation studies. The application to a preterm infant study reveals novel insights into the association between the gut microbiome and neurodevelopment.
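Two points in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation. The counts, coefficients, and helper names (`closure`, `linear_predictor`) are hypothetical, and the linear form on raw compositions is an assumption made only for illustration. The sketch shows that a log-based transformation is undefined when a composition contains zeros, and that when two taxa share the same coefficient, a linear predictor on the compositions is unchanged if those taxa are summed into one aggregated feature, which is the intuition behind equal coefficients enabling feature aggregation.

```python
import numpy as np

def closure(counts):
    """Normalize each row of non-negative counts to sum to 1 (a point on the simplex)."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=-1, keepdims=True)

def linear_predictor(x, beta):
    """Linear predictor that uses compositions directly as predictors (illustrative assumption)."""
    return np.asarray(x) @ np.asarray(beta)

# Hypothetical read counts for 2 samples and 4 taxa, including zeros.
counts = np.array([[10, 0, 30, 60],
                   [5, 5, 0, 90]])
x = closure(counts)

# Log-based transformations break down at zero: log(0) is -inf.
with np.errstate(divide="ignore"):
    log_x = np.log(x)
has_undefined_logs = bool(np.isinf(log_x).any())

# Hypothetical coefficients in which taxa 0 and 1 share the same value.
beta = np.array([2.0, 2.0, -1.0, 0.5])

# Sum taxa 0 and 1 into one feature and keep a single shared coefficient.
x_agg = np.column_stack([x[:, 0] + x[:, 1], x[:, 2], x[:, 3]])
beta_agg = np.array([2.0, -1.0, 0.5])

full = linear_predictor(x, beta)
agg = linear_predictor(x_agg, beta_agg)
same_prediction = bool(np.allclose(full, agg))
```

Here `has_undefined_logs` and `same_prediction` are both true for the hypothetical data: the zeros make the log transform undefined, while aggregating the equal-coefficient taxa leaves the predictions unchanged.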

research
02/20/2018

A folded model for compositional data analysis

A folded type model is developed for analyzing compositional data. The p...
research
12/21/2018

Primal path algorithm for compositional data analysis

Compositional data have two unique characteristics compared to typical m...
research
05/02/2022

Reproducing Kernels and New Approaches in Compositional Data Analysis

Compositional data, such as human gut microbiomes, consist of non-negati...
research
06/25/2019

Simultaneous Variable Selection, Clustering, and Smoothing in Function on Scalar Regression

We address the problem of multicollinearity in a function-on-scalar regr...
research
08/30/2023

Hypothesis-driven mediation analysis for compositional data: an application to gut microbiome

Biological sequencing data consist of read counts, e.g. of specified tax...