Testing for differential abundance in compositional counts data, with application to microbiome studies

04/18/2019
by   Barak Brill, et al.

To identify which taxa differ in the microbiome community across groups, the relative frequencies of the taxa are measured for each unit in the group by sequencing PCR amplicons. Statistical inference in this setting is challenging because the number of taxa is high compared to the number of sampled units, some taxa have low prevalence, and the taxa are strongly correlated. Moreover, the total number of sequenced reads per sample is limited by the sequencing procedure. The data are therefore compositional: a change in one taxon's abundance in the community induces a change in the sequenced counts of all taxa. The data are also sparse: zero counts arise either from biological variance or from limited sequencing depth, the latter being a technical zero. For low-abundance taxa, the chance of a technical zero is non-negligible and varies between sample groups. Compositional counts data pose a problem for standard normalization techniques, since technical zeros cannot be normalized in a way that ensures equality of taxon distributions across sample groups. This problem is aggravated in settings where the condition studied severely affects the microbial load of the host. We introduce a novel approach for differential abundance testing of compositional data with a non-negligible number of zeros. Our approach uses a set of reference taxa that are not differentially abundant, and we suggest a data-adaptive procedure for identifying such a set from the data. We demonstrate that existing methods for differential abundance testing, including methods designed to address compositionality, do not control the rate of false positive discoveries when the change in microbial load is large. We also demonstrate that methods using microbial load measurements do not provide valid inference, since the measured microbial load cannot adjust for technical zeros.
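The compositional effect and the group-dependent chance of technical zeros described in the abstract can be illustrated with a small numerical sketch. This is not the authors' procedure: the abundances, the sequencing depth, the choice of reference taxa, and the multinomial sampling model are all hypothetical assumptions made only for illustration.

```python
def expected_counts(abundances, depth):
    """Expected reads per taxon when `depth` reads are drawn in proportion
    to absolute abundance (assumed multinomial sampling)."""
    total = sum(abundances)
    return [depth * a / total for a in abundances]


def prob_technical_zero(abundances, index, depth):
    """Probability that taxon `index` receives zero reads although it is
    present, under the same assumed multinomial sampling."""
    p = abundances[index] / sum(abundances)
    return (1.0 - p) ** depth


# Hypothetical absolute abundances for five taxa in two groups.
# Only taxon 0 truly differs (10x higher in group B); taxon 4 is rare.
group_a = [100, 100, 100, 100, 1]
group_b = [1000, 100, 100, 100, 1]
depth = 1000  # hypothetical fixed number of reads per sample

counts_a = expected_counts(group_a, depth)
counts_b = expected_counts(group_b, depth)

# Taxon 1 is unchanged in absolute terms, yet its expected count drops,
# because the reads are shared among all taxa.
print(counts_a[1], counts_b[1])

# Relative to taxa 2 and 3 (assumed here to be non-differentially abundant
# reference taxa), taxon 1 looks the same in both groups.
reference = [2, 3]
ratio_a = counts_a[1] / sum(counts_a[i] for i in reference)
ratio_b = counts_b[1] / sum(counts_b[i] for i in reference)
print(ratio_a, ratio_b)

# The rare taxon 4 has the same absolute abundance in both groups, but its
# chance of a technical zero is much higher in group B.
zero_a = prob_technical_zero(group_a, 4, depth)
zero_b = prob_technical_zero(group_b, 4, depth)
print(zero_a, zero_b)
```

In this toy setting, comparing raw or proportion-normalized counts would flag the unchanged taxa as different, whereas counts taken relative to the assumed reference taxa would not. The sketch only conveys that intuition and makes no claim about how the paper's test is constructed.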

