ProgPermute: Progressive permutation for a dynamic representation of the robustness of microbiome discoveries

05/03/2020
by Liangliang Zhang, et al.

Identifying significant features is a critical task in microbiome studies, and it is complicated by the high dimensionality and heterogeneity of microbial data. This complexity makes separating signal from noise difficult. For instance, in differential abundance testing, multiple testing adjustments tend to be overconservative, because the probability of a type I error (false positive) increases dramatically with the number of hypotheses and the adjustments must compensate for it. We frame significance identification as a dynamic process of separating signals from a randomized background. If the original data truly differ by the grouping factor, signal and noise in this process move from fully mixed to clearly separated. We propose the progressive permutation method to carry out this process and display the converging trend. The method progressively permutes the grouping factor labels of the microbiome samples and performs multiple differential abundance tests in each permutation scenario. We then compare the signal strength of the top hits from the original data with their performance in the permutations; if the top hits are true positives, a clear decreasing trend appears. To help users understand the robustness of the discoveries and identify the best hits, we develop a user-friendly and efficient RShiny tool. Simulations and applications to real data show that the proposed method can evaluate the overall association between the microbiome and the grouping factor, rank the robustness of the discovered microbes, and list the discoveries together with their effect sizes and individual abundances.
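The idea of progressive permutation can be illustrated with a minimal sketch. This is not the authors' RShiny implementation; it is one plausible reading of the abstract under stated assumptions: two groups, simulated Gaussian "abundances", a Wilcoxon rank-sum test with a normal approximation as the differential abundance test, an unadjusted threshold of 0.05, and the number of significant features as the signal-strength summary. All function names (`rank_sum_pvalue`, `permute_k_labels`, `count_hits`, `progressive_permutation`) and parameters are hypothetical.

```python
import math
import random


def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, no tie correction)."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average rank shared by tied values
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    r1 = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u = r1 - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))


def permute_k_labels(labels, k, rng):
    """Shuffle the labels of k randomly chosen samples; group sizes are preserved."""
    idx = rng.sample(range(len(labels)), k)
    vals = [labels[i] for i in idx]
    rng.shuffle(vals)
    out = list(labels)
    for i, v in zip(idx, vals):
        out[i] = v
    return out


def count_hits(data, labels, alpha=0.05):
    """Number of features with an unadjusted p-value below alpha (assumed summary)."""
    hits = 0
    for feature in data:
        x = [v for v, g in zip(feature, labels) if g == 0]
        y = [v for v, g in zip(feature, labels) if g == 1]
        if rank_sum_pvalue(x, y) < alpha:
            hits += 1
    return hits


def progressive_permutation(data, labels, steps=5, n_rep=20, seed=0):
    """Average hit count as a growing number of labels is permuted."""
    rng = random.Random(seed)
    n = len(labels)
    curve = []
    for s in range(steps + 1):
        k = round(n * s / steps)  # number of samples whose labels are shuffled
        counts = [count_hits(data, permute_k_labels(labels, k, rng))
                  for _ in range(n_rep)]
        curve.append((k, sum(counts) / n_rep))
    return curve


# Simulated data (assumption): 40 samples in two groups of 20, 30 features,
# of which the first 10 are shifted by 2 standard deviations in group 1.
sim = random.Random(1)
labels = [0] * 20 + [1] * 20
data = [[sim.gauss(2.0 if (f < 10 and g == 1) else 0.0, 1.0) for g in labels]
        for f in range(30)]

curve = progressive_permutation(data, labels)
for k, mean_hits in curve:
    print(f"{k:2d} labels permuted: mean significant features = {mean_hits:.2f}")
```

If the grouping factor carries real signal, the mean number of significant features should fall as more labels are permuted, which is the decreasing trend the abstract describes. A fuller analysis would also apply a multiple testing adjustment and track per-feature statistics rather than a single count.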
