Distance Assisted Recursive Testing

03/20/2021
by Xuechan Li, et al.

In many applications, a large number of features are collected with the goal of identifying a few important ones. Sometimes these features lie in a metric space with a known distance matrix, which partially reflects their co-importance pattern. Proper use of the distance matrix can boost the power to identify important features. Hence, we develop a new multiple testing framework named Distance Assisted Recursive Testing (DART). DART has two stages. In stage 1, we transform the distance matrix into an aggregation tree, where each node represents a set of features. In stage 2, based on the aggregation tree, we set up dynamic node hypotheses and perform multiple testing on the tree. All rejections are mapped back to the features. Under mild assumptions, the false discovery proportion of DART converges to the desired level with probability converging to one. We show through theory and simulations that DART outperforms existing methods under various models. We apply DART to a clinical trial from an allogeneic stem cell transplantation study to identify the gut microbiota whose abundance is affected by post-transplant care.
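The two-stage idea can be illustrated with a small sketch. This is not the authors' procedure: the abstract does not specify how the tree is built, how node statistics are formed, or which thresholds are used. The sketch assumes a greedy diameter-based aggregation, Stouffer's method for combining the p-values of the features still unrejected in a node, and the Benjamini-Hochberg step-up rule on every layer. The names `aggregate`, `stouffer`, `dart_sketch`, and the `diameters` parameter are hypothetical, and no false discovery guarantee is claimed for this simplified version.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for Stouffer's combination


def benjamini_hochberg(pvals, alpha):
    """Return the set of indices rejected by the BH step-up rule."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank
    return set(order[:k])


def aggregate(prev_nodes, dist, max_diameter):
    """Stage 1 (assumed rule): greedily merge nodes of the previous layer
    while every cross pair of features stays within max_diameter."""
    nodes = []
    for child in prev_nodes:
        for node in nodes:
            if all(dist[i][j] <= max_diameter for i in node for j in child):
                node.extend(child)
                break
        else:
            nodes.append(list(child))
    return nodes


def stouffer(pvals):
    """Combine one-sided p-values into a single p-value (assumed choice)."""
    eps = 1e-12  # keep p strictly inside (0, 1) for inv_cdf
    z = [_STD.inv_cdf(1 - min(max(p, eps), 1 - eps)) for p in pvals]
    return 1 - _STD.cdf(sum(z) / math.sqrt(len(z)))


def dart_sketch(pvals, dist, diameters, alpha=0.05):
    """Stage 2 (simplified): test leaves first, then test each upper-layer
    node on the features not yet rejected, mapping rejections back."""
    rejected = benjamini_hochberg(pvals, alpha)   # layer of single features
    nodes = [[i] for i in range(len(pvals))]
    for g in diameters:                           # one upper layer per diameter
        nodes = aggregate(nodes, dist, g)
        # Dynamic node hypotheses: only the still-unrejected features count.
        live = [[i for i in node if i not in rejected] for node in nodes]
        live = [node for node in live if len(node) >= 2]
        node_p = [stouffer([pvals[i] for i in node]) for node in live]
        for k in benjamini_hochberg(node_p, alpha):
            rejected.update(live[k])              # map node rejection to features
    return sorted(rejected)


# Toy data: two well-separated groups of features on a line.
positions = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
dist = [[abs(a - b) for b in positions] for a in positions]
pvals = [0.001, 0.03, 0.04, 0.5, 0.6, 0.7]
print(dart_sketch(pvals, dist, diameters=[0.3]))
```

In this toy example, feature-level testing alone rejects only the first feature; the weaker signals at the two neighboring features are picked up once their p-values are pooled within the node they share, which is the kind of power gain the abstract attributes to using the distance matrix.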

research
06/18/2019

Multiple Testing Embedded in an Aggregation Tree to Identify where Two Distributions Differ

A key goal of flow cytometry data analysis is to identify the subpopulat...
research
01/31/2022

Communication-Efficient Distributed Multiple Testing for Large-Scale Inference

The Benjamini-Hochberg (BH) procedure is a celebrated method for multipl...
research
04/03/2020

Graphical approaches for the control of generalised error rates

When simultaneously testing multiple hypotheses, the usual approach in t...
research
02/27/2020

False Discovery Rate Control Under General Dependence By Symmetrized Data Aggregation

We develop a new class of distribution–free multiple testing rules for f...
research
01/10/2019

On the Distance Between the Rumor Source and Its Optimal Estimate on a Regular Tree

This paper addresses the rumor source identification problem, where the ...
research
09/06/2018

Controlling FDR while highlighting distinct discoveries

Often modern scientific investigations start by testing a very large num...
research
04/07/2021

Hollow-tree Super: a directional and scalable approach for feature importance in boosted tree models

Current limitations in boosted tree modelling prevent the effective scal...