On using empirical null distributions in Benjamini-Hochberg procedure

12/06/2019
by Etienne Roquain, et al.

When performing multiple testing, adjusting the distribution of the null hypotheses is ubiquitous in applications. However, the effect of such an operation remains largely unknown, especially in terms of the false discovery proportion (FDP) and true discovery proportion (TDP) of the resulting procedure. We explore this issue in the most classical case, where the null distributions are Gaussian with unknown rescaling parameters (mean and variance) and where the Benjamini-Hochberg (BH) procedure is applied after a data-rescaling step. Our main result shows the following sparsity boundary: an asymptotically optimal rescaling (in some specific sense) exists if and only if the sparsity parameter k (number of false nulls) is of order less than n/log(n), where n is the total number of tests. Our proof relies on new non-asymptotic lower bounds on FDP/TDP, which are of independent interest and share similarities with those developed in minimax robust statistical theory. Further sparsity boundaries are derived for general location models where the shape of the null distribution is not necessarily Gaussian.
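
To make the "rescale, then apply BH" pipeline described above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not say which estimators are used for the null mean and variance, so the median and the normalized median absolute deviation (MAD) are assumed here, and two-sided Gaussian p-values are likewise an assumption. The function names `rescale`, `two_sided_pvalues` and `benjamini_hochberg` are hypothetical.

```python
import math
import statistics


def rescale(x):
    """Standardize observations with robust location/scale estimates.

    Assumption: median for the null mean, 1.4826 * MAD for the null
    standard deviation (the constant makes MAD consistent for a Gaussian).
    """
    med = statistics.median(x)
    mad = statistics.median([abs(v - med) for v in x])
    sigma = 1.4826 * mad
    return [(v - med) / sigma for v in x]


def two_sided_pvalues(z):
    """Two-sided p-values under a standard Gaussian null: P(|N(0,1)| >= |z|)."""
    return [math.erfc(abs(v) / math.sqrt(2.0)) for v in z]


def benjamini_hochberg(pvalues, alpha=0.1):
    """Return the sorted indices rejected by the BH step-up procedure."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    k_hat = 0
    # Step-up rule: largest rank k such that p_(k) <= alpha * k / n.
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / n:
            k_hat = rank
    return sorted(order[:k_hat])


if __name__ == "__main__":
    # Toy data: 95 "null-like" values around 5, plus 5 strong signals at 25.
    x = [4.0 + 2.0 * i / 94 for i in range(95)] + [25.0] * 5
    p = two_sided_pvalues(rescale(x))
    print(benjamini_hochberg(p, alpha=0.1))
```

Because the location and scale are estimated from the same data that contain the false nulls, the estimates are contaminated by the signals; with this choice of estimators, that contamination is what ties the quality of the rescaling to the sparsity k.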

research
06/25/2021

Semi-supervised multiple testing

An important limitation of standard multiple testing procedures is that ...
research
09/21/2018

Estimating minimum effect with outlier selection

We introduce one-sided versions of Huber's contamination model, in which...
research
04/23/2018

A direct approach to false discovery rates by decoy permutations

The current approaches to false discovery rates (FDRs) in multiple hypot...
research
03/06/2021

Log-Chisquared P-values under Rare and Weak Departures

Consider a multiple hypothesis testing setting in which only a small pro...
research
05/02/2022

Multiple hypothesis screening using mixtures of non-local distributions

The analysis of large-scale datasets, especially in biomedical contexts,...
research
11/29/2022

On Large-Scale Multiple Testing Over Networks: An Asymptotic Approach

This work concerns developing communication- and computation-efficient m...
