Controlling FDR while highlighting distinct discoveries

09/06/2018
by Eugene Katsevich, et al.

Modern scientific investigations often begin by testing a very large number of hypotheses in an effort to comprehensively mine the data for possible discoveries. Multiplicity adjustment strategies are employed to ensure replicability of the results of this broad search. In many cases, discoveries are then subject to a second round of filtering, in which researchers select the rejected hypotheses that better represent distinct and interpretable findings for reporting and follow-up. For example, in genetic studies, one DNA variant is often chosen to represent a group of neighboring polymorphisms, all apparently associated with a trait of interest. Unfortunately, the guarantees of false discovery rate (FDR) control that might hold for the initial set of findings do not carry over to this second, filtered set. Indeed, we observe that some filters used in practice tend to keep a larger fraction of nulls than non-nulls, thereby inflating the FDR. To overcome this, we introduce Focused BH, a multiple testing procedure that accounts for the filtering step. It allows the researcher to rely on the data and on the results of testing to filter the rejection set, while ensuring FDR control under a range of assumptions on the filter and the p-value dependency structure. Simulations illustrate that FDR control on the filtered set of discoveries is obtained without substantial power loss and that the procedure is robust to violations of our theoretical assumptions. Notable applications of Focused BH include control of the outer node FDR when testing hypotheses on a tree.
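To make the two-step pipeline described above concrete, here is a minimal Python sketch of the setting the abstract criticizes: the standard Benjamini-Hochberg (BH) step-up procedure followed by a filter that keeps one representative per group. It is not an implementation of Focused BH, whose details are not given in this abstract. The function names, the group labels, the p-values, and the "smallest p-value per group" filter are all hypothetical choices made for illustration.

```python
from typing import Dict, Hashable, List, Sequence


def benjamini_hochberg(pvals: Sequence[float], q: float) -> List[int]:
    """Indices rejected by the standard BH step-up procedure at level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value falls under the BH line q * rank / m
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= q * rank / m:
            k = rank
    return sorted(order[:k])


def keep_group_representatives(
    rejected: Sequence[int],
    pvals: Sequence[float],
    groups: Sequence[Hashable],
) -> List[int]:
    """Hypothetical filter: keep the smallest p-value within each group."""
    best: Dict[Hashable, int] = {}
    for i in rejected:
        g = groups[i]
        if g not in best or pvals[i] < pvals[best[g]]:
            best[g] = i
    return sorted(best.values())


# Made-up p-values and group labels (e.g. neighboring variants share a group).
pvals = [0.001, 0.002, 0.003, 0.04, 0.5, 0.9]
groups = ["A", "A", "A", "B", "C", "D"]

rejected = benjamini_hochberg(pvals, q=0.1)
reported = keep_group_representatives(rejected, pvals, groups)
print(rejected, reported)
```

In this toy input, BH rejects four hypotheses and the filter reports two of them. If the single group-B hypothesis happened to be the only null among the rejections, the false discovery proportion would rise from 1/4 before filtering to 1/2 after, which is the kind of inflation the abstract describes.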
