Contextual Online False Discovery Rate Control

02/07/2019
by   Shiyun Chen, et al.

Multiple hypothesis testing, in which many hypotheses are tested simultaneously, is a core problem in statistical inference that arises in almost every scientific field. In this setting, controlling the false discovery rate (FDR), the expected proportion of type I errors among the rejected hypotheses, is a central challenge for making meaningful inferences. In this paper, we consider the problem of controlling the FDR in an online manner. Concretely, we consider an ordered, possibly infinite, sequence of hypotheses, one arriving at each timestep; for each hypothesis we observe a p-value along with a set of features specific to that hypothesis. The decision of whether or not to reject the current hypothesis must be made immediately at each timestep, before the next hypothesis is observed. This multi-dimensional feature model provides a very general way of leveraging auxiliary information in the data, which helps maximize the number of discoveries. We propose a new class of powerful online testing procedures in which the rejection thresholds (significance levels) are learned sequentially by incorporating contextual information and previous results. We prove that any rule in this class controls online FDR under standard assumptions. We then focus on a subclass of these procedures, based on weighting significance levels, and derive a practical algorithm that learns a parametric weight function online to gain more discoveries. We also prove theoretically, in a stylized setting, that our proposed procedures achieve higher statistical power than a popular online testing procedure proposed by Javanmard & Montanari (2018). Finally, we demonstrate the favorable performance of our procedure by comparing it with state-of-the-art online multiple testing procedures on both synthetic data and real data from several applications.
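To make the online decision loop concrete, the sketch below illustrates the general shape of a contextually weighted online testing rule: at each timestep, a p-value and a feature vector arrive, a weight is computed from the features via a parametric function, and the hypothesis is rejected if the p-value falls below a weighted significance level. This is a minimal illustration, not the paper's algorithm: the logistic weight form, the `gamma_seq` spending sequence, and all function names here are hypothetical, and actual FDR control requires the weights and levels to satisfy the conditions established in the paper.

```python
import math

def gamma_seq(t):
    # A LORD-style discount sequence; sums to exactly 1 over t = 1, 2, ...
    # (illustrative choice, not the sequence used in the paper).
    return 1.0 / (t * (t + 1))

def contextual_online_test(stream, alpha=0.05, theta=(0.0,)):
    """Sketch of a contextually weighted online testing loop.

    `stream` yields (p_value, features) pairs in arrival order.
    `theta` parametrizes a hypothetical logistic weight function over
    the features. Each significance level is a base spending-sequence
    level scaled by the context-dependent weight, and the accept/reject
    decision is made immediately, before the next hypothesis arrives.
    """
    decisions = []
    for t, (p, x) in enumerate(stream, start=1):
        # Context-dependent weight in (0, 2); equals 1 when the score is 0.
        score = sum(th * xi for th, xi in zip(theta, x))
        w = 2.0 / (1.0 + math.exp(-score))
        # Significance level for this timestep.
        alpha_t = alpha * gamma_seq(t) * w
        decisions.append(p <= alpha_t)
    return decisions
```

For example, with `theta = (0.0,)` every weight is 1 and the rule reduces to a plain spending-sequence procedure; informative features would shift weight toward hypotheses that are more likely to be non-null.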

