Optimal False Discovery Rate Control for Large Scale Multiple Testing with Auxiliary Information

03/29/2021
by HongYuan Cao, et al.

Large-scale multiple testing is a fundamental problem in high-dimensional statistical inference. Auxiliary information that reflects structural relationships among the hypotheses is increasingly available, and exploiting it can boost statistical power. To this end, we propose a framework based on a two-group mixture model in which the prior probability of being null varies across hypotheses, with a shape-constrained relationship imposed between the auxiliary information and these prior probabilities. An optimal rejection rule is designed to maximize the expected number of true positives while controlling the average false discovery rate. Focusing on the ordered structure, we develop a robust EM algorithm that simultaneously estimates the prior probabilities of being null and the distribution of p-values under the alternative hypothesis. We show, both empirically and theoretically, that the proposed method has better power than state-of-the-art competitors while controlling the false discovery rate. Extensive simulations demonstrate the advantage of the proposed method, and datasets from genome-wide association studies illustrate the new methodology.
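To make the two-group idea concrete, here is a minimal sketch, not the authors' implementation. It assumes p-values are uniform under the null, that the alternative density `f1` and the per-hypothesis null probabilities `pi0` are already known (the abstract says these are estimated by an EM algorithm, which is not reproduced here), and that rejections are chosen by the standard step-up rule on local false discovery rates from the two-group literature, which may differ in detail from the paper's optimal rule. The function names, the Beta(a, 1) alternative, and the linearly increasing `pi0` are all illustrative assumptions.

```python
import numpy as np


def local_fdr(p, pi0, f1):
    """Posterior probability that each hypothesis is null.

    p   : p-values, assumed uniform on [0, 1] under the null (null density = 1)
    pi0 : prior probability of being null, one value per hypothesis
    f1  : callable returning the alternative density at p (assumed known)
    """
    p = np.asarray(p, dtype=float)
    pi0 = np.asarray(pi0, dtype=float)
    return pi0 / (pi0 + (1.0 - pi0) * f1(p))


def step_up_reject(lfdr, alpha):
    """Reject the k hypotheses with the smallest local FDR, where k is the
    largest count whose average local FDR does not exceed alpha."""
    lfdr = np.asarray(lfdr, dtype=float)
    order = np.argsort(lfdr)
    running_mean = np.cumsum(lfdr[order]) / np.arange(1, lfdr.size + 1)
    passing = np.nonzero(running_mean <= alpha)[0]
    reject = np.zeros(lfdr.size, dtype=bool)
    if passing.size > 0:
        reject[order[: passing[-1] + 1]] = True
    return reject


# Simulated example (all settings below are hypothetical).
rng = np.random.default_rng(0)
m = 2000
a = 0.2  # shape of the assumed Beta(a, 1) alternative

# Ordered structure: earlier hypotheses are less likely to be null.
pi0 = np.linspace(0.5, 0.95, m)
is_null = rng.random(m) < pi0

# U ** (1 / a) has density a * p ** (a - 1), i.e. Beta(a, 1).
p = np.where(is_null, rng.random(m), rng.random(m) ** (1.0 / a))


def f1(q):
    # Clip to avoid dividing by zero at q == 0.
    return a * np.clip(q, 1e-12, 1.0) ** (a - 1.0)


lfdr = local_fdr(p, pi0, f1)
reject = step_up_reject(lfdr, alpha=0.1)

n_rej = int(reject.sum())
n_false = int((reject & is_null).sum())
print("rejections:", n_rej)
print("false discoveries among them:", n_false)
```

Because `pi0` varies across hypotheses, two hypotheses with the same p-value can receive different local FDR values, which is how the auxiliary ordering affects the rejection set in this sketch.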

