Exploratory data analysis for large-scale multiple testing problems and its application in gene expression studies

12/12/2019
by Paramita Chakraborty, et al.

In large-scale multiple testing problems, a two-class empirical Bayes approach can be used to control the false discovery rate (Fdr) for the entire array of hypotheses under study. We modify that approach by adding a sample-splitting step: one part of the data is used for model fitting, and the other part is used to detect significant cases with a screening technique based on the empirical Bayes mode of Fdr control. Cases that are detected with high frequency across repeated random sample splits are considered true discoveries, and a critical detection frequency is set to control the overall false discovery rate. The proposed method helps to balance out unwanted sources of variation and, because the resampling acts as a form of cross-validation, addresses potential statistical overfitting of the core empirical model. Further, concurrent detection frequencies (how often pairs of cases are detected in the same split) are used to provide visual tools for exploring the inter-relationships between significant cases. The methodology is illustrated using a microarray data set, an RNA-sequencing data set, and several simulation studies. A power analysis is presented to assess the efficiency of the proposed method.
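The split-and-screen loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation; it only shows the general shape of the idea under several assumptions that do not come from the source: a one-sample design with unit-variance replicates, a theoretical N(0,1) null density, a Gaussian kernel estimate of the mixture density fitted on the fitting half, a null proportion fixed at 1 (conservative), a local fdr cutoff of 0.2, and a critical detection frequency of 0.5. All function names (`simulate`, `z_scores`, `kde`, `detection_frequencies`) are hypothetical.

```python
import math
import random
from statistics import NormalDist, pstdev

STD_NORMAL = NormalDist()  # assumed theoretical null density f0 (not from the source)


def simulate(n_cases=200, n_signal=20, n_rep=10, effect=2.0, seed=1):
    """Toy data: the first n_signal cases have mean `effect`, the rest have mean 0."""
    rng = random.Random(seed)
    return [[rng.gauss(effect if i < n_signal else 0.0, 1.0) for _ in range(n_rep)]
            for i in range(n_cases)]


def z_scores(data, cols):
    """z statistic per case from the chosen replicate columns (unit variance assumed)."""
    root_n = math.sqrt(len(cols))
    return [root_n * sum(row[j] for j in cols) / len(cols) for row in data]


def kde(points):
    """Gaussian kernel density estimate with Silverman's rule-of-thumb bandwidth."""
    h = 1.06 * pstdev(points) * len(points) ** -0.2
    kernels = [NormalDist(p, h) for p in points]
    return lambda z: sum(k.pdf(z) for k in kernels) / len(kernels)


def detection_frequencies(data, n_splits=40, fdr_cut=0.2, seed=0):
    """Fraction of random splits in which each case has estimated local fdr <= fdr_cut."""
    rng = random.Random(seed)
    n_rep = len(data[0])  # assumed even, so both halves give z-scores on the same scale
    counts = [0] * len(data)
    for _ in range(n_splits):
        cols = list(range(n_rep))
        rng.shuffle(cols)
        fit_cols, screen_cols = cols[: n_rep // 2], cols[n_rep // 2:]
        f_hat = kde(z_scores(data, fit_cols))  # mixture density from the fitting half
        for i, z in enumerate(z_scores(data, screen_cols)):  # screening half
            f = f_hat(z)
            # local fdr = pi0 * f0(z) / f(z), with pi0 = 1 as a conservative assumption
            local_fdr = 1.0 if f == 0.0 else min(1.0, STD_NORMAL.pdf(z) / f)
            if local_fdr <= fdr_cut:
                counts[i] += 1
    return [c / n_splits for c in counts]


if __name__ == "__main__":
    freq = detection_frequencies(simulate())
    critical = 0.5  # hypothetical critical detection frequency, chosen for illustration
    discoveries = [i for i, f in enumerate(freq) if f >= critical]
    print(len(discoveries), "cases flagged")
```

In this sketch the critical frequency is a fixed illustrative value; the abstract states that the actual method sets it so as to control the overall false discovery rate, which the sketch does not attempt.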


