Smaller p-values in genomics studies using distilled historical information

04/16/2020
by Jordan G. Bryan, et al.

Medical research institutions have generated massive amounts of biological data by genetically profiling hundreds of cancer cell lines. In parallel, academic biology labs have conducted genetic screens on small numbers of cancer cell lines under custom experimental conditions. To share information between these two approaches to scientific discovery, this article proposes a "frequentist assisted by Bayes" (FAB) procedure for hypothesis testing, which allows historical information from massive genomics datasets to increase the power of hypothesis tests in specialized studies. The exchange of information takes place through a novel probability model for multimodal genomics data, which distills historical information pertaining to cancer cell lines and genes across a wide variety of experimental contexts. If the historical information is highly relevant to a given study, the resulting FAB tests can be more powerful than the corresponding classical tests. If its relevance is low, the FAB tests yield as many discoveries as the classical tests. Simulations and practical investigations demonstrate that the FAB testing procedure can increase the number of effects discovered in genomics studies while still maintaining strict control of type I error and false discovery rates.
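
To make the idea of a FAB test concrete, the following is a minimal sketch for a single normally distributed test statistic. It is not the multimodal genomics model from the paper, which this abstract does not specify. It assumes an observation y ~ N(theta, sigma^2), a null hypothesis theta = 0, and historical information summarized as a normal prior on theta whose parameters are estimated from data independent of y. The function names and the parameters `prior_mean` and `prior_var` are hypothetical and chosen for illustration.

```python
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def classical_p_value(z: float) -> float:
    """Two-sided p-value for H0: theta = 0, where z = y / sigma."""
    return 2.0 * (1.0 - _PHI(abs(z)))


def fab_p_value(z: float, prior_mean: float, prior_var: float,
                sigma: float = 1.0) -> float:
    """FAB p-value for H0: theta = 0 in the normal-means setting.

    Assumption: the historical information is summarized as
    theta ~ N(prior_mean, prior_var), estimated independently of z.
    """
    # The shift moves the rejection region toward the side the
    # historical information favors; it vanishes as prior_var grows.
    shift = 2.0 * prior_mean * sigma / prior_var
    return 1.0 - abs(_PHI(z + shift) - _PHI(-z))


if __name__ == "__main__":
    z = 2.0
    print("classical:", classical_p_value(z))
    print("FAB, informative prior:", fab_p_value(z, prior_mean=2.0, prior_var=4.0))
    print("FAB, vague prior:", fab_p_value(z, prior_mean=2.0, prior_var=1e12))
```

In this sketch the p-value remains uniformly distributed under the null for any fixed shift, so type I error is controlled regardless of how accurate the prior is. A prior that points toward the true effect yields a smaller p-value than the classical test, and a very diffuse prior recovers the classical two-sided p-value.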
