Accurate and Efficient Estimation of Small P-values with the Cross-Entropy Method: Applications in Genomic Data Analysis

03/09/2018
by Yang Shi, et al.

Large-scale genomic studies often require accurate estimates of small p-values, both to adjust for multiple hypothesis tests and to rank genomic features by statistical significance. For complicated test statistics whose cumulative distribution functions are analytically intractable, existing methods usually perform poorly on small p-values, either because they lack accuracy or because they are computationally restrictive. We propose a general approach for accurately and efficiently calculating small p-values for a broad range of complicated test statistics, based on the principle of the cross-entropy method and Markov chain Monte Carlo sampling techniques. We evaluate the performance of the proposed algorithm through simulations and demonstrate its application to three real examples in genomic studies. The results show that our approach can accurately evaluate small to extremely small p-values (e.g. 10^-6 to 10^-100). The proposed algorithm can help improve existing test procedures and support the development of new test procedures in genomic studies.
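To make the cross-entropy idea concrete, here is a minimal sketch on a toy problem, not the authors' algorithm. It estimates the upper-tail p-value P(X >= q) for a standard normal statistic, where the exact answer is known, by adaptively shifting the mean of a normal proposal and then applying importance sampling. The paper pairs the cross-entropy principle with Markov chain Monte Carlo sampling for intractable statistics; this sketch swaps in plain importance sampling for simplicity. The function name `ce_tail_pvalue` and the parameters `n_samples`, `rho`, `max_iter`, and `seed` are assumptions made for illustration.

```python
import math
import random


def ce_tail_pvalue(q, n_samples=10000, rho=0.1, max_iter=50, seed=0):
    """Estimate P(X >= q) for X ~ N(0, 1) with a mean-shifted normal proposal.

    Illustrative sketch only: the proposal family N(mu, 1) and all parameter
    names are assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    mu = 0.0  # start from the null distribution itself
    for _ in range(max_iter):
        xs = sorted(rng.gauss(mu, 1.0) for _ in range(n_samples))
        # Intermediate level: the (1 - rho) sample quantile, capped at q.
        level = min(xs[int((1.0 - rho) * n_samples)], q)
        elite = [x for x in xs if x >= level]
        # Likelihood ratio of N(0, 1) to N(mu, 1) evaluated at x.
        weights = [math.exp(-mu * x + 0.5 * mu * mu) for x in elite]
        # Cross-entropy update: weighted mean of the elite samples.
        mu = sum(w * x for w, x in zip(weights, elite)) / sum(weights)
        if level >= q:
            break
    # Final importance-sampling estimate under the tuned proposal.
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(mu, 1.0)
        if x >= q:
            total += math.exp(-mu * x + 0.5 * mu * mu)
    return total / n_samples


if __name__ == "__main__":
    q = 5.0
    exact = 0.5 * math.erfc(q / math.sqrt(2.0))  # closed form for the toy case
    print("cross-entropy estimate:", ce_tail_pvalue(q))
    print("exact tail probability:", exact)
```

Naive Monte Carlo would need on the order of millions of draws to see even one sample beyond q = 5, whereas the tuned proposal places most of its draws in the tail. The closed-form comparison is available only because the toy statistic is Gaussian; for the intractable statistics the abstract has in mind, no such check exists.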
