Heavy-tailed distribution for combining dependent p-values with asymptotic robustness

03/24/2021 ∙ by Yusi Fang, et al. ∙ 0

The issue of combining individual p-values to aggregate multiple small effects is prevalent in many scientific investigations and is a long-standing statistical topic. Many classical methods are designed for combining independent and frequent signals in a traditional meta-analysis sense using the sum of transformed p-values with the transformation of light-tailed distributions, in which Fisher's method and Stouffer's method are the most well-known. Since the early 2000, advances in big data promoted methods to aggregate independent, sparse and weak signals, such as the renowned higher criticism and Berk-Jones tests. Recently, Liu and Xie(2020) and Wilson(2019) independently proposed Cauchy and harmonic mean combination tests to robustly combine p-values under "arbitrary" dependency structure, where a notable application is to combine p-values from a set of often correlated SNPs in genome-wide association studies. The proposed tests are the transformation of heavy-tailed distributions for improved power with the sparse signal. It calls for a natural question to investigate heavy-tailed distribution transformation, to understand the connection among existing methods, and to explore the conditions for a method to possess robustness to dependency. In this paper, we investigate the regularly varying distribution, which is a rich family of heavy-tailed distribution and includes Pareto distribution as a special case. We show that only an equivalent class of Cauchy and harmonic mean tests have sufficient robustness to dependency in a practical sense. We also show an issue caused by large negative penalty in the Cauchy method and propose a simple, yet practical modification. Finally, we present simulations and apply to a neuroticism GWAS application to verify the discovered theoretical insights and provide practical guidance.



There are no comments yet.


page 15

page 29

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.