Set-Based Tests for Genetic Association Using the Generalized Berk-Jones Statistic

10/06/2017
by Ryan Sun, et al.

Studying the effects of groups of Single Nucleotide Polymorphisms (SNPs), as in a gene, genetic pathway, or network, can provide novel insight into complex diseases beyond what can be gleaned from studying SNPs individually. Common challenges in set-based genetic association testing include weak effect sizes, correlation between SNPs in a SNP-set, and scarcity of signals, with single-SNP effects often ranging from extremely sparse to moderately sparse in number. Motivated by these challenges, we propose the Generalized Berk-Jones (GBJ) test for the association between a SNP-set and an outcome. The GBJ extends the Berk-Jones (BJ) statistic by accounting for correlation among SNPs, and it provides advantages over the Generalized Higher Criticism (GHC) test when signals in a SNP-set are moderately sparse. We also provide an analytic p-value calculation procedure for SNP-sets of any finite size. Using this p-value calculation, we illustrate that the rejection region for GBJ can be described as a compromise between those of BJ and GHC. We also develop an omnibus statistic and show that this omnibus test is robust to the degree of signal sparsity. An additional advantage of our method is the ability to conduct inference using individual SNP summary statistics from a genome-wide association study. We evaluate the finite-sample performance of the GBJ through simulation studies and an application to gene-level association analysis of breast cancer risk.
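For orientation, the classical Berk-Jones statistic that the GBJ extends can be sketched as below. This is the textbook one-sided form for independent p-values, not the correlation-adjusted GBJ proposed in the paper, and it does not reproduce the paper's analytic p-value calculation. The function names `kl_bernoulli` and `berk_jones` are hypothetical, and it is an assumption here that per-SNP summary statistics have already been converted to p-values.

```python
import math


def kl_bernoulli(x, y):
    """Kullback-Leibler divergence between Bernoulli(x) and Bernoulli(y).

    Uses the convention 0 * log(0) = 0. Assumes 0 < y < 1.
    """
    total = 0.0
    if x > 0:
        total += x * math.log(x / y)
    if x < 1:
        total += (1 - x) * math.log((1 - x) / (1 - y))
    return total


def berk_jones(p_values):
    """One-sided Berk-Jones statistic, assuming independent p-values.

    Computes max over i <= d/2 of d * K(i/d, p_(i)), counting only the
    order statistics with p_(i) < i/d (more small p-values than expected
    under the null).
    """
    d = len(p_values)
    ordered = sorted(p_values)
    best = 0.0
    for i, p in enumerate(ordered[: d // 2], start=1):
        frac = i / d
        if p <= 0:
            # A p-value of exactly zero gives unbounded evidence.
            return math.inf
        if p < frac:
            best = max(best, d * kl_bernoulli(frac, p))
    return best


if __name__ == "__main__":
    # Hypothetical p-values for a four-SNP set with one strong signal.
    print(berk_jones([0.001, 0.5, 0.7, 0.9]))
```

Because this sketch treats the SNPs as independent, it would misstate the evidence when SNPs in a set are correlated, which is the situation the GBJ is designed to handle.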


