A Bayes Factor Approach with Informative Prior for Rare Genetic Variant Analysis from Next Generation Sequencing Data

02/20/2020
by Jingxiong Xu, et al.

The discovery of rare genetic variants through next-generation sequencing remains a challenging problem in human genetics. We propose a novel region-based statistical approach, based on a Bayes Factor (BF), to assess the evidence of association between a set of rare variants (RVs) located in the same genomic region and a disease outcome in a case-control design. Marginal likelihoods are computed under the null and alternative hypotheses, assuming a binomial distribution for the RV count in the region and either a beta prior or a mixture of a Dirac and a beta prior for the probability of an RV. We derive the theoretical null distribution of the BF under our prior setting and show that Bayesian control of the False Discovery Rate (BFDR) can be obtained for genome-wide inference. Informative priors are introduced using prior evidence of association from a Kolmogorov-Smirnov test statistic. We use our simulation program, sim1000G, to generate RV data similar to those of the 1000 Genomes Project. Our simulation studies show that the new BF statistic outperforms standard methods (SKAT, SKAT-O, Burden test) in case-control studies with moderate sample sizes and performs equivalently to them in large-sample scenarios. Our real data application to a lung cancer case-control study finds enrichment for RVs in known and novel cancer genes. It also suggests that using the BF with an informative prior improves overall gene discovery compared with the BF with a non-informative prior.
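
To make the idea of a Bayes factor built from binomial likelihoods and beta priors concrete, here is a minimal sketch. It is not the authors' implementation. It assumes one simple formulation: under the null hypothesis, cases and controls share one RV probability with a Beta(a, b) prior, and under the alternative each group has its own probability with independent Beta(a, b) priors. The function names, the argument names, and the default a = b = 1 are hypothetical choices for illustration, and the Dirac-beta mixture prior and the informative prior from the abstract are not modeled.

```python
from math import lgamma


def log_beta(a, b):
    """Log of the beta function B(a, b), computed via log-gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal(k, n, a, b):
    """Log marginal likelihood of k successes in n binomial trials
    with a Beta(a, b) prior on the success probability.
    The binomial coefficient is omitted because it cancels in the ratio below."""
    return log_beta(a + k, b + n - k) - log_beta(a, b)


def log_bayes_factor(k_case, n_case, k_ctrl, n_ctrl, a=1.0, b=1.0):
    """Log BF of H1 (separate RV probabilities) against H0 (shared probability).

    Assumed setup (illustrative only): k_* is the RV count in a region and
    n_* the number of binomial trials for that group.
    """
    # H1: cases and controls each have their own probability.
    log_m1 = log_marginal(k_case, n_case, a, b) + log_marginal(k_ctrl, n_ctrl, a, b)
    # H0: one probability shared by both groups, so the counts pool.
    log_m0 = log_marginal(k_case + k_ctrl, n_case + n_ctrl, a, b)
    return log_m1 - log_m0


if __name__ == "__main__":
    # Similar RV rates in both groups: the log BF should be negative.
    print(log_bayes_factor(5, 100, 5, 100))
    # Many more RVs among cases: the log BF should be positive.
    print(log_bayes_factor(30, 100, 5, 100))
```

Working on the log scale avoids overflow in the gamma functions for large counts. A positive log BF favors association in this sketch, and a negative one favors the null.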


research
09/29/2019

A Simple Yet Efficient Parametric Method of Local False Discovery Rate Estimation Designed for Genome-Wide Association Data Analysis

In genome-wide association studies (GWAS), hundreds of thousands of gene...
research
08/07/2023

Nonparametric Bayes multiresolution testing for high-dimensional rare events

In a variety of application areas, there is interest in assessing eviden...
research
10/08/2021

Saddlepoint approximations in binary genome-wide association studies

We investigate saddlepoint approximations applied to the score test stat...
research
01/18/2018

Variance Components Genetic Association Test for Zero-inflated Count Outcomes

Commonly in biomedical research, studies collect data in which an outcom...
research
06/05/2019

DOT: Gene-set analysis by combining decorrelated association statistics

Historically, the majority of statistical association methods have been ...
research
12/03/2021

Bayesian nonparametric strategies for power maximization in rare variants association studies

Rare variants are hypothesized to be largely responsible for heritabilit...
