A stable and adaptive polygenic signal detection method based on repeated sample splitting

08/06/2020
by Yanyan Zhao, et al.

Using polygenic risk scores for trait association analysis and disease prediction is paramount in genetic studies of complex traits. Valid inference relies on sample splitting or, more recently, external data to obtain a set of potentially associated genetic variants, along with their weights, for polygenic risk score construction. The use of external data has been popular, but recent work increasingly calls it into question because of the adverse effects of potential data heterogeneity between samples. Our study adheres to the original sample-splitting principle but applies it repeatedly to increase the stability of our inference. To accommodate different polygenic structures, we develop an adaptive test for generalized linear models. We provide the asymptotic null distributions of the proposed test for both a fixed and a diverging number of variants. We also show the asymptotic properties of the proposed test under local alternatives, providing insight into why the power gain attributed to variable selection and weighting can compensate for the efficiency loss due to sample splitting. We support our analytical findings with extensive simulation studies and an application.
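
The repeated sample-splitting idea can be illustrated with a minimal sketch. This is not the adaptive test proposed in the paper, whose details are not given in the abstract. It assumes a continuous trait and a simple linear model, selects variants by marginal screening on one half, tests the resulting polygenic risk score on the other half, and combines the per-split p-values with twice their median, a conservative rule for dependent p-values. All function names, parameters, and the simulated data are hypothetical.

```python
import math
import numpy as np

def split_prs_pvalue(G, y, rng, n_select=20):
    """One split: select and weight variants on half A, test the PRS on half B."""
    n = G.shape[0]
    perm = rng.permutation(n)
    a, b = perm[: n // 2], perm[n // 2:]
    # Marginal effect estimates on half A (assumes every variant varies in half A).
    Ga = G[a] - G[a].mean(axis=0)
    ya = y[a] - y[a].mean()
    beta = Ga.T @ ya / (Ga ** 2).sum(axis=0)
    # Keep the n_select variants with the largest absolute marginal effects.
    keep = np.argsort(-np.abs(beta))[:n_select]
    # Polygenic risk score on half B, using weights learned on half A only.
    prs = G[b][:, keep] @ beta[keep]
    # Two-sided test of the PRS-trait correlation with a normal approximation.
    r = np.corrcoef(prs, y[b])[0, 1]
    z = r * math.sqrt(len(b) - 2) / math.sqrt(1 - r ** 2)
    return math.erfc(abs(z) / math.sqrt(2))

def repeated_split_pvalue(G, y, n_splits=50, seed=0):
    """Repeat the split and combine p-values as min(1, 2 * median)."""
    rng = np.random.default_rng(seed)
    pvals = [split_prs_pvalue(G, y, rng) for _ in range(n_splits)]
    return min(1.0, 2.0 * float(np.median(pvals)))

# Simulated example (hypothetical data): 400 subjects, 200 variants, 10 causal.
sim = np.random.default_rng(1)
G = sim.binomial(2, 0.3, size=(400, 200)).astype(float)
y = G[:, :10] @ np.full(10, 0.3) + sim.normal(size=400)
p = repeated_split_pvalue(G, y)
```

Because selection and weighting use only half A, the test on half B is not contaminated by the selection step. Repeating the split reduces the dependence of the result on any single random partition.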


research
06/08/2021

A Unified Approach to Robust Inference for Genetic Covariance

Genome-wide association studies (GWAS) have identified thousands of gene...
research
10/22/2019

Integrated Quantile RAnk Test (iQRAT) for gene-level associations in sequencing studies

Testing gene-based associations is the fundamental approach to identify ...
research
08/11/2019

Sample Splitting as an M-Estimator with Application to Physical Activity Scoring

Sample splitting is widely used in statistical applications, including c...
research
03/02/2021

Significance tests of feature relevance for a blackbox learner

An exciting recent development is the uptake of deep learning in many sc...
research
10/25/2019

Boosting heritability: estimating the genetic component of phenotypic variation with multiple sample splitting

Heritability is a central measure in genetics quantifying how much of th...
research
01/07/2021

Addressing patient heterogeneity in disease predictive model development

This paper addresses patient heterogeneity associated with prediction pr...
