Joint Analysis of Individual-level and Summary-level GWAS Data by Leveraging Pleiotropy

04/30/2018
by Mingwei Dai, et al.

A large number of recent genome-wide association studies (GWASs) of complex phenotypes confirm the early conjecture of polygenicity, suggesting the presence of a large number of variants with only tiny or moderate effects. However, due to the limited sample size of a single GWAS, many associated genetic variants have effects too weak to achieve genome-wide significance. These undiscovered variants further limit the prediction capability of GWAS. Restricted access to individual-level data and the increasing availability of published GWAS results motivate the development of methods that integrate both individual-level and summary-level data. How the connection between the individual-level and summary-level data is modeled determines how efficiently the abundant existing summary-level resources can be used alongside limited individual-level data, and this issue calls for further effort in the area. In this study, we propose a novel statistical approach, LEP, which provides a new way of modeling the connection between individual-level data and summary-level data. LEP integrates both types of data by LEveraging Pleiotropy to increase the statistical power of risk variant identification and the accuracy of risk prediction. The algorithm for parameter estimation is developed to handle genome-wide-scale data. Through comprehensive simulation studies, we demonstrated the advantages of LEP over existing methods. We further applied LEP to perform an integrative analysis of Crohn's disease data from WTCCC and summary statistics from GWASs of other diseases, such as type 1 diabetes, ulcerative colitis and primary biliary cirrhosis. LEP significantly increased the statistical power of identifying risk variants and improved the risk prediction accuracy from 63.39% (± 0.58%) to 68.33% (± 0.32%) using about 195,000 variants.


