A statistical framework for GWAS of high dimensional phenotypes using summary statistics, with application to metabolite GWAS

03/17/2023
by Weiqiong Huang, et al.

The recent explosion of genetic and high dimensional biobank and 'omic' data has provided researchers with the opportunity to investigate the shared genetic origin (pleiotropy) of hundreds to thousands of related phenotypes. However, existing methods for multi-phenotype genome-wide association studies (GWAS) do not model pleiotropy, are only applicable to a small number of phenotypes, or provide no way to perform inference. To add further complication, raw genetic and phenotype data are rarely observed, meaning analyses must be performed on GWAS summary statistics whose statistical properties in high dimensions are poorly understood. We therefore developed a novel model, theoretical framework, and set of methods to perform Bayesian inference in GWAS of high dimensional phenotypes using summary statistics. These methods explicitly model pleiotropy, enable fast computation, and facilitate the use of biologically informed priors. We demonstrate the utility of our procedure by applying it to metabolite GWAS, where we develop new nonparametric priors for genetic effects on metabolite levels that use known metabolic pathway information and foster interpretable inference at the pathway level.

research
06/27/2023

High-dimensional statistical inference for linkage disequilibrium score regression and its cross-ancestry extensions

Linkage disequilibrium score regression (LDSC) has emerged as an essenti...
research
03/19/2022

Measuring the severity of multi-collinearity in high dimensions

Multi-collinearity is a wide-spread phenomenon in modern statistical app...
research
06/24/2020

ANOVA exemplars for understanding data drift

The distributions underlying complex datasets, such as images, text or t...
research
01/24/2019

Causal Mediation Analysis Leveraging Multiple Types of Summary Statistics Data

Summary statistics of genome-wide association studies (GWAS) teach causa...
research
09/13/2023

Tackling the dimensions in imaging genetics with CLUB-PLS

A major challenge in imaging genetics and similar fields is to link high...
research
01/31/2020

Convolutional Neural Networks as Summary Statistics for Approximate Bayesian Computation

Approximate Bayesian Computation is widely used in systems biology for i...
research
02/14/2013

Locally epistatic genomic relationship matrices for genomic association, prediction and selection

As the amount and complexity of genetic information increases it is nece...