High-dimensional statistical inference for linkage disequilibrium score regression and its cross-ancestry extensions

06/27/2023
by Fei Xue, et al.

Linkage disequilibrium score regression (LDSC) has emerged as an essential tool for genetic and genomic analyses of complex traits, using high-dimensional data derived from genome-wide association studies (GWAS). LDSC computes linkage disequilibrium (LD) scores from an external reference panel and combines these LD scores with summary statistics alone from the original GWAS. In this paper, we investigate LDSC within a fixed-effect data integration framework, underscoring its ability to merge multi-source GWAS data and reference panels. In particular, we account for the genome-wide dependence among the high-dimensional GWAS summary statistics, along with the block-diagonal dependence pattern in the estimated LD scores. Our analysis identifies several key factors of both the original GWAS and the reference panel datasets that determine the performance of LDSC. We show that LDSC-based estimators achieve asymptotic normality relatively easily when applied to genome-wide genetic variants (e.g., in genetic variance and covariance estimation), whereas doing so becomes considerably more challenging when the analysis is restricted to a much smaller subset of genetic variants (e.g., in partitioned heritability analysis). Moreover, by modeling the differences in LD patterns across populations, we show that LDSC can be extended to cross-ancestry analyses using data from distinct global populations (such as European and Asian). We validate our theoretical findings through extensive numerical evaluations using real genetic data from the UK Biobank study.
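
To make the two-step procedure described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used LDSC regression model, E[chi2_j] = 1 + N a + N h2 l_j / M, where l_j is the LD score of variant j, N is the GWAS sample size, M is the number of variants, and a captures confounding. The function names `ld_scores` and `ldsc_heritability` are hypothetical, and the sketch deliberately omits parts of practical LDSC pipelines (windowed LD scores, bias correction of squared correlations, regression weights, and block-jackknife standard errors).

```python
import numpy as np


def ld_scores(ref_genotypes):
    """LD score of each variant from a reference panel.

    ref_genotypes: (n_ref, M) array, rows = individuals, columns = variants.
    Simplifying assumption: sum squared correlations over ALL variants,
    including the variant itself, with no window and no bias correction.
    """
    r = np.corrcoef(ref_genotypes, rowvar=False)  # (M, M) correlation matrix
    return (r ** 2).sum(axis=1)


def ldsc_heritability(chi2, ld, n_gwas):
    """Unweighted least-squares regression of chi-square statistics on LD scores.

    Under the assumed model the slope equals N * h2 / M, so h2 = slope * M / N.
    Returns (h2_estimate, intercept).
    """
    m = len(ld)
    design = np.column_stack([np.ones(m), ld])  # intercept column + LD scores
    intercept, slope = np.linalg.lstsq(design, chi2, rcond=None)[0]
    return slope * m / n_gwas, intercept
```

A typical call would pass a reference-panel genotype matrix to `ld_scores`, then supply the resulting scores together with the GWAS chi-square statistics and sample size to `ldsc_heritability`. Only summary statistics from the GWAS are needed in the second step, which is the property the abstract emphasizes.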

