High-dimensional multi-trait GWAS by reverse prediction of genotypes

10/29/2021
by Muhammad Ammar Malik, et al.

Multi-trait genome-wide association studies (GWAS) use multivariate statistical methods to identify associations between genetic variants and multiple correlated traits simultaneously, and have higher statistical power than independent univariate analyses of individual traits. Reverse regression, where the genotypes of genetic variants are regressed on multiple traits simultaneously, has emerged as a promising approach to multi-trait GWAS in high-dimensional settings where the number of traits exceeds the number of samples. We extended this approach and analyzed different machine learning methods (ridge regression, random forests and support vector machines) for reverse regression in multi-trait GWAS. To evaluate the methods, we used genotypes, gene expression data and ground-truth transcriptional regulatory networks from the DREAM5 SysGen Challenge and from a cross between two yeast strains. We found that genotype prediction performance, measured by root mean squared error (RMSE), made it possible to distinguish between genomic regions with high and low transcriptional activity. Moreover, model feature coefficients correlated with the strength of association between variants and individual traits, and were predictive of true trans-eQTL target genes, with complementary findings across methods.
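
The reverse-regression idea can be illustrated with a minimal sketch on simulated data. This is not the authors' implementation: the data are synthetic, genotypes are assumed to be binary (0/1), the ridge penalty `lam` is fixed rather than tuned, and the function names `ridge_cv_rmse` and `ridge_weights` are hypothetical. The ridge solution is computed in its dual form, which is convenient when traits outnumber samples.

```python
import numpy as np

def ridge_cv_rmse(Y, g, lam=100.0, n_folds=5, seed=0):
    """Cross-validated RMSE for predicting genotype g from trait matrix Y."""
    n = Y.shape[0]
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    sq_err = np.empty(n)
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        # Center traits and genotype using training samples only.
        mu_Y, mu_g = Y[train].mean(axis=0), g[train].mean()
        Ytr, Yte = Y[train] - mu_Y, Y[test] - mu_Y
        # Dual ridge: solve an (n_train x n_train) system instead of (p x p).
        K = Ytr @ Ytr.T + lam * np.eye(len(train))
        alpha = np.linalg.solve(K, g[train] - mu_g)
        pred = Yte @ (Ytr.T @ alpha) + mu_g
        sq_err[test] = (pred - g[test]) ** 2
    return float(np.sqrt(sq_err.mean()))

def ridge_weights(Y, g, lam=100.0):
    """Per-trait ridge coefficients from a fit on all samples."""
    Yc = Y - Y.mean(axis=0)
    alpha = np.linalg.solve(Yc @ Yc.T + lam * np.eye(Y.shape[0]), g - g.mean())
    return Yc.T @ alpha

# Simulated example (all sizes and effect values are illustrative assumptions).
rng = np.random.default_rng(1)
n, p, n_targets = 150, 300, 40           # more traits than samples
g_active = rng.integers(0, 2, size=n).astype(float)  # variant with trait effects
g_silent = rng.integers(0, 2, size=n).astype(float)  # variant with no effects
effects = np.zeros(p)
effects[:n_targets] = 1.5                # first 40 traits are true targets
Y = rng.normal(size=(n, p)) + np.outer(g_active - 0.5, effects)

rmse_active = ridge_cv_rmse(Y, g_active)
rmse_silent = ridge_cv_rmse(Y, g_silent)
w = ridge_weights(Y, g_active)

print("RMSE, variant with targets:   ", round(rmse_active, 3))
print("RMSE, variant without targets:", round(rmse_silent, 3))
print("mean |coef| on targets:    ", np.abs(w[:n_targets]).mean())
print("mean |coef| on non-targets:", np.abs(w[n_targets:]).mean())
```

In this toy setting the variant that influences many traits is predicted with lower RMSE than the unrelated variant, and the coefficients on its simulated target traits are larger in magnitude on average, which mirrors the two uses of reverse regression described in the abstract.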


