Improving genetic risk prediction across diverse population by disentangling ancestry representations

05/10/2022
by Prashnna K Gyawali, et al.
Risk prediction models using genetic data have seen increasing traction in genomics. However, most polygenic risk models were developed using data from participants of similar (mostly European) ancestry. This can bias the risk predictors, resulting in poor generalization when they are applied to minority populations and admixed individuals such as African Americans. To address this bias, which arises largely because the prediction models are confounded by the underlying population structure, we propose a novel deep-learning framework that leverages data from diverse populations and disentangles ancestry from the phenotype-relevant information in its representation. The ancestry-disentangled representation can be used to build risk predictors that perform better across minority populations. We applied the proposed method to the analysis of Alzheimer's disease genetics. Compared with standard linear and nonlinear risk prediction methods, the proposed method substantially improves risk prediction in minority populations, particularly for admixed individuals.
