Targeting Underrepresented Populations in Precision Medicine: A Federated Transfer Learning Approach

08/27/2021
by Sai Li, et al.

The limited representation of minorities and disadvantaged populations in large-scale clinical and genomics research has become a barrier to translating precision medicine research into practice. Because of heterogeneity across populations, risk prediction models often underperform in these underrepresented populations and may therefore further exacerbate known health disparities. In this paper, we propose a two-way data integration strategy that integrates heterogeneous data from diverse populations and from multiple healthcare institutions via a federated transfer learning approach. The proposed method can handle the challenging setting where sample sizes from different populations are highly unbalanced. With only a small number of communications across participating sites, the proposed method can achieve performance comparable to that of a pooled analysis in which individual-level data are directly combined. We show that the proposed method improves estimation and prediction accuracy in underrepresented populations and reduces the gap in model performance across populations. Our theoretical analysis reveals how estimation accuracy is influenced by communication budgets, privacy restrictions, and heterogeneity across populations. We demonstrate the feasibility and validity of our methods through numerical experiments and a real application to a multi-center study, in which we construct polygenic risk prediction models for Type II diabetes in the AA population.


research
05/10/2022

Improving genetic risk prediction across diverse population by disentangling ancestry representations

Risk prediction models using genetic data have seen increasing traction ...
research
10/12/2022

Bregman Divergence-Based Data Integration with Application to Polygenic Risk Score (PRS) Heterogeneity Adjustment

Polygenic risk scores (PRS) have recently received much attention for ge...
research
02/22/2023

Incorporating External Risk Information with the Cox Model under Population Heterogeneity: Applications to Trans-Ancestry Polygenic Hazard Scores

Polygenic hazard score (PHS) models designed for European ancestry (EUR)...
research
12/10/2020

Assessment of the impact of EHR heterogeneity for clinical research through a case study of silent brain infarction

Background: The rapid adoption of electronic health records (EHRs) holds...
research
09/18/2023

Multi-dimensional domain generalization with low-rank structures

In conventional statistical and machine learning methods, it is typicall...
research
12/23/2022

Sufficient Dimension Reduction for Populations with Structured Heterogeneity

A key challenge in building effective regression models for large and di...
research
11/19/2015

The Kernel Two-Sample Test for Brain Networks

In clinical and neuroscientific studies, systematic differences between ...
