POPDx: An Automated Framework for Patient Phenotyping across 392,246 Individuals in the UK Biobank Study

08/23/2022
by Lu Yang, et al.

Objective: In the UK Biobank, standardized phenotype codes are available for patients who have been hospitalized but are missing for many patients who have been treated exclusively in an outpatient setting. We describe a method for phenotype recognition that imputes phenotype codes for all UK Biobank participants.

Materials and Methods: POPDx (Population-based Objective Phenotyping by Deep Extrapolation) is a bilinear machine learning framework that simultaneously estimates the probabilities of 1,538 phenotype codes. We extracted phenotypic and health-related information on 392,246 individuals from the UK Biobank for POPDx development and evaluation. The patients' 12,803 ICD-10 diagnosis codes were converted to 1,538 Phecodes, which served as gold-standard labels. The POPDx framework was evaluated and compared with other available methods for automated multi-phenotype recognition.

Results: POPDx can predict phenotypes that are rare or even unobserved in training. We demonstrate substantial improvement in automated multi-phenotype recognition across 22 disease categories, and we show how POPDx can be applied to identify key epidemiological features associated with each phenotype.

Conclusions: POPDx helps provide well-defined cohorts for downstream studies. It is a general-purpose method that can be applied to other biobanks with diverse but incomplete data.
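The abstract does not spell out the model architecture, so the following is only a minimal sketch of what a generic bilinear multi-label scorer can look like, not the authors' implementation. It assumes each patient is a feature vector, each phenotype code has an embedding vector, and a learned matrix couples the two. The function name `bilinear_probabilities`, the dimensions, and the random data are all hypothetical; only the count of 1,538 phenotype codes comes from the abstract.

```python
import numpy as np


def bilinear_probabilities(patient_features, phenotype_embeddings, weight):
    """Score every (patient, phenotype) pair with a bilinear form.

    logits[i, j] = patient_features[i] @ weight @ phenotype_embeddings[j]
    Each logit is passed through a sigmoid independently, so the output is
    one probability per phenotype (multi-label), not a softmax over codes.
    """
    logits = patient_features @ weight @ phenotype_embeddings.T
    return 1.0 / (1.0 + np.exp(-logits))


# Toy, randomly generated inputs (assumptions for illustration only).
rng = np.random.default_rng(0)
n_patients, n_features = 4, 8        # hypothetical sizes
n_phenotypes, embed_dim = 1538, 16   # 1,538 codes as in the abstract; embed_dim is hypothetical

X = rng.normal(size=(n_patients, n_features))            # patient feature vectors
E = rng.normal(size=(n_phenotypes, embed_dim))           # phenotype embeddings
W = rng.normal(scale=0.1, size=(n_features, embed_dim))  # stands in for a learned coupling matrix

probs = bilinear_probabilities(X, E, W)
print(probs.shape)
```

In a scorer of this general shape, a phenotype enters only through its embedding, so a code with few or no training labels can still receive a score as long as an embedding is available for it. Whether POPDx obtains its rare-phenotype predictions this way is not stated in the abstract.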

research
11/28/2018

Disease phenotyping using deep learning: A diabetes case study

Characterization of a patient clinical phenotype is central to biomedica...
research
04/30/2011

An Automated Size Recognition Technique for Acetabular Implant in Total Hip Replacement

Preoperative templating in Total Hip Replacement (THR) is a method to es...
research
10/27/2020

Optimisation des parcours patients pour lutter contre l'errance de diagnostic des patients atteints de maladies rares

A patient suffering from a rare disease in France has to wait an average...
research
09/06/2016

A Bootstrap Machine Learning Approach to Identify Rare Disease Patients from Electronic Health Records

Rare diseases are very difficult to identify among large number of other...
research
06/24/2020

Diagnosis Prevalence vs. Efficacy in Machine-learning Based Diagnostic Decision Support

Many recent studies use machine learning to predict a small number of IC...
research
08/14/2023

Improving ICD-based semantic similarity by accounting for varying degrees of comorbidity

Finding similar patients is a common objective in precision medicine, fa...
research
09/19/2021

Co-occurrence of medical conditions: Exposing patterns through probabilistic topic modeling of SNOMED codes

Patients associated with multiple co-occurring health conditions often f...
