LATTE: Label-efficient Incident Phenotyping from Longitudinal Electronic Health Records

05/19/2023
by Jun Wen, et al.

Electronic health record (EHR) data are increasingly used to support real-world evidence (RWE) studies. Yet their ability to generate reliable RWE is limited by the lack of readily available, precise information on the timing of clinical events, such as the onset time of heart failure. We propose a LAbel-efficienT incidenT phEnotyping (LATTE) algorithm to accurately annotate the timing of clinical events from longitudinal EHR data. Using semantic embedding vectors pre-trained on large-scale EHR data as prior knowledge, LATTE selects predictive EHR features in a concept re-weighting module by mining their relationship to the target event, and compresses their information into longitudinal visit embeddings through a visit attention learning network. LATTE then employs a recurrent neural network to capture the sequential dependency between the target event and the visit embeddings before and after it. To improve label efficiency, LATTE constructs highly informative longitudinal silver-standard labels from large-scale unlabeled patients and uses them for unsupervised pre-training and semi-supervised joint training. Finally, LATTE enhances cross-site portability via contrastive representation learning. LATTE is evaluated on three analyses: the onset of type-2 diabetes, the onset of heart failure, and the onset and relapses of multiple sclerosis. We use several evaluation metrics from the literature, including ABC_gain, the proportional reduction in the area between the observed event indicator and the predicted cumulative incidence, relative to a reference prediction based on incident prevalence. LATTE consistently achieves substantial improvement over benchmark methods such as SAMGEP and RETAIN in all settings.
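
The ABC_gain metric is described only in words above, so a small worked sketch may help make it concrete. The code below is an illustrative reading of that description, not the authors' implementation. It assumes unit-width time steps, an observed indicator that is 0 before onset and 1 from onset onward, and a reference prediction that is constant at the pooled prevalence of the indicator. The function names `event_indicator`, `area_between`, and `abc_gain` are hypothetical.

```python
from typing import List, Optional, Sequence


def event_indicator(onset: Optional[int], n_steps: int) -> List[int]:
    """Observed indicator: 0 before the onset index, 1 from the onset on.

    `onset=None` means the event never occurs in the observation window.
    """
    if onset is None:
        return [0] * n_steps
    return [1 if t >= onset else 0 for t in range(n_steps)]


def area_between(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Area between two curves, assuming unit-width time steps."""
    if len(observed) != len(predicted):
        raise ValueError("curves must have the same length")
    return sum(abs(o - p) for o, p in zip(observed, predicted))


def abc_gain(
    onsets: Sequence[Optional[int]],
    predictions: Sequence[Sequence[float]],
) -> float:
    """Proportional reduction in area versus a prevalence-only reference.

    Assumption: the reference predicts the pooled prevalence of the
    observed indicator at every patient time step.
    """
    observed = [event_indicator(o, len(p)) for o, p in zip(onsets, predictions)]
    n_points = sum(len(y) for y in observed)
    prevalence = sum(sum(y) for y in observed) / n_points

    abc_model = sum(area_between(y, p) for y, p in zip(observed, predictions))
    abc_ref = sum(area_between(y, [prevalence] * len(y)) for y in observed)
    if abc_ref == 0:
        raise ValueError("reference area is zero; gain is undefined")
    return (abc_ref - abc_model) / abc_ref


# Two patients, four visits each: one with onset at visit index 2, one without.
onsets = [2, None]
perfect = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]
gain_perfect = abc_gain(onsets, perfect)
```

Under these assumptions, a model that reproduces the observed indicator exactly scores 1, a model that only predicts the prevalence scores 0, and a model worse than the reference scores below 0.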
