
Modeling Heterogeneity and Missing Data of Multiple Longitudinal Outcomes in Electronic Health Records

by Rebecca Anthopolos, et al.

In electronic health records (EHRs), latent subgroups of patients may exhibit distinctive patterning in their longitudinal health trajectories. For such data, growth mixture models (GMMs) enable classifying patients into different latent classes based on individual trajectories and hypothesized risk factors. However, the application of GMMs is hindered by a missing data problem particular to EHRs, which involves two patient-led missing data processes: the visit process and the response process for an EHR variable conditional on a patient visiting the clinic. If either process is associated with the process generating the longitudinal outcomes, then valid inferences require accounting for a nonignorable missing data mechanism. We propose a Bayesian shared parameter model that uses a discrete latent class variable to link GMMs of multiple longitudinal health outcomes, the visit process, and the response process of each outcome given a visit. Our focus is on multiple longitudinal health outcomes for which there can be a clinically prescribed visit schedule. We demonstrate our model on EHR measurements of early childhood weight and height z-scores. Using simulated data, we illustrate the statistical properties of our method with respect to subgroup-specific or marginal inferences. We built the R package EHRMiss for model fitting, selection, and checking.
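To make the nonignorable mechanism described in the abstract concrete, here is a minimal simulation sketch in Python. It is not the authors' model and does not use the EHRMiss package. It assumes a single outcome, two latent classes, and a fixed visit schedule. All function names and parameter values are hypothetical and chosen only for illustration. One latent class variable drives the outcome trajectory, the probability of visiting, and the probability of a recorded response given a visit.

```python
import random


def simulate_patient(rng, n_scheduled=6):
    """Simulate one patient under a hypothetical two-class setup."""
    # A single latent class is shared by the outcome, visit, and response processes.
    latent_class = 0 if rng.random() < 0.6 else 1
    # Illustrative class-specific parameters (assumed, not taken from the paper).
    intercept = (0.0, 1.0)[latent_class]
    slope = (0.05, 0.30)[latent_class]
    p_visit = (0.90, 0.50)[latent_class]
    p_response = (0.95, 0.70)[latent_class]
    random_intercept = rng.gauss(0.0, 0.3)

    records = []
    for t in range(n_scheduled):
        # The underlying z-score exists whether or not it is recorded.
        z_true = intercept + random_intercept + slope * t + rng.gauss(0.0, 0.2)
        visited = rng.random() < p_visit
        # A response can only be recorded conditional on a visit.
        recorded = visited and rng.random() < p_response
        records.append({
            "time": t,
            "visited": visited,
            "z_true": z_true,
            "z_observed": z_true if recorded else None,
        })
    return latent_class, records


def simulate_cohort(n_patients=2000, seed=0):
    """Simulate a cohort with a seeded generator for reproducibility."""
    rng = random.Random(seed)
    return [simulate_patient(rng) for _ in range(n_patients)]


def compare_means(cohort, time):
    """Return (mean of all true values, mean of recorded values) at one time."""
    rows = [r for _, recs in cohort for r in recs if r["time"] == time]
    true_values = [r["z_true"] for r in rows]
    observed = [r["z_observed"] for r in rows if r["z_observed"] is not None]
    return sum(true_values) / len(true_values), sum(observed) / len(observed)


if __name__ == "__main__":
    cohort = simulate_cohort()
    true_mean, observed_mean = compare_means(cohort, time=5)
    print(f"true mean: {true_mean:.2f}, mean of recorded values: {observed_mean:.2f}")
```

In this sketch the class with the steeper trajectory also visits and responds less often, so the mean of the recorded values understates the mean over all patients. That is the kind of bias a model ignoring the visit and response processes would inherit.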




A Bayesian Approach to Modelling Longitudinal Data in Electronic Health Records

Analyzing electronic health records (EHR) poses significant challenges b...

Outcome identification in electronic health records using predictions from an enriched Dirichlet process mixture

We propose a novel semiparametric model for the joint distribution of a ...

Handling Non-ignorably Missing Features in Electronic Health Records Data Using Importance-Weighted Autoencoders

Electronic Health Records (EHRs) are commonly used to investigate relati...

Interpretable machine learning for high-dimensional trajectories of aging health

We have built a computational model for individual aging trajectories of...

The Autoregressive Structural Model for analyzing longitudinal health data of an aging population in China

We seek to elucidate the impact of social activity, physical activity an...

Enhancing the prediction of disease outcomes using electronic health records and pretrained deep learning models

Question: Can an encoder-decoder architecture pretrained on a large data...

Deep Modeling of Growth Trajectories for Longitudinal Prediction of Missing Infant Cortical Surfaces

Charting cortical growth trajectories is of paramount importance for und...