Handling Non-ignorably Missing Features in Electronic Health Records Data Using Importance-Weighted Autoencoders

01/18/2021
by David K. Lim, et al.

Electronic Health Records (EHRs) are commonly used to investigate relationships between patient health information and outcomes. Deep learning methods are emerging as powerful tools for learning such relationships, given the characteristically high dimension and large sample size of EHR datasets. The Physionet 2012 Challenge involves an EHR dataset of 12,000 ICU patients, in which researchers investigated the relationships between clinical measurements and in-hospital mortality. However, the prevalence and complexity of missing data in the Physionet data present significant challenges for the application of deep learning methods such as Variational Autoencoders (VAEs). Although a rich literature exists on the treatment of missing data in traditional statistical models, it is unclear how this extends to deep learning architectures. To address these issues, we propose a novel method that builds on Importance-Weighted Autoencoders (IWAEs), an extension of VAEs, to flexibly handle Missing Not At Random (MNAR) patterns in the Physionet data. Our proposed method models the missingness mechanism using an embedded neural network, eliminating the need to specify the exact form of the missingness mechanism a priori. We show that our method yields more realistic imputed values than the state of the art, as well as significant differences in fitted downstream models for mortality.
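To make the importance-weighted objective mentioned in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes that, for each record, K latent samples have been drawn, and that the log importance weight of each sample is the sum of a data log-likelihood, a missingness-mask log-likelihood (the term an embedded missingness network would supply) and a log-prior, minus the log-density of the proposal. The function names `mnar_log_weights` and `iwae_bound` and this four-term decomposition are illustrative assumptions; the abstract does not specify the exact form.

```python
import numpy as np


def mnar_log_weights(log_px, log_pr, log_pz, log_qz):
    """Combine per-sample log-densities into log importance weights.

    All inputs are arrays of shape (K, N): K importance samples for each
    of N records. The decomposition below is an assumption made for
    illustration:
      log_px : log-likelihood of the data under the decoder
      log_pr : log-likelihood of the missingness mask (hypothetical
               output of a missingness network)
      log_pz : log-prior of the latent sample
      log_qz : log-density of the latent sample under the encoder
    """
    return (np.asarray(log_px, dtype=float)
            + np.asarray(log_pr, dtype=float)
            + np.asarray(log_pz, dtype=float)
            - np.asarray(log_qz, dtype=float))


def iwae_bound(log_w):
    """Importance-weighted bound: log of the mean importance weight.

    log_w has shape (K, N). Returns an array of shape (N,). The maximum
    is subtracted before exponentiating to avoid overflow.
    """
    log_w = np.asarray(log_w, dtype=float)
    k = log_w.shape[0]
    m = log_w.max(axis=0)
    return m + np.log(np.exp(log_w - m).sum(axis=0)) - np.log(k)
```

With K = 1 the bound reduces to the single log weight, which corresponds to the usual VAE objective; by Jensen's inequality the bound is never smaller than the plain average of the log weights over the K samples.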

research
11/12/2018

Learning Representations of Missing Data for Predicting Patient Outcomes

Extracting actionable insight from Electronic Health Records (EHRs) pose...
research
03/20/2021

Modeling Heterogeneity and Missing Data of Multiple Longitudinal Outcomes in Electronic Health Records

In electronic health records (EHRs), latent subgroups of patients may ex...
research
03/12/2021

Medical data wrangling with sequential variational autoencoders

Medical data sets are usually corrupted by noise and missing data. These...
research
01/13/2019

Propensity scores using missingness pattern information: a practical guide

Electronic health records are a valuable data source for investigating h...
research
08/20/2018

Synthetic Patient Generation: A Deep Learning Approach Using Variational Autoencoders

Artificial Intelligence in healthcare is a new and exciting frontier and...
research
12/01/2018

A Probabilistic Model of Cardiac Physiology and Electrocardiograms

An electrocardiogram (EKG) is a common, non-invasive test that measures ...
research
02/01/2023

Development of deep biological ages aware of morbidity and mortality based on unsupervised and semi-supervised deep learning approaches

Background: While deep learning technology, which has the capability of ...
