A latent topic model for mining heterogenous non-randomly missing electronic health records data

11/01/2018
by Yue Li, et al.

Electronic health records (EHR) are a rich, heterogeneous collection of patient health information, whose broad adoption provides great opportunities for systematic health data mining. However, heterogeneous EHR data types and biased ascertainment impose computational challenges. Here, we present mixEHR, an unsupervised generative model that integrates collaborative filtering and latent topic models and jointly models the discrete distributions of data observation bias and actual data using latent disease-topic distributions. We apply mixEHR to 12.8 million phenotypic observations from the MIMIC dataset and use it to reveal latent disease topics, interpret EHR results, impute missing data, and predict mortality in intensive care units. Using both simulated and real data, we show that mixEHR outperforms previous methods and reveals meaningful multi-disease insights.
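
The idea of jointly modeling observation bias and observed data through shared latent topics can be illustrated with a toy generative simulation. This is a minimal sketch under stated assumptions, not the mixEHR model itself: the function names (`simulate_patient`, `sample_dirichlet`, `sample_categorical`), the symmetric Dirichlet prior, the Bernoulli observation indicator, and the two example data types are all hypothetical choices made for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def simulate_patient(topic_feature_probs, topic_observe_probs, n_slots, rng, alpha=0.5):
    """Toy generative process (an assumption, not the published model).

    topic_feature_probs[data_type][k] is a distribution over features of
    that data type under topic k.
    topic_observe_probs[k][data_type] is the probability that a record of
    that data type is observed at all under topic k, so missingness depends
    on the latent topic rather than being random.
    """
    n_topics = len(topic_observe_probs)
    # Patient-level mixture over latent disease topics.
    theta = sample_dirichlet([alpha] * n_topics, rng)
    records = []
    for _ in range(n_slots):
        for data_type, per_topic in topic_feature_probs.items():
            k = sample_categorical(theta, rng)
            # Observation bias: the topic decides whether anything is recorded.
            if rng.random() < topic_observe_probs[k][data_type]:
                feature = sample_categorical(per_topic[k], rng)
                records.append((data_type, feature, k))
    return theta, records


if __name__ == "__main__":
    rng = random.Random(0)
    feature_probs = {
        "icd": [[0.9, 0.1], [0.1, 0.9]],  # two topics, two diagnosis codes
        "lab": [[0.5, 0.5], [0.2, 0.8]],  # two topics, two lab tests
    }
    observe_probs = [
        {"icd": 0.9, "lab": 0.2},  # topic 0 rarely triggers lab tests
        {"icd": 0.6, "lab": 0.8},  # topic 1 often triggers lab tests
    ]
    theta, records = simulate_patient(feature_probs, observe_probs, 5, rng)
    print(theta)
    print(records)
```

In this sketch, whether a lab record exists is itself informative about the patient's topic mixture, which is the intuition behind modeling non-randomly missing data rather than treating absent records as uninformative.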


