Data mining for censored time-to-event data: A Bayesian network model for predicting cardiovascular risk from electronic health record data

04/08/2014
by Sunayan Bandyopadhyay, et al.

Models for predicting the risk of cardiovascular events based on individual patient characteristics are important tools for managing patient care. Most commonly used risk prediction models have been built from carefully selected epidemiological cohorts. However, the homogeneity and limited size of such cohorts restrict the predictive power of these risk models and their generalizability to other populations. Electronic health data (EHD) from large health care systems provide access to data on large, heterogeneous, and contemporaneous patient populations. The unique features and challenges of EHD, including missing risk factor information, non-linear relationships between risk factors and cardiovascular event outcomes, and risk factor effects that differ across patient subgroups, demand novel machine learning approaches to risk model development. In this paper, we present a machine learning approach based on Bayesian networks trained on EHD to predict the probability of having a cardiovascular event within five years. In such data, event status may be unknown for some individuals because the event time is right-censored due to disenrollment and incomplete follow-up. Since many traditional data mining methods are not well suited to such data, we describe how to modify both modelling and assessment techniques to account for censored observation times. We show that our approach can lead to better predictive performance than either the Cox proportional hazards model (a regression-based approach commonly used for censored, time-to-event data) or a Bayesian network with ad hoc approaches to right-censoring. Our techniques are motivated by and illustrated on data from a large U.S. Midwestern health care system.
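To illustrate why right-censoring needs explicit handling, the sketch below compares two ad hoc estimates of five-year event probability with the Kaplan-Meier product-limit estimate, a standard censoring-aware estimator. It is not the Bayesian network method of the paper, which the abstract does not specify in detail; the function names and the toy follow-up data are hypothetical and chosen only for illustration.

```python
from typing import List


def kaplan_meier_risk(times: List[float], events: List[bool], horizon: float) -> float:
    """Estimate P(event by `horizon`) as 1 - S(horizon), with S the
    Kaplan-Meier product-limit survival estimate."""
    survival = 1.0
    # Distinct observed event times up to the horizon, in increasing order.
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        # Events occurring exactly at time t.
        d = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1.0 - d / at_risk
    return 1.0 - survival


def risk_censored_as_event_free(times: List[float], events: List[bool], horizon: float) -> float:
    """Ad hoc approach 1: treat every censored subject as event-free."""
    n_events = sum(1 for t, e in zip(times, events) if e and t <= horizon)
    return n_events / len(times)


def risk_drop_censored(times: List[float], events: List[bool], horizon: float) -> float:
    """Ad hoc approach 2: discard subjects censored before the horizon."""
    known = [(t, e) for t, e in zip(times, events) if (e and t <= horizon) or t >= horizon]
    n_events = sum(1 for t, e in known if e and t <= horizon)
    return n_events / len(known)


# Hypothetical toy data: follow-up time in years, and whether an event
# was observed (True) or follow-up ended without an event (False).
times = [1.0, 2.0, 2.5, 3.0, 4.0, 5.0, 5.0, 5.0]
events = [True, False, True, False, True, False, False, False]
horizon = 5.0

km = kaplan_meier_risk(times, events, horizon)
low = risk_censored_as_event_free(times, events, horizon)
high = risk_drop_censored(times, events, horizon)
print(f"censored as event-free: {low:.3f}")
print(f"Kaplan-Meier:           {km:.3f}")
print(f"censored dropped:       {high:.3f}")
```

On this toy data, counting censored subjects as event-free gives the lowest estimate and discarding them gives the highest, with the Kaplan-Meier estimate in between. Counting censored subjects as event-free always understates risk relative to Kaplan-Meier, while discarding them tends to overstate it.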


