Predicting Chronic Disease Hospitalizations from Electronic Health Records: An Interpretable Classification Approach

01/03/2018
by Theodora S. Brisimi, et al.

Urban living in modern large cities has significant adverse effects on health, increasing the risk of several chronic diseases. We focus on the two leading clusters of chronic disease, heart disease and diabetes, and develop data-driven methods to predict hospitalizations due to these conditions. We base these predictions on the patients' medical history, recent and more distant, as described in their Electronic Health Records (EHR). We formulate the prediction problem as a binary classification problem and consider a variety of machine learning methods, including kernelized and sparse Support Vector Machines (SVM), sparse logistic regression, and random forests. To strike a balance between accuracy and interpretability of the prediction, which is important in a medical setting, we propose two novel methods: K-LRT, a likelihood ratio test-based method, and a Joint Clustering and Classification (JCC) method which identifies hidden patient clusters and adapts classifiers to each cluster. We develop theoretical out-of-sample guarantees for the latter method. We validate our algorithms on large datasets from the Boston Medical Center, the largest safety-net hospital system in New England.
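As a rough illustration of what a likelihood ratio test-based classifier can look like, the sketch below scores a patient by summing per-feature log likelihood ratios and comparing the sum to a threshold. It is not the authors' K-LRT, whose details the abstract does not give. The assumptions here are ours: binary features treated as independent given the class, Laplace smoothing, and keeping only the k largest-magnitude ratios per patient (a guess at what the K might refer to). The function names, feature meanings, and toy data are hypothetical.

```python
import math


def fit_log_ratios(X, y, smoothing=1.0):
    """Estimate, for each binary feature j and value v, the log ratio
    log P(x_j = v | y = 1) / P(x_j = v | y = 0) with Laplace smoothing."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    table = []
    for j in range(len(X[0])):
        # Smoothed estimates of P(x_j = 1 | class)
        p1 = (sum(r[j] for r in pos) + smoothing) / (len(pos) + 2 * smoothing)
        p0 = (sum(r[j] for r in neg) + smoothing) / (len(neg) + 2 * smoothing)
        table.append({
            1: math.log(p1 / p0),
            0: math.log((1 - p1) / (1 - p0)),
        })
    return table


def predict(table, x, k=2, threshold=0.0):
    """Sum the k log ratios with the largest absolute value for patient x
    and predict 1 (hospitalized) when the sum exceeds the threshold."""
    ratios = [table[j][v] for j, v in enumerate(x)]
    top = sorted(ratios, key=abs, reverse=True)[:k]
    score = sum(top)
    return (1 if score > threshold else 0), score


# Hypothetical toy data: each row is one patient, each column a made-up
# binary EHR indicator (e.g. prior admission, recent ER visit, lab flag).
X = [[1, 1, 0], [1, 0, 1], [1, 1, 1], [0, 0, 0], [0, 1, 0], [0, 0, 1]]
y = [1, 1, 1, 0, 0, 0]

table = fit_log_ratios(X, y)
label, score = predict(table, [1, 0, 0], k=2)
print(label, round(score, 3))
```

Because the score is a sum of a few per-feature terms, the features that drove a given prediction can be read off directly, which is the kind of interpretability the abstract emphasizes.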

research
11/28/2017

Predicting Adolescent Suicide Attempts with Neural Networks

Though suicide is a major public health problem in the US, machine learn...
research
10/06/2016

A Methodology for Customizing Clinical Tests for Esophageal Cancer based on Patient Preferences

Tests for Esophageal cancer can be expensive, uncomfortable and can have...
research
06/18/2022

Tree-Guided Rare Feature Selection and Logic Aggregation with Electronic Health Records Data

Statistical learning with a large number of rare binary features is comm...
research
07/19/2019

Learning Multimorbidity Patterns from Electronic Health Records Using Non-negative Matrix Factorisation

Multimorbidity, or the presence of several medical conditions in the sam...
research
06/27/2012

Demand-Driven Clustering in Relational Domains for Predicting Adverse Drug Events

Learning from electronic medical records (EMR) is challenging due to the...
research
12/02/2019

On Classifying Sepsis Heterogeneity in the ICU: Insight Using Machine Learning

Current machine learning models aiming to predict sepsis from Electronic...
research
04/28/2022

Cumulative Stay-time Representation for Electronic Health Records in Medical Event Time Prediction

We address the problem of predicting when a disease will develop, i.e., ...
