Identifying Ventricular Arrhythmias and Their Predictors by Applying Machine Learning Methods to Electronic Health Records in Patients With Hypertrophic Cardiomyopathy(HCM-VAr-

09/19/2021
by Moumita Bhattacharya, et al.

Clinical risk stratification for sudden cardiac death (SCD) in hypertrophic cardiomyopathy (HC) employs rules derived from American College of Cardiology Foundation/American Heart Association (ACCF/AHA) guidelines or the HCM Risk-SCD model (C-index of 0.69), both of which use only a few clinical variables. We assessed whether data-driven machine learning methods that consider a wider range of variables can effectively identify HC patients with ventricular arrhythmias (VAr) that lead to SCD. We scanned the electronic health records of 711 HC patients for sustained ventricular tachycardia or ventricular fibrillation. Patients with ventricular tachycardia or ventricular fibrillation (n = 61) were tagged as VAr cases and the remaining patients (n = 650) as non-VAr. The 2-sample t test and the information gain criterion were used to identify the most informative clinical variables that distinguish VAr from non-VAr; patient records were then reduced to include only these variables. Data imbalance stemming from the low number of VAr cases was addressed by applying a combination of over- and under-sampling strategies. We trained and tested multiple classifiers under this sampling approach. We evaluated 93 clinical variables, of which 22 proved predictive of VAr. The ensemble of logistic regression and naive Bayes classifiers, trained on these 22 variables and corrected for data imbalance, was most effective in separating VAr from non-VAr cases (sensitivity = 0.73, specificity = 0.76, C-index = 0.83). Our method (HCM-VAr-Risk Model) identified 12 new predictors of VAr, in addition to 10 established SCD predictors. In conclusion, this is the first application of machine learning for identifying HC patients with VAr using clinical attributes.
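The resampling and evaluation steps mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say which over- and under-sampling strategies were used, so plain random resampling is assumed here. The function names, the target class size of 200, and the toy labels and scores are all hypothetical. Only the cohort sizes (61 VAr cases, 650 non-VAr) come from the text. For a binary outcome, the C-index computed below is the same quantity as the area under the ROC curve.

```python
import random


def resample(records, target, rng):
    """Return `target` records: a random subset when there are too many
    (under-sampling), or the originals plus random duplicates when there
    are too few (over-sampling). Assumed strategy, not the paper's."""
    if len(records) >= target:
        return rng.sample(records, target)
    return list(records) + rng.choices(records, k=target - len(records))


def sensitivity_specificity(labels, predictions):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def c_index(labels, scores):
    """Fraction of (case, non-case) pairs in which the case receives the
    higher score; tied scores count as half a concordant pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return concordant / (len(pos) * len(neg))


rng = random.Random(0)                # fixed seed for reproducibility
case_ids = list(range(61))            # 61 VAr cases (from the abstract)
control_ids = list(range(61, 711))    # 650 non-VAr patients (from the abstract)

# Hypothetical target of 200 records per class.
balanced_cases = resample(case_ids, 200, rng)
balanced_controls = resample(control_ids, 200, rng)

# Toy labels and risk scores, invented purely for illustration.
toy_labels = [1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.2, 0.1]
toy_predictions = [1 if s >= 0.35 else 0 for s in toy_scores]

sens, spec = sensitivity_specificity(toy_labels, toy_predictions)
auc = c_index(toy_labels, toy_scores)
print(len(balanced_cases), len(balanced_controls), sens, spec, auc)
```

On the toy data, the threshold of 0.35 gives a sensitivity of 1.0 and a specificity of 0.75, and the C-index is 0.875 because 7 of the 8 case/non-case pairs are ranked correctly. In a real analysis, resampling would be applied only to the training folds so that the reported metrics reflect the original class balance.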
