
DICE: Deep Significance Clustering for Outcome-Aware Stratification

by   Yufang Huang, et al.

We present deep significance clustering (DICE), a framework that jointly performs representation learning and clustering for "outcome-aware" stratification. DICE generates cluster memberships that can be used to categorize a population by individual risk level for a targeted outcome. Following the representation learning and clustering steps, we constrain the DICE objective function to require a statistically significant association between the outcome and the cluster membership of the learned representations. DICE further includes a neural architecture search step that maximizes both the likelihood of representation learning and the accuracy of outcome classification with cluster membership as the predictor. To demonstrate its utility for patient risk stratification in medicine, we evaluated DICE on two datasets with different outcome ratios, extracted from real-world electronic health records. The outcomes are acute kidney injury (30.4%) in a cohort of COVID-19 patients and discharge disposition (36.8%) in a cohort of heart failure patients. Extensive results show that DICE outperforms several baseline approaches as measured by the difference in outcome distribution across clusters, the Silhouette score, Calinski-Harabasz index, and Davies-Bouldin index for clustering, and the area under the ROC curve (AUC) for outcome classification.
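To make the idea of a "statistically significant association between the outcome and cluster membership" concrete, the sketch below checks whether a binary outcome is distributed differently across clusters. It is only an illustration under stated assumptions: the abstract does not say which test DICE uses, so a Pearson chi-square test of independence is swapped in here as a stand-in, and the names `chi2_sf` and `cluster_outcome_association` are hypothetical helpers, not part of DICE.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    # Closed forms for df = 1 and df = 2, then the standard recurrence
    # Q(x | nu + 2) = Q(x | nu) + (x/2)^(nu/2) * exp(-x/2) / Gamma(nu/2 + 1).
    if df % 2 == 0:
        q, nu = math.exp(-x / 2), 2
    else:
        q, nu = math.erfc(math.sqrt(x / 2)), 1
    while nu < df:
        q += (x / 2) ** (nu / 2) * math.exp(-x / 2) / math.gamma(nu / 2 + 1)
        nu += 2
    return q


def cluster_outcome_association(labels, outcomes):
    """Chi-square test of independence between cluster labels and a 0/1 outcome.

    Assumption: this test is a stand-in chosen for illustration; it is not
    claimed to be the significance constraint used inside DICE.
    Returns (chi2 statistic, p-value, outcome rate per cluster).
    """
    n = len(labels)
    cluster_sizes = Counter(labels)
    outcome_totals = Counter(outcomes)
    observed = Counter(zip(labels, outcomes))

    chi2 = 0.0
    for k, size in cluster_sizes.items():
        for y, total in outcome_totals.items():
            expected = size * total / n
            if expected > 0:
                chi2 += (observed[(k, y)] - expected) ** 2 / expected

    df = (len(cluster_sizes) - 1) * (len(outcome_totals) - 1)
    p_value = chi2_sf(chi2, df) if df > 0 else 1.0
    rates = {k: observed[(k, 1)] / size for k, size in cluster_sizes.items()}
    return chi2, p_value, rates


# Toy data (made up for illustration): two clusters of 10 patients each,
# with 8 and 2 positive outcomes respectively.
labels = [0] * 10 + [1] * 10
outcomes = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
chi2, p_value, rates = cluster_outcome_association(labels, outcomes)
print(f"chi2={chi2:.2f}, p={p_value:.4f}, outcome rates={rates}")
```

A small p-value together with well-separated per-cluster outcome rates is the kind of evidence an outcome-aware stratification aims for; a clustering whose rates are nearly equal across clusters would fail such a check no matter how compact the clusters are.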




Outcome-Driven Clustering of Acute Coronary Syndrome Patients using Multi-Task Neural Network with Attention

Cluster analysis aims at separating patients into phenotypically heterog...

Temporal Phenotyping using Deep Predictive Clustering of Disease Progression

Due to the wider availability of modern electronic health records, patie...

Language Models Are An Effective Patient Representation Learning Technique For Electronic Health Record Data

Widespread adoption of electronic health records (EHRs) has fueled devel...

Deep Semi-Supervised Embedded Clustering (DSEC) for Stratification of Heart Failure Patients

Determining phenotypes of diseases can have considerable benefits for in...

Predicting adverse outcomes following catheter ablation treatment for atrial fibrillation

Objective: To develop prognostic survival models for predicting adverse ...

Clustering by transitive propagation

We present a global optimization algorithm for clustering data given the...