Cardea: An Open Automated Machine Learning Framework for Electronic Health Records

10/01/2020
by Sarah Alnegheimish, et al.

An estimated 180 papers focusing on deep learning and electronic health records (EHR) were published between 2010 and 2018. Despite the common workflow structure appearing in these publications, no trusted and verified software framework exists, forcing researchers to arduously repeat previous work. In this paper, we propose Cardea, an extensible open-source automated machine learning framework that encapsulates common prediction problems in the health domain and allows users to build predictive models with their own data. The system relies on two components: Fast Healthcare Interoperability Resources (FHIR), a standardized data structure for electronic health systems, and several AutoML frameworks for automated feature engineering, model selection, and tuning. We augment these components with an adaptive data assembler and comprehensive data- and model-auditing capabilities. We demonstrate our framework via 5 prediction tasks on MIMIC-III and Kaggle datasets, which highlight Cardea's human competitiveness, flexibility in problem definition, extensive feature generation capability, adaptable automatic data assembler, and usability.
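
The workflow the abstract outlines (assemble records, engineer features, select and tune a model) can be illustrated with a toy sketch. This is not Cardea's API: the record layout, the function names `assemble`, `engineer_features`, and `select_model`, the labels, and the one-feature threshold "model" are all hypothetical stand-ins chosen to keep the example dependency-free.

```python
from statistics import mean

# Hypothetical FHIR-like resources, reduced to plain dicts for illustration.
patients = [
    {"id": "p1", "birth_year": 1950},
    {"id": "p2", "birth_year": 1985},
    {"id": "p3", "birth_year": 1942},
    {"id": "p4", "birth_year": 1990},
]
encounters = [
    {"patient": "p1", "length_of_stay": 9},
    {"patient": "p1", "length_of_stay": 12},
    {"patient": "p2", "length_of_stay": 2},
    {"patient": "p3", "length_of_stay": 8},
    {"patient": "p3", "length_of_stay": 11},
    {"patient": "p4", "length_of_stay": 1},
]
# Hypothetical binary labels (e.g. a readmission flag), invented for the sketch.
labels = {"p1": 1, "p2": 0, "p3": 1, "p4": 0}


def assemble(patients, encounters):
    """Group child records (encounters) under their parent patient."""
    table = {p["id"]: {"patient": p, "encounters": []} for p in patients}
    for e in encounters:
        table[e["patient"]]["encounters"].append(e)
    return table


def engineer_features(table, reference_year=2020):
    """Derive per-patient aggregate features from the child records."""
    features = {}
    for pid, row in table.items():
        stays = [e["length_of_stay"] for e in row["encounters"]]
        features[pid] = {
            "age": reference_year - row["patient"]["birth_year"],
            "n_encounters": len(stays),
            "mean_stay": mean(stays) if stays else 0.0,
            "max_stay": max(stays, default=0),
        }
    return features


def select_model(features, labels):
    """Search (feature, threshold) rules and keep the most accurate one.

    A stand-in for model selection and tuning; it scores on the training
    data only, which a real pipeline would replace with cross-validation.
    """
    best = None
    for name in sorted(next(iter(features.values()))):
        for threshold in sorted({f[name] for f in features.values()}):
            correct = sum(
                int(features[pid][name] >= threshold) == labels[pid]
                for pid in features
            )
            accuracy = correct / len(features)
            if best is None or accuracy > best["accuracy"]:
                best = {"feature": name, "threshold": threshold, "accuracy": accuracy}
    return best


features = engineer_features(assemble(patients, encounters))
best = select_model(features, labels)
print(best)
```

In a real setting, each stage would be delegated to dedicated tooling; the sketch only shows how the stages hand data to one another.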

research
07/19/2019

MIMIC-Extract: A Data Extraction, Preprocessing, and Representation Pipeline for MIMIC-III

Robust machine learning relies on access to data that can be used with s...
research
09/06/2017

Boosting Deep Learning Risk Prediction with Generative Adversarial Networks for Electronic Health Records

The rapid growth of Electronic Health Records (EHRs), as well as the acc...
research
08/08/2021

Unifying Heterogenous Electronic Health Records Systems via Text-Based Code Embedding

Substantial increase in the use of Electronic Health Records (EHRs) has ...
research
08/03/2023

Causal thinking for decision making on Electronic Health Records: why and how

Accurate predictions, as with machine learning, may not suffice to provi...
research
03/25/2021

Deep EHR Spotlight: a Framework and Mechanism to Highlight Events in Electronic Health Records for Explainable Predictions

The wide adoption of Electronic Health Records (EHR) has resulted in lar...
research
11/10/2019

Unsupervised Annotation of Phenotypic Abnormalities via Semantic Latent Representations on Electronic Health Records

The extraction of phenotype information which is naturally contained in ...
research
02/11/2023

Cross-center Early Sepsis Recognition by Medical Knowledge Guided Collaborative Learning for Data-scarce Hospitals

There are significant regional inequities in health resources around the...
