Feature Robustness in Non-stationary Health Records: Caveats to Deployable Model Performance in Common Clinical Machine Learning Tasks

08/02/2019
by Bret Nestor, et al.

When training clinical prediction models from electronic health records (EHRs), a key concern should be a model's ability to sustain performance over time when deployed, even as care practices, database systems, and population demographics evolve. Due to de-identification requirements, however, current experimental practices for public EHR benchmarks (such as the MIMIC-III critical care dataset) are time agnostic, assigning care records to train or test sets without regard for the actual dates of care. As a result, current benchmarks cannot assess how well models trained on one year generalise to another. In this work, we obtain a Limited Data Use Agreement to access year of care for each record in MIMIC and show that all tested state-of-the-art models decay in prediction quality when trained on historical data and tested on future data, particularly in response to a system-wide record-keeping change in 2008 (0.29 drop in AUROC for mortality prediction, 0.10 drop in AUROC for length-of-stay prediction with a random forest classifier). We further develop a simple yet effective mitigation strategy: by aggregating raw features into expert-defined clinical concepts, we see only a 0.06 drop in AUROC for mortality prediction and a 0.03 drop in AUROC for length-of-stay prediction. We demonstrate that this aggregation strategy outperforms other automatic feature preprocessing techniques aimed at increasing robustness to data drift. We release our aggregated representations and code to encourage more deployable clinical prediction models.
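
The difference between a time-agnostic split and a year-aware split, and the idea of aggregating raw features into clinical concepts, can be illustrated with a minimal sketch. Everything named here is hypothetical: the record layout, the item identifiers, the concept map, and the function names are assumptions made for illustration and are not taken from MIMIC or from the released code. Averaging values within a concept is likewise just one simple choice of aggregation.

```python
import random
from collections import defaultdict

# Hypothetical mapping from raw, system-specific item IDs to clinical concepts.
# These IDs are invented for illustration; they are not real MIMIC item IDs.
CONCEPT_MAP = {
    "hr_old_system": "heart_rate",
    "hr_new_system": "heart_rate",
    "sbp_old_system": "systolic_bp",
    "sbp_new_system": "systolic_bp",
}


def aggregate_to_concepts(raw_features):
    """Average all raw measurements that map to the same clinical concept."""
    grouped = defaultdict(list)
    for item_id, value in raw_features.items():
        concept = CONCEPT_MAP.get(item_id)
        if concept is not None:  # unmapped items are dropped in this sketch
            grouped[concept].append(value)
    return {concept: sum(vals) / len(vals) for concept, vals in grouped.items()}


def random_split(records, test_fraction=0.25, seed=0):
    """Time-agnostic split: records are shuffled regardless of year of care."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def temporal_split(records, first_test_year):
    """Year-aware split: train on earlier years, test on later years."""
    train = [r for r in records if r["year"] < first_test_year]
    test = [r for r in records if r["year"] >= first_test_year]
    return train, test


# Toy records in which the item IDs change in 2008 (an assumed stand-in for
# a record-keeping change; the values are made up).
records = [
    {"year": 2006, "features": {"hr_old_system": 80.0}},
    {"year": 2007, "features": {"hr_old_system": 90.0, "sbp_old_system": 120.0}},
    {"year": 2008, "features": {"hr_new_system": 85.0}},
    {"year": 2009, "features": {"hr_new_system": 75.0, "sbp_new_system": 110.0}},
]

train, test = temporal_split(records, first_test_year=2008)

# Raw feature names seen in training versus testing.
train_raw = set().union(*(r["features"] for r in train))
test_raw = set().union(*(r["features"] for r in test))

# The same comparison after aggregating to clinical concepts.
train_concepts = set().union(*(aggregate_to_concepts(r["features"]) for r in train))
test_concepts = set().union(*(aggregate_to_concepts(r["features"]) for r in test))
```

In this toy setting the raw feature names in `train_raw` and `test_raw` do not overlap at all once the split respects the year of care, whereas `train_concepts` and `test_concepts` are identical. A random split would mix both naming schemes into the training set and hide the problem, which is the kind of effect a time-agnostic benchmark cannot reveal.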

research
11/30/2018

Rethinking clinical prediction: Why machine learning must consider year of care and feature aggregation

Machine learning for healthcare often trains models on de-identified dat...
research
05/26/2022

Looking for Out-of-Distribution Environments in Critical Care: A case study with the eICU Database

Generalizing to new populations and domains in machine learning is still...
research
06/01/2020

A Machine Learning System for Retaining Patients in HIV Care

Retaining persons living with HIV (PLWH) in medical care is paramount to...
research
07/19/2019

MIMIC-Extract: A Data Extraction, Preprocessing, and Representation Pipeline for MIMIC-III

Robust machine learning relies on access to data that can be used with s...
research
10/02/2019

Benchmarking machine learning models on eICU critical care dataset

Progress of machine learning in critical care has been difficult to trac...
research
04/24/2023

FineEHR: Refine Clinical Note Representations to Improve Mortality Prediction

Monitoring the health status of patients in the ICU is crucial for provi...
research
05/25/2023

Ensemble Synthetic EHR Generation for Increasing Subpopulation Model's Performance

Electronic health records (EHR) often contain different rates of represe...
