A Deep Learning Pipeline for Patient Diagnosis Prediction Using Electronic Health Records

06/23/2020
by Leopold Franz, et al.

Augmenting disease diagnosis and clinical decision-making with machine learning algorithms has gained considerable momentum in recent years. In particular, in the current epidemiological situation caused by the COVID-19 pandemic, swift and accurate prediction of disease diagnoses with machine learning algorithms could facilitate the identification and care of vulnerable population clusters, such as patients with multi-morbidity conditions. Building a useful diagnosis prediction system requires advances in both data representation and machine learning architectures. First, with respect to data collection and representation, Electronic Health Records (EHRs) come in a multitude of formats and lack coherence, which hinders the extraction of the valuable information they contain. Currently, no universal global data standard has been established. As a practical solution, we develop and publish a Python package that transforms a public health dataset into an easy-to-access universal format. Transforming the data into an international health data format makes it easier for researchers to combine EHR datasets with clinical datasets of diverse formats. Second, machine learning algorithms that predict multiple disease diagnosis categories simultaneously remain underdeveloped. We propose two novel model architectures in this regard. The first, DeepObserver, uses structured numerical data to predict the diagnosis categories. The second, ClinicalBERT_Multi, incorporates the rich information available in clinical notes via natural language processing methods and also provides interpretable visualizations for medical practitioners. We show that both models can predict multiple diagnoses simultaneously with high accuracy.
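
To make the idea of predicting multiple diagnosis categories simultaneously more concrete, here is a minimal sketch of a generic multi-label output stage. It is not the DeepObserver or ClinicalBERT_Multi implementation: the abstract does not describe the output layer or loss, so the per-category sigmoid, the 0.5 threshold, the binary cross-entropy loss, and the category names are all assumptions chosen for illustration.

```python
import math

# Hypothetical diagnosis categories, for illustration only (not from the paper).
CATEGORIES = ["circulatory", "respiratory", "endocrine"]


def sigmoid(z):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_multilabel(logits, threshold=0.5):
    """Return every category whose independent probability reaches the threshold.

    Unlike softmax classification, each category is scored on its own,
    so a patient can receive zero, one, or several categories at once.
    """
    probs = [sigmoid(z) for z in logits]
    return [cat for cat, p in zip(CATEGORIES, probs) if p >= threshold]


def binary_cross_entropy(logits, targets):
    """Mean per-category binary cross-entropy (assumed loss, common for multi-label tasks)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(logits)


# Example: logits as they might come from some upstream model (made-up values).
example_logits = [2.0, -1.0, 0.3]
predicted = predict_multilabel(example_logits)
loss = binary_cross_entropy(example_logits, [1, 0, 1])
```

Scoring each category independently is what lets a single forward pass flag several co-occurring conditions, which matters for the multi-morbidity patients the abstract mentions.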

