Robustly Extracting Medical Knowledge from EHRs: A Case Study of Learning a Health Knowledge Graph

by Irene Y. Chen, et al.

Increasingly large electronic health records (EHRs) provide an opportunity to algorithmically learn medical knowledge. In one prominent example, a causal health knowledge graph could learn relationships between diseases and symptoms and then serve as a diagnostic tool to be refined with additional clinical input. Prior research has demonstrated the ability to construct such a graph from over 270,000 emergency department patient visits. In this work, we describe methods to evaluate a health knowledge graph for robustness. Moving beyond precision and recall, we analyze for which diseases and for which patients the graph is most accurate. We identify sample size and unmeasured confounders as major sources of error in the health knowledge graph. We introduce a method to leverage non-linear functions in building the causal graph to better understand existing model assumptions. Finally, to assess model generalizability, we extend to a larger set of complete patient visits within a hospital system. We conclude with a discussion on how to robustly extract medical knowledge from EHRs.



Modeling electronic health record data using a knowledge-graph-embedded topic model

The rapid growth of electronic health record (EHR) datasets opens up pro...

Refining Diagnosis Paths for Medical Diagnosis based on an Augmented Knowledge Graph

Medical diagnosis is the process of making a prediction of the disease a...

Linking Physicians to Medical Research Results via Knowledge Graph Embeddings and Twitter

Informing professionals about the latest research results in their field...

Focused Clinical Query Understanding and Retrieval of Medical Snippets powered through a Healthcare Knowledge Graph

Clinicians face several significant barriers to search and synthesize ac...

Predicting Patient Readmission Risk from Medical Text via Knowledge Graph Enhanced Multiview Graph Convolution

Unplanned intensive care unit (ICU) readmission rate is an important met...

Semantically-aware population health risk analyses

One primary task of population health analysis is the identification of ...

Knowledge Transfer with Medical Language Embeddings

Identifying relationships between concepts is a key aspect of scientific...