A novel method for Causal Structure Discovery from EHR data, a demonstration on type-2 diabetes mellitus

11/11/2020
by Xinpeng Shen, et al.

Introduction: The discovery of causal mechanisms underlying diseases enables better diagnosis, prognosis and treatment selection. Clinical trials have been the gold standard for determining causality, but they are resource intensive and sometimes infeasible or unethical. Electronic Health Records (EHR) contain a wealth of real-world data that holds promise for the discovery of disease mechanisms, yet existing causal structure discovery (CSD) methods fall short of leveraging them because of the special characteristics of EHR data. We propose a new data transformation method and a novel CSD algorithm to overcome the challenges posed by these characteristics.

Materials and methods: We demonstrated the proposed methods on an application to type-2 diabetes mellitus. We used a large EHR data set from Mayo Clinic to internally evaluate the proposed transformation and CSD methods, and another large data set from an independent health system, Fairview Health Services, for external validation. We compared the performance of our proposed method to Fast Greedy Equivalence Search (FGES), a state-of-the-art CSD method, in terms of correctness, stability and completeness. We tested the generalizability of the proposed algorithm through external validation.

Results and conclusions: The proposed method improved over the existing methods by successfully incorporating study design considerations, was robust in the face of unreliable EHR timestamps, and inferred causal effect directions more correctly and reliably. The proposed data transformation improved the clinical correctness of the discovered graph and the consistency of edge orientation across bootstrap samples. It resulted in superior accuracy, stability, and completeness.
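To make the idea of edge-orientation consistency across bootstrap samples concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' metric: the abstract does not define how stability was computed, so the function `orientation_stability`, the majority-direction score, and the toy variables `A`, `B`, `C` are all hypothetical.

```python
from collections import Counter


def orientation_stability(bootstrap_graphs):
    """Score how consistently each adjacent pair of variables is oriented.

    bootstrap_graphs: list of graphs, each a list of (cause, effect) tuples
    discovered on one bootstrap sample (hypothetical input format).
    Returns {sorted pair: fraction of its occurrences that share the
    majority direction}. 1.0 means the direction never flipped.
    """
    directed = Counter()
    for edges in bootstrap_graphs:
        # set() guards against a duplicated edge within one graph
        for cause, effect in set(edges):
            directed[(cause, effect)] += 1

    stability = {}
    for (a, b), n_ab in directed.items():
        pair = tuple(sorted((a, b)))
        if pair in stability:
            continue  # already scored via the opposite direction
        n_ba = directed.get((b, a), 0)
        stability[pair] = max(n_ab, n_ba) / (n_ab + n_ba)
    return stability


# Toy graphs from four hypothetical bootstrap samples.
graphs = [
    [("A", "B"), ("B", "C")],
    [("A", "B"), ("C", "B")],
    [("A", "B"), ("B", "C")],
    [("B", "A"), ("B", "C")],
]
result = orientation_stability(graphs)
```

In this toy input each pair keeps its majority direction in three of four samples. Under this assumed definition, a method with more stable orientations would push these fractions toward 1.0.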


research
02/06/2023

A Fast Bootstrap Algorithm for Causal Inference with Large Data

Estimating causal effects from large experimental and observational data...
research
02/16/2023

Local Causal Discovery for Estimating Causal Effects

Even when the causal graph underlying our data is unknown, we can use ob...
research
10/22/2022

SplitStrains, a tool to identify and separate mixed Mycobacterium tuberculosis infections from WGS data

The occurrence of multiple strains of a bacterial pathogen such as M. tu...
research
01/19/2022

Ordinal Causal Discovery

Causal discovery for purely observational, categorical data is a long-st...
research
11/29/2022

Harnessing electronic health records for real-world evidence

While randomized controlled trials (RCTs) are the gold-standard for esta...
research
03/23/2023

Variational Bayes latent class approach for EHR-based phenotyping with large real-world data

Bayesian approaches to clinical analyses for the purposes of patient phe...
research
07/25/2019

Computational Phenotype Discovery via Probabilistic Independence

Computational Phenotype Discovery research has taken various pragmatic a...
