Bayesian Double Feature Allocation for Phenotyping with Electronic Health Records

09/04/2018
by Yang Ni, et al.

We propose a categorical matrix factorization method to infer latent diseases from electronic health record (EHR) data in an unsupervised manner. A latent disease is defined as an unknown biological aberration that causes a set of common symptoms for a group of patients. The proposed approach is based on a novel double feature allocation model, which simultaneously allocates features to the rows and the columns of a categorical matrix. Under a Bayesian approach, available prior information on known diseases, namely known diagnoses for patients and known associations of diseases with symptoms, greatly improves the identifiability and interpretability of the latent diseases. We validate the proposed approach with simulation studies, including mis-specified models and a comparison with sparse latent factor models. In an application to Chinese EHR data, we find interesting results, some of which agree with related clinical and medical knowledge.
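
As a rough illustration of what allocating features to both rows and columns means, the sketch below builds a toy patient-by-disease matrix and a toy symptom-by-disease matrix and counts, for each patient-symptom cell, how many latent diseases the two share. The matrices `A` and `B`, their sizes, and the helper `shared_feature_counts` are hypothetical and chosen only for illustration. This is not the authors' model: it uses binary allocations only and omits the categorical likelihood, the priors, and posterior inference.

```python
import numpy as np

# Hypothetical toy allocation: 4 patients, 5 symptoms, 2 latent diseases.
# A[i, k] = 1 if patient i is allocated latent disease k (row allocation).
A = np.array([
    [1, 0],
    [1, 1],
    [0, 1],
    [0, 0],
])

# B[j, k] = 1 if symptom j is allocated latent disease k (column allocation).
B = np.array([
    [1, 0],
    [1, 0],
    [0, 1],
    [1, 1],
    [0, 0],
])


def shared_feature_counts(A, B):
    """Return a patients-by-symptoms matrix whose (i, j) entry counts the
    latent diseases allocated to both patient i and symptom j."""
    return A @ B.T


counts = shared_feature_counts(A, B)
print(counts)
```

A cell with a nonzero count is one where a latent disease links that patient to that symptom; a patient with no allocated disease, such as the last row here, shares nothing with any symptom.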

research
05/17/2020

Bayesian biclustering for microbial metagenomic sequencing data via multinomial matrix factorization

High-throughput sequencing technology provides unprecedented opportuniti...
research
09/26/2019

Enhancing Model Interpretability and Accuracy for Disease Progression Prediction via Phenotype-Based Patient Similarity Learning

Models have been proposed to extract temporal patterns from longitudinal...
research
03/22/2023

ExBEHRT: Extended Transformer for Electronic Health Records to Predict Disease Subtypes Progressions

In this study, we introduce ExBEHRT, an extended version of BEHRT (BERT ...
research
06/28/2019

Consensus Monte Carlo for Random Subsets using Shared Anchors

We present a consensus Monte Carlo algorithm that scales existing Bayesi...
research
04/02/2020

Surrogate-assisted performance tuning of knowledge discovery algorithms: application to clinical pathway evolutionary modeling

The paper proposes an approach for surrogate-assisted tuning of knowledg...
research
06/12/2018

Are My EHRs Private Enough? -Event-level Privacy Protection

Privacy is a major concern in sharing human subject data to researchers ...
research
11/12/2021

Bayesian Knockoff Generators for Robust Inference Under Complex Data Structure

The recent proliferation of medical data, such as genetics and electroni...
