A Hamiltonian Monte Carlo Model for Imputation and Augmentation of Healthcare Data

03/03/2021
by   Narges Pourshahrokhi, et al.

Missing values occur in nearly all clinical studies because data for a variable or question are not collected or are not available. Inadequate handling of missing values can lead to biased results and loss of statistical power in analysis. Existing models usually do not consider privacy concerns or do not exploit the inherent correlations across multiple features when imputing the missing values. In healthcare applications, we are usually confronted with high-dimensional and sometimes small-sample datasets that need more effective augmentation or imputation techniques. Moreover, imputation and augmentation are traditionally conducted separately. However, imputing missing values and augmenting data can significantly improve generalisation and avoid bias in machine learning models. This work proposes a Bayesian approach to imputing missing values and creating augmented samples in high-dimensional healthcare data. We propose folded Hamiltonian Monte Carlo (F-HMC) with Bayesian inference as a more practical approach to processing the cross-dimensional relations, applying a random walk and Hamiltonian dynamics to adapt the posterior distribution and generate large-scale samples. The proposed method is applied to a cancer symptom assessment dataset and is confirmed to improve the quality of the data in terms of precision, accuracy, recall, F1 score, and propensity metric.
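To make the idea of imputing with Hamiltonian dynamics concrete, the following is a minimal sketch of plain (not folded) Hamiltonian Monte Carlo. It is not the F-HMC method of the paper, whose details are not given in the abstract. It assumes, purely for illustration, that each record follows a multivariate Gaussian with known mean and covariance, and it samples the missing entries conditional on the observed ones. The function name `hmc_impute` and all of its parameters are hypothetical.

```python
import numpy as np

def hmc_impute(x, missing, mean, cov, n_samples=500, step=0.1,
               n_leapfrog=10, seed=0):
    """Sample the missing entries of x given the observed ones.

    Assumption (not from the paper): x ~ N(mean, cov) with known parameters.
    This is standard HMC with a leapfrog integrator, not F-HMC.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    miss = np.asarray(missing, dtype=bool)
    prec = np.linalg.inv(np.asarray(cov, dtype=float))

    def fill(z):
        # Insert the current draw z into the missing positions of x.
        full = x.copy()
        full[miss] = z
        return full

    def potential(z):
        # Negative log density of the Gaussian, up to a constant.
        d = fill(z) - mean
        return 0.5 * d @ prec @ d

    def grad(z):
        # Gradient of the potential with respect to the missing entries.
        return (prec @ (fill(z) - mean))[miss]

    z = mean[miss].copy()  # start at the marginal mean
    samples = []
    for _ in range(n_samples):
        p = rng.standard_normal(z.size)  # resample momentum
        z_new, p_new = z.copy(), p.copy()

        # Leapfrog integration of the Hamiltonian dynamics.
        p_new -= 0.5 * step * grad(z_new)
        for i in range(n_leapfrog):
            z_new += step * p_new
            if i < n_leapfrog - 1:
                p_new -= step * grad(z_new)
        p_new -= 0.5 * step * grad(z_new)

        # Metropolis correction for the integration error.
        h_old = potential(z) + 0.5 * p @ p
        h_new = potential(z_new) + 0.5 * p_new @ p_new
        if np.log(rng.uniform()) < h_old - h_new:
            z = z_new
        samples.append(z.copy())
    return np.array(samples)
```

Under these assumptions, averaging the returned draws gives a point imputation, while the individual draws can be inserted into copies of the record to form augmented samples. The step size and number of leapfrog steps are illustrative defaults and would need tuning for real data.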


research
03/27/2020

MCFlow: Monte Carlo Flow Models for Data Imputation

We consider the topic of data imputation, a foundational task in machine...
research
03/06/2015

Hamiltonian ABC

Approximate Bayesian computation (ABC) is a powerful and elegant framewo...
research
02/09/2022

Missing Data Imputation and Acquisition with Deep Hierarchical Models and Hamiltonian Monte Carlo

Variational Autoencoders (VAEs) have recently been highly successful at ...
research
07/31/2021

Missingness Augmentation: A General Approach for Improving Generative Imputation Models

Despite tremendous progress in missing data imputation task, designing n...
research
05/17/2023

Risk Assessment of Lymph Node Metastases in Endometrial Cancer Patients: A Causal Approach

Assessing the pre-operative risk of lymph node metastases in endometrial...
research
01/19/2022

Bayesian Prediction with Covariates Subject to Detection Limits

Missing values in covariates due to censoring by signal interference or ...
