Missing Data Imputation and Acquisition with Deep Hierarchical Models and Hamiltonian Monte Carlo

02/09/2022
by   Ignacio Peis, et al.

Variational Autoencoders (VAEs) have recently been highly successful at imputing and acquiring heterogeneous missing data and identifying outliers. However, within this specific application domain, existing VAE methods are restricted by using only one layer of latent variables and strictly Gaussian posterior approximations. To address these limitations, we present HH-VAEM, a Hierarchical VAE model for mixed-type incomplete data that uses Hamiltonian Monte Carlo with automatic hyper-parameter tuning for improved approximate inference. Our experiments show that HH-VAEM outperforms existing baselines in the tasks of missing data imputation, supervised learning and outlier identification with missing features. Finally, we also present a sampling-based approach for efficiently computing the information gain when missing features are to be acquired with HH-VAEM. Our experiments show that this sampling-based approach is superior to alternatives based on Gaussian approximations.
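To make the Hamiltonian Monte Carlo ingredient mentioned in the abstract concrete, here is a minimal sketch of a basic HMC sampler on a toy one-dimensional Gaussian target. It is not the HH-VAEM implementation: the target density, the function names (`hmc_sample`, `toy_log_prob`, `toy_grad_log_prob`) and the fixed `step_size` and `n_leapfrog` values are all assumptions made for illustration, whereas the abstract describes automatic hyper-parameter tuning and a hierarchical latent space.

```python
import math
import random


def hmc_sample(log_prob, grad_log_prob, z0, n_samples=5000,
               step_size=0.2, n_leapfrog=10, seed=0):
    """Draw samples from a 1-D density with basic HMC.

    Assumption: step_size and n_leapfrog are fixed by hand here;
    they are not tuned automatically as in the paper.
    """
    rng = random.Random(seed)
    z = z0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)  # resample the auxiliary momentum
        z_new, p_new = z, p
        # Leapfrog integration: half momentum step, alternating
        # full position/momentum steps, final half momentum step.
        p_new += 0.5 * step_size * grad_log_prob(z_new)
        for i in range(n_leapfrog):
            z_new += step_size * p_new
            if i < n_leapfrog - 1:
                p_new += step_size * grad_log_prob(z_new)
        p_new += 0.5 * step_size * grad_log_prob(z_new)
        # Metropolis correction based on the change in the Hamiltonian.
        h_old = -log_prob(z) + 0.5 * p * p
        h_new = -log_prob(z_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            z = z_new
        samples.append(z)
    return samples


# Hypothetical toy target: an unnormalized Gaussian over a scalar latent z.
MU, SIGMA = 1.0, 0.5


def toy_log_prob(z):
    return -0.5 * ((z - MU) / SIGMA) ** 2


def toy_grad_log_prob(z):
    return -(z - MU) / SIGMA ** 2


samples = hmc_sample(toy_log_prob, toy_grad_log_prob, z0=0.0)
mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
```

On this toy target the sample mean and standard deviation should land close to `MU` and `SIGMA`. In a VAE setting the log-density would instead be the unnormalized posterior over the latent variables given the observed features, which is outside the scope of this sketch.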

research
05/13/2020

Multiple Imputation for Biomedical Data using Monte Carlo Dropout Autoencoders

Due to complex experimental settings, missing values are common in biome...
research
03/27/2020

MCFlow: Monte Carlo Flow Models for Data Imputation

We consider the topic of data imputation, a foundational task in machine...
research
08/17/2023

Conditional Sampling of Variational Autoencoders via Iterated Approximate Ancestral Sampling

Conditional sampling of variational autoencoders (VAEs) is needed in var...
research
06/17/2020

Analytical Probability Distributions and EM-Learning for Deep Generative Networks

Deep Generative Networks (DGNs) with probabilistic modeling of their out...
research
03/03/2021

A Hamiltonian Monte Carlo Model for Imputation and Augmentation of Healthcare Data

Missing values exist in nearly all clinical studies because data for a v...
research
11/07/2022

Monte Carlo Techniques for Addressing Large Errors and Missing Data in Simulation-based Inference

Upcoming astronomical surveys will observe billions of galaxies across c...
research
02/12/2016

A Minimalistic Approach to Sum-Product Network Learning for Real Applications

Sum-Product Networks (SPNs) are a class of expressive yet tractable hier...