Handling Incomplete Heterogeneous Data using VAEs

07/10/2018
by Alfredo Nazabal, et al.

Variational autoencoders (VAEs), as well as other generative models, have been shown to be efficient and accurate at capturing the latent structure of vast amounts of complex high-dimensional data. However, existing VAEs still cannot directly handle data that are heterogeneous (mixed continuous and discrete) or incomplete (with data missing at random), both of which are common in real-world applications. In this paper, we propose a general framework for designing VAEs suitable for fitting incomplete heterogeneous data. The proposed HI-VAE includes likelihood models for real-valued, positive real-valued, interval, categorical, ordinal and count data, and allows missing data to be estimated (and potentially imputed) accurately. Furthermore, HI-VAE presents competitive predictive performance in supervised tasks, outperforming supervised models when trained on incomplete data.
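To make the idea of per-type likelihoods with missing entries concrete, here is a minimal sketch. It is not the authors' implementation: the function names, the choice of Gaussian, Poisson and categorical likelihoods for three of the listed data types, and the convention of marking a missing value with `None` are all assumptions made for illustration. It only shows how a reconstruction term could sum log-likelihoods over the observed attributes of one record while skipping the missing ones.

```python
import math

# Hypothetical per-type log-likelihoods (illustrative choices, not taken from the paper).
def gaussian_logpdf(x, mean, var):
    """Log-density of a real-valued attribute under a Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def poisson_logpmf(k, rate):
    """Log-probability of a count attribute under a Poisson."""
    return k * math.log(rate) - rate - math.lgamma(k + 1.0)

def categorical_logpmf(k, probs):
    """Log-probability of category index k given class probabilities."""
    return math.log(probs[k])

LOG_LIKELIHOODS = {
    "real": gaussian_logpdf,
    "count": poisson_logpmf,
    "categorical": categorical_logpmf,
}

def masked_log_likelihood(row, types, params):
    """Sum log-likelihoods over observed attributes only.

    A value of None marks a missing entry (an assumed convention);
    missing entries contribute nothing to the total.
    """
    total = 0.0
    for value, kind, p in zip(row, types, params):
        if value is None:
            continue  # skip missing attributes
        total += LOG_LIKELIHOODS[kind](value, *p)
    return total

# One record with a real-valued, a count (missing) and a categorical attribute.
row = [1.2, None, 2]
types = ["real", "count", "categorical"]
# In a VAE these parameters would come from a decoder; here they are fixed by hand.
params = [(1.0, 0.5), (3.0,), ([0.2, 0.3, 0.5],)]
print(masked_log_likelihood(row, types, params))
```

In a full model the parameters would be produced by a decoder network from a latent code, and the same decoder could be used to fill in the skipped attributes; that part is left out of this sketch.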

research
04/20/2022

A Variational Autoencoder for Heterogeneous Temporal and Longitudinal Data

The variational autoencoder (VAE) is a popular deep latent variable mode...
research
06/21/2020

VAEM: a Deep Generative Model for Heterogeneous Mixed Type Data

Deep generative models often perform poorly in real-world applications d...
research
01/16/2022

Reconstruction of Incomplete Wildfire Data using Deep Generative Models

We present our submission to the Extreme Value Analysis 2021 Data Challe...
research
08/27/2021

Multimodal Data Fusion in High-Dimensional Heterogeneous Datasets via Generative Models

The commonly used latent space embedding techniques, such as Principal C...
research
03/12/2021

Medical data wrangling with sequential variational autoencoders

Medical data sets are usually corrupted by noise and missing data. These...
research
11/16/2019

Inverse Reinforcement Learning with Missing Data

We consider the problem of recovering an expert's reward function with i...
research
05/28/2023

Learning to Jump: Thinning and Thickening Latent Counts for Generative Modeling

Learning to denoise has emerged as a prominent paradigm to design state-...
