Missingness Augmentation: A General Approach for Improving Generative Imputation Models

07/31/2021
by Yufeng Wang, et al.

Despite tremendous progress on the missing data imputation task, designing new imputation models has become increasingly cumbersome, while the corresponding gains remain relatively small. Is there a simple but general approach that can exploit existing models to further improve imputation quality? In this article, we address this question and propose a novel, general data augmentation method called Missingness Augmentation (MA), which can be applied to many existing generative imputation frameworks to further improve their performance. In MA, before each training epoch, we use the outputs of the generator to expand the incomplete samples on the fly, and then define a dedicated reconstruction loss for these augmented samples. This reconstruction loss plus the original loss constitutes the final optimization objective of the model. Notably, MA is very efficient and does not require any change to the structure of the original model. Experimental results demonstrate that MA can significantly improve the performance of many recently developed generative imputation models on a variety of datasets. Our code is available at https://github.com/WYu-Feng/Missingness-Augmentation.
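The abstract gives only a high-level description of MA, so the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes a hypothetical linear `toy_generator` in place of a real generative imputer, a random re-masking step with an assumed `drop_rate`, a mean-squared-error reconstruction loss, and an assumed weight `alpha` (with `alpha=1.0` giving the plain sum of the two losses described above). None of these names or choices come from the paper.

```python
import numpy as np


def toy_generator(x_filled, mask, weights):
    # Hypothetical stand-in for a generative imputer:
    # one linear layer applied to the concatenated [values, mask].
    return np.concatenate([x_filled, mask], axis=1) @ weights


def augment(x, mask, weights, rng, drop_rate=0.2):
    # Assumed augmentation step: complete each incomplete sample with the
    # generator's output, then draw a fresh random mask over the result.
    x_in = np.where(mask == 1, x, 0.0)            # zero-fill missing entries
    g_out = toy_generator(x_in, mask, weights)
    x_completed = mask * x_in + (1 - mask) * g_out
    new_mask = (rng.random(x.shape) > drop_rate).astype(float)
    return x_completed, new_mask


def reconstruction_loss(x_completed, new_mask, weights):
    # Assumed form: mean squared error between the generator's output on the
    # re-masked sample and the completed sample it was derived from.
    g_out = toy_generator(x_completed * new_mask, new_mask, weights)
    return float(np.mean((g_out - x_completed) ** 2))


def total_loss(original_loss, x_completed, new_mask, weights, alpha=1.0):
    # Final objective: the model's original loss plus the reconstruction loss.
    # The weight alpha is an assumption; alpha=1.0 is the plain sum.
    return original_loss + alpha * reconstruction_loss(x_completed, new_mask, weights)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.array([[1.0, np.nan, 3.0], [np.nan, 2.0, 1.0]])
    mask = (~np.isnan(x)).astype(float)           # 1 = observed, 0 = missing
    weights = rng.normal(size=(6, 3))             # 3 value columns + 3 mask columns
    x_completed, new_mask = augment(x, mask, weights, rng)
    print(total_loss(0.5, x_completed, new_mask, weights))
```

In a real training loop, `augment` would run once before each epoch with the current generator, and `original_loss` would be whatever objective the underlying imputation model already uses, which is why the model's structure does not need to change.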

research
06/20/2019

Efficient data augmentation using graph imputation neural networks

Recently, data augmentation in the semi-supervised regime, where unlabel...
research
01/10/2022

Differentiable and Scalable Generative Adversarial Models for Data Imputation

Data imputation has been extensively explored to solve the missing data ...
research
04/10/2023

Missing Data Imputation with Graph Laplacian Pyramid Network

Data imputation is a prevalent and important task due to the ubiquitousn...
research
11/19/2020

Robustness to Missing Features using Hierarchical Clustering with Split Neural Networks

The problem of missing data has been persistent for a long time and pose...
research
09/15/2023

Modelling Irregularly Sampled Time Series Without Imputation

Modelling irregularly-sampled time series (ISTS) is challenging because ...
research
03/03/2021

A Hamiltonian Monte Carlo Model for Imputation and Augmentation of Healthcare Data

Missing values exist in nearly all clinical studies because data for a v...
research
09/13/2019

Flow Models for Arbitrary Conditional Likelihoods

Understanding the dependencies among features of a dataset is at the cor...
