Generative Imputation and Stochastic Prediction

05/22/2019
by Mohammad Kachuee, et al.

In many machine learning applications, we are faced with incomplete datasets. In the literature, missing data imputation techniques have mostly been concerned with filling in missing values. However, missing values imply uncertainty not only over the distribution of the missing values themselves but also over target class assignments, and both require careful consideration. The objectives of this paper are twofold. First, we propose a method for generating imputations from the conditional distribution of missing values given observed values. Second, we use the generated samples to estimate the distribution of target assignments given incomplete data. To generate imputations, we train a simple and effective generator network to produce imputations that a discriminator network is tasked with distinguishing. A predictor network is then trained on imputed samples from the generator network to capture the classification uncertainties and make predictions accordingly. The proposed method is evaluated on the CIFAR-10 image dataset as well as two real-world tabular classification datasets, under various missingness rates and structures. Our experimental results show the effectiveness of the proposed method in generating imputations, as well as in providing estimates of the class uncertainties in a classification task when faced with missing values.
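
The second objective, estimating the distribution of target assignments from multiple imputations, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `sample_imputations` is a hypothetical stand-in that fills missing entries with independent Gaussian draws (it is not the generator network described in the abstract), and the predictor is a plain softmax-linear model rather than a trained predictor network. The sketch only shows the general idea of averaging class probabilities over several completed copies of one incomplete sample.

```python
import numpy as np


def sample_imputations(x, mask, col_mean, col_std, k, rng):
    """Hypothetical stand-in sampler, NOT the paper's generator network.

    x:    (d,) feature vector; values at missing positions are ignored.
    mask: (d,) boolean array, True where the feature is observed.
    Returns a (k, d) array of completed samples; observed entries are kept.
    """
    draws = rng.normal(col_mean, col_std, size=(k, x.shape[0]))
    # Broadcasting: keep observed values, use the random draw elsewhere.
    return np.where(mask, x, draws)


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def predict_with_uncertainty(x, mask, weights, bias, col_mean, col_std,
                             k=100, seed=0):
    """Average class probabilities over k imputations of one sample.

    The predictor is an assumed softmax-linear model (weights: (d, c),
    bias: (c,)). Returns the per-class mean and standard deviation of
    the predicted probabilities across the k completed samples.
    """
    rng = np.random.default_rng(seed)
    completed = sample_imputations(x, mask, col_mean, col_std, k, rng)
    probs = softmax(completed @ weights + bias)  # shape (k, c)
    return probs.mean(axis=0), probs.std(axis=0)


# Usage: one sample with its second feature missing, two classes.
x = np.array([1.0, np.nan, 0.5])
mask = np.array([True, False, True])
weights = np.array([[0.5, -0.5], [2.0, -2.0], [0.1, 0.3]])
bias = np.zeros(2)
col_mean = np.zeros(3)
col_std = np.ones(3)

mean_p, std_p = predict_with_uncertainty(
    x, mask, weights, bias, col_mean, col_std, k=200, seed=0
)
```

In this sketch, `mean_p` serves as the class estimate, and `std_p` gives a rough indication of how strongly the prediction depends on the unobserved feature. With a learned conditional sampler in place of the Gaussian stand-in, the same averaging step would apply unchanged.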

research
02/20/2023

Transformed Distribution Matching for Missing Value Imputation

We study the problem of imputing missing values in a dataset, which has ...
research
06/16/2022

Classification of datasets with imputed missing values: does imputation quality matter?

Classifying samples in incomplete datasets is a common aim for machine l...
research
02/08/2016

Adaptive imputation of missing values for incomplete pattern classification

In classification of incomplete pattern, the missing values can either p...
research
06/03/2022

PROMISSING: Pruning Missing Values in Neural Networks

While data are the primary fuel for machine learning models, they often ...
research
12/04/2020

Machine learning with incomplete datasets using multi-objective optimization models

Machine learning techniques have been developed to learn from complete d...
research
07/03/2018

Recovering gaps in the gamma-ray logging method

The gamma-ray logging method is one of the mandatory well logging method...
research
03/30/2020

Imputation of missing sub-hourly precipitation data in a large sensor network: a machine learning approach

Precipitation data from rain gauges is fundamental across many lines of ...
