Missing Data Estimation in High-Dimensional Datasets: A Swarm Intelligence-Deep Neural Network Approach

07/01/2016
by Collins Leke, et al.

In this paper, we examine the problem of missing data in high-dimensional datasets by taking into consideration the Missing Completely at Random and Missing at Random mechanisms, as well as the Arbitrary missing pattern. Additionally, this paper employs a methodology based on Deep Learning and Swarm Intelligence algorithms in order to provide reliable estimates for missing data. The deep learning technique is used to extract features from the input data via an unsupervised learning approach, modeling the data distribution based on the input. After a supervised fine-tuning phase, this deep learning technique is used as part of the objective function for the swarm intelligence technique, which estimates the missing data by minimizing an error function based on the interrelationship and correlation between features in the dataset. The methodology investigated in this paper therefore has longer running times; however, the promising potential outcomes justify the trade-off. Also, basic knowledge of statistics is presumed.
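
The idea of using a trained reconstruction model as the objective for a swarm search can be illustrated with a small sketch. Everything in it is an assumption made for illustration and not the paper's implementation: a rank-1 linear encoder/decoder stands in for the deep network, a plain particle swarm optimizer stands in for the swarm intelligence algorithm (the abstract does not name one), and the function names, the toy data, and the constants are hypothetical.

```python
import random

def fit_rank1_autoencoder(rows, iters=50):
    """Stand-in for the trained deep network (assumption): a rank-1 linear
    encoder/decoder fitted by power iteration on the data covariance."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    w = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]

    def reconstruct(x):
        code = sum((x[j] - mean[j]) * w[j] for j in range(d))  # encode
        return [mean[j] + code * w[j] for j in range(d)]       # decode
    return reconstruct

def reconstruction_error(reconstruct, record, missing_idx, candidate):
    """Fill the missing entries with the candidate values, then measure
    the squared error between the filled record and its reconstruction."""
    filled = list(record)
    for idx, value in zip(missing_idx, candidate):
        filled[idx] = value
    output = reconstruct(filled)
    return sum((a - b) ** 2 for a, b in zip(filled, output))

def pso_minimize(objective, dim, bounds, n_particles=20, iters=100, seed=0):
    """Plain particle swarm optimizer (a generic choice, not the paper's)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][k] = (0.7 * vel[i][k]
                             + 1.5 * r1 * (best_pos[i][k] - pos[i][k])
                             + 1.5 * r2 * (g_pos[k] - pos[i][k]))
                pos[i][k] += vel[i][k]
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val

# Toy complete data (hypothetical): three perfectly correlated features.
rows = [[t / 10, 2 * t / 10, 3 * t / 10] for t in range(11)]
reconstruct = fit_rank1_autoencoder(rows)

# A record whose second feature is missing; the swarm searches for its value.
record, missing_idx = [0.5, None, 1.5], [1]
estimate, error = pso_minimize(
    lambda cand: reconstruction_error(reconstruct, record, missing_idx, cand),
    dim=len(missing_idx), bounds=(0.0, 3.0))
```

In this toy setup the search is one-dimensional and the error surface is a simple quadratic, so the swarm settles quickly. With a deep network and many missing entries, each objective evaluation requires a forward pass, which is consistent with the longer running times noted above.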
