Machine learning with incomplete datasets using multi-objective optimization models

12/04/2020
by   Hadi A. Khorshidi, et al.
0

Machine learning techniques have been developed to learn from complete data. When missing values exist in a dataset, the incomplete data should be preprocessed separately by removing data points with missing values or imputation. In this paper, we propose an online approach to handle missing values while a classification model is learnt. To reach this goal, we develop a multi-objective optimization model with two objective functions for imputation and model selection. We also propose three formulations for imputation objective function. We use an evolutionary algorithm based on NSGA II to find the optimal solutions as the Pareto solutions. We investigate the reliability and robustness of the proposed model using experiments by defining several scenarios in dealing with missing values and classification. We also describe how the proposed model can contribute to medical informatics. We compare the performance of three different formulations via experimental results. The proposed model results get validated by comparing with a comparable literature.

READ FULL TEXT
research
06/03/2022

PROMISSING: Pruning Missing Values in Neural Networks

While data are the primary fuel for machine learning models, they often ...
research
04/18/2021

Multi-objective Feature Selection with Missing Data in Classification

Feature selection (FS) is an important research topic in machine learnin...
research
02/20/2023

Transformed Distribution Matching for Missing Value Imputation

We study the problem of imputing missing values in a dataset, which has ...
research
04/23/2023

Missing Values and Imputation in Healthcare Data: Can Interpretable Machine Learning Help?

Missing values are a fundamental problem in data science. Many datasets ...
research
09/21/2021

Fairness without Imputation: A Decision Tree Approach for Fair Prediction with Missing Values

We investigate the fairness concerns of training a machine learning mode...
research
05/22/2019

Generative Imputation and Stochastic Prediction

In many machine learning applications, we are faced with incomplete data...
research
01/13/2022

Multi-task longitudinal forecasting with missing values on Alzheimer's Disease

Machine learning techniques typically applied to dementia forecasting la...

Please sign up or login with your details

Forgot password? Click here to reset