Adaptive imputation of missing values for incomplete pattern classification

by Zhun-Ga Liu, et al.

In the classification of incomplete patterns, the missing values can either play a crucial role in determining the class or have little or no influence on the classification result, depending on the context. We propose a credal classification method for incomplete patterns with adaptive imputation of missing values, based on belief function theory. First, we try to classify the object (the incomplete pattern) using only the available attribute values. The underlying principle is that the missing information is not crucial for classification if a specific class for the object can be found using only the available information; in that case, the object is committed to this class. If, however, the object cannot be classified without ambiguity, the missing values play a major role in achieving an accurate classification. The missing values are then imputed using K-nearest neighbor (K-NN) and self-organizing map (SOM) techniques, and the edited pattern with the imputed values is classified. The (original or edited) pattern is classified with respect to each training class in turn, and the classification results, represented by basic belief assignments, are fused with appropriate combination rules to make the credal classification. The object is allowed to belong, with different masses of belief, to specific classes and to meta-classes (particular disjunctions of several single classes). The credal classification captures the uncertainty and imprecision of the classification well and, thanks to the introduction of meta-classes, effectively reduces the misclassification rate. The effectiveness of the proposed method relative to other classical methods is demonstrated through several experiments on artificial and real data sets.
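The adaptive idea in the abstract (classify on the available attributes first, and impute only when that result is ambiguous) can be illustrated with a minimal sketch. This is not the authors' method: it replaces belief functions, meta-classes and the SOM step with a plain nearest-class-center rule and K-NN mean imputation. The function names, the `ratio` ambiguity threshold and the choice of `k` are all assumptions made for illustration.

```python
import numpy as np

def nearest_two(x, centers, mask):
    """Return (index of closest center, its distance, second-smallest distance),
    measuring distance over the attributes selected by `mask` only."""
    d = np.sqrt(((centers[:, mask] - x[mask]) ** 2).sum(axis=1))
    order = np.argsort(d)
    return order[0], d[order[0]], d[order[1]]

def knn_impute(x, train_X, mask, k=2):
    """Fill the missing entries of x with the mean of its k nearest training
    rows, where nearness is measured on the observed attributes."""
    d = np.sqrt(((train_X[:, mask] - x[mask]) ** 2).sum(axis=1))
    nn = np.argsort(d)[:k]
    filled = x.copy()
    filled[~mask] = train_X[nn][:, ~mask].mean(axis=0)
    return filled

def adaptive_classify(x, train_X, train_y, ratio=0.5, k=2):
    """Classify x (NaN marks a missing value); impute only when ambiguous.

    Hypothetical ambiguity test: the result is accepted without imputation
    when the closest class center is at most `ratio` times as far away as
    the second closest. Returns (predicted class, whether x was imputed).
    """
    mask = ~np.isnan(x)
    classes = np.unique(train_y)
    centers = np.stack([train_X[train_y == c].mean(axis=0) for c in classes])
    best, d1, d2 = nearest_two(x, centers, mask)
    if d1 <= ratio * d2:
        return classes[best], False      # available values were sufficient
    filled = knn_impute(x, train_X, mask, k)
    best, _, _ = nearest_two(filled, centers, np.ones_like(mask))
    return classes[best], True           # classified after imputation
```

In this simplified form an ambiguous object is still forced into a single class after imputation. The credal approach described above would instead allow part of the belief mass to remain on a meta-class.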


Generative Imputation and Stochastic Prediction

In many machine learning applications, we are faced with incomplete data...

Using Association Rules for Better Treatment of Missing Values

The quality of training data for knowledge discovery in databases (KDD) ...

Numerical Data Imputation for Multimodal Data Sets: A Probabilistic Nearest-Neighbor Kernel Density Approach

Numerical data imputation algorithms replace missing values by estimates...

The Missing Indicator Method: From Low to High Dimensions

Missing data is common in applied data science, particularly for tabular...

Leachable Component Clustering

Clustering attempts to partition data instances into several distinctive...

Fast Imbalanced Classification of Healthcare Data with Missing Values

In medical domain, data features often contain missing values. This can ...

Uncovering the Missing Pattern: Unified Framework Towards Trajectory Imputation and Prediction

Trajectory prediction is a crucial undertaking in understanding entity m...
