Minimal cost feature selection of data with normal distribution measurement errors

by Hong Zhao, et al.

Minimal cost feature selection aims to obtain a trade-off between test costs and misclassification costs. This issue has recently been addressed for nominal data. In this paper, we consider numerical data with measurement errors and study minimal cost feature selection in this setting. First, we build a data model with normally distributed measurement errors. Second, we construct the neighborhood of each data item from the confidence interval of its measurement error. Compared with discretized intervals, neighborhoods preserve more of the information in the data. Third, we define a new minimal total cost feature selection problem that considers the trade-off between test costs and misclassification costs. Fourth, we propose a backtracking algorithm with three effective pruning techniques to solve this problem. The algorithm is tested on four UCI data sets. Experimental results indicate that the pruning techniques are effective and that the algorithm is efficient for data sets with nearly one thousand objects.
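To make the steps in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It rests on several assumptions that the abstract does not state: errors are zero-mean normal with a known standard deviation per feature, two objects are neighbors when their values differ by at most twice the confidence-interval half-width on every selected feature, each neighborhood is assigned the label that minimizes its summed misclassification cost, and the total cost is the test cost plus the average misclassification cost. All function names are hypothetical, and only one simple pruning rule is shown rather than the three the paper proposes.

```python
from statistics import NormalDist


def half_width(sigma, confidence=0.95):
    """Half-width of the two-sided confidence interval for a N(0, sigma^2) error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * sigma


def neighborhood(data, i, features, widths):
    """Indices of rows that may coincide with row i on every selected feature.

    Assumption: two values are indistinguishable if they differ by at most
    twice the half-width (their confidence intervals overlap).
    """
    return [j for j, row in enumerate(data)
            if all(abs(row[a] - data[i][a]) <= 2 * widths[a] for a in features)]


def avg_misclassification_cost(data, labels, features, widths, mc):
    """Average cost when each neighborhood gets its cost-minimizing label."""
    classes = sorted(set(labels))
    total = 0.0
    for i in range(len(data)):
        nbrs = neighborhood(data, i, features, widths)
        pred = min(classes, key=lambda c: sum(mc[labels[j]][c] for j in nbrs))
        total += mc[labels[i]][pred]
    return total / len(data)


def min_cost_subset(data, labels, test_costs, widths, mc):
    """Backtracking search over feature subsets with one test-cost pruning rule."""
    best = {"cost": float("inf"), "features": ()}

    def backtrack(start, chosen, tc):
        # Prune: all costs are non-negative, so no superset can beat the best
        # total once the test cost alone has reached it.
        if tc >= best["cost"]:
            return
        cost = tc + avg_misclassification_cost(data, labels, chosen, widths, mc)
        if cost < best["cost"]:
            best["cost"], best["features"] = cost, tuple(chosen)
        for a in range(start, len(test_costs)):
            backtrack(a + 1, chosen + [a], tc + test_costs[a])

    backtrack(0, [], 0.0)
    return best


# Toy data (made up for illustration): feature 0 separates the classes,
# feature 1 does not once measurement error is taken into account.
data = [[1.0, 5.0], [1.1, 5.2], [3.0, 5.1], [3.1, 4.9]]
labels = [0, 0, 1, 1]
widths = [half_width(0.1), half_width(0.1)]
test_costs = [2.0, 1.0]
mc = {0: {0: 0.0, 1: 50.0}, 1: {0: 40.0, 1: 0.0}}  # mc[true][predicted]
result = min_cost_subset(data, labels, test_costs, widths, mc)
```

The pruning rule is valid only because test costs and misclassification costs are assumed non-negative; with that assumption, the search still returns an optimal subset while skipping branches whose test cost alone is already too high.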


Test-cost-sensitive attribute reduction of data with normal distribution measurement errors

The measurement error with normal distribution is universal in applicati...

Cost-Sensitive Feature Selection by Optimizing F-Measures

Feature selection is beneficial for improving the performance of general...

Cost-sensitive C4.5 with post-pruning and competition

Decision tree is an effective classification approach in data mining and...

Selecting Biomarkers for building optimal treatment selection rules using Kernel Machines

Optimal biomarker combinations for treatment-selection can be derived by...

Implications on Feature Detection when using the Benefit-Cost Ratio

In many practical machine learning applications, there are two objective...

Feature selection with test cost constraint

Feature selection is an important preprocessing step in machine learning...

Online Feature Selection for Efficient Learning in Networked Systems

Current AI/ML methods for data-driven engineering use models that are mo...