Cost-sensitive C4.5 with post-pruning and competition

11/17/2012
by Zilong Xu, et al.
NetEase, Inc

Decision trees are an effective classification approach in data mining and machine learning. In practical applications, both test costs and misclassification costs should be considered when inducing a decision tree. Several cost-sensitive learning algorithms based on ID3, such as CS-ID3, IDX, and λ-ID3, have recently been proposed to address this issue, but they handle only symbolic data. In this paper, we develop a decision tree algorithm inspired by C4.5 that handles numeric data. The algorithm has two main components. First, we define the test-cost-weighted information gain ratio as the heuristic: at each selection, the algorithm picks the attribute that offers a higher gain ratio at a lower test cost. Second, we design a post-pruning strategy that weighs the test costs of the generated decision tree against its misclassification costs, which reduces the total cost. Experimental results indicate that (1) our algorithm is stable and effective; (2) the post-pruning technique reduces the total cost significantly; and (3) the competition strategy is effective at obtaining a cost-sensitive decision tree with low cost.
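To make the first component concrete, the following is a minimal sketch of how a test-cost-weighted gain ratio might be used to choose between numeric attributes. The abstract does not state the weighting formula, so the form used here, gain ratio multiplied by `test_cost ** lam` with `lam <= 0`, is an assumption, as are all function names, the toy data, and the fixed thresholds. It is an illustration of the idea, not the authors' implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """C4.5-style gain ratio for a binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # degenerate split carries no information
    n = len(labels)
    p_left, p_right = len(left) / n, len(right) / n
    gain = entropy(labels) - p_left * entropy(left) - p_right * entropy(right)
    split_info = -(p_left * math.log2(p_left) + p_right * math.log2(p_right))
    return gain / split_info


def weighted_gain_ratio(ratio, test_cost, lam=-1.0):
    """ASSUMED weighting: ratio * test_cost ** lam, with lam <= 0 and
    test_cost > 0, so that costlier tests are penalised."""
    return ratio * (test_cost ** lam)


def best_attribute(candidates, labels, lam=-1.0):
    """Return the name of the attribute with the highest weighted gain ratio.

    `candidates` maps a hypothetical attribute name to a tuple
    (values, threshold, test_cost).
    """
    def score(name):
        values, threshold, cost = candidates[name]
        return weighted_gain_ratio(gain_ratio(values, labels, threshold), cost, lam)

    return max(candidates, key=score)


# Toy data: attribute "a" separates the classes perfectly but is expensive;
# attribute "b" is less informative but cheap.
labels = [0, 0, 1, 1]
candidates = {
    "a": ([1, 2, 3, 4], 2.5, 10.0),
    "b": ([1, 2, 2, 4], 3.0, 2.0),
}

cost_sensitive_choice = best_attribute(candidates, labels, lam=-1.0)
cost_blind_choice = best_attribute(candidates, labels, lam=0.0)
```

With `lam = 0` the cost term vanishes and the selection reduces to the plain gain ratio, so the informative attribute is chosen; a more negative `lam` shifts the choice toward cheaper tests.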
