Information gain ratio correction: Improving prediction with more balanced decision tree splits

01/25/2018
by Antonin Leroux, et al.

Decision tree algorithms use a gain function to select the best split during tree induction. This function is crucial for obtaining trees with high predictive accuracy. Some gain functions are biased when they compare splits of different arities. Quinlan proposed the gain ratio in C4.5 to correct this bias in the information gain function. In this paper, we present an updated version of the gain ratio that performs better because it aims to correct the gain ratio's own bias toward unbalanced trees and toward some splits with low predictive interest.
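
To make the arity bias concrete, the following is a minimal sketch of the classical information gain and of Quinlan's gain ratio (information gain divided by the split information). It does not implement the corrected ratio proposed in the paper, which the abstract does not specify. The function names and the toy data are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(labels, attribute_values):
    """Entropy reduction obtained by partitioning the labels on an attribute."""
    n = len(labels)
    partitions = {}
    for value, label in zip(attribute_values, labels):
        partitions.setdefault(value, []).append(label)
    # Weighted average entropy of the child nodes produced by the split.
    remainder = sum(len(p) / n * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def gain_ratio(labels, attribute_values):
    """Classical C4.5 gain ratio: information gain over split information."""
    split_info = entropy(attribute_values)  # entropy of the partition sizes
    if split_info == 0:
        return 0.0  # the attribute takes a single value, so it cannot split
    return information_gain(labels, attribute_values) / split_info


# Toy data (hypothetical): a useful but imperfect binary attribute, and an
# identifier-like attribute that takes a distinct value for every example.
labels = ["yes", "yes", "no", "no", "yes", "yes", "no", "no"]
binary_attr = ["a", "a", "a", "b", "a", "a", "b", "b"]
id_attr = list(range(len(labels)))

for name, attr in [("binary", binary_attr), ("id", id_attr)]:
    print(name, information_gain(labels, attr), gain_ratio(labels, attr))
```

On this toy data, information gain prefers the identifier-like attribute because every child node is trivially pure, while the gain ratio prefers the binary attribute because the eight-way split is penalized by its larger split information.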


