Unbiased Measurement of Feature Importance in Tree-Based Methods

03/12/2019
by Zhengze Zhou, et al.

We propose a modification that corrects split-improvement variable importance measures in Random Forests and other tree-based methods. These measures have been shown to be biased toward inflating the importance of features with more potential split points. We show that by appropriately incorporating split-improvement as measured on out-of-sample data, this bias can be corrected, yielding better summaries and screening tools.
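To make the idea concrete, here is a minimal sketch (not the authors' exact estimator) of re-evaluating split-improvement on held-out data with scikit-learn. It uses a single regression tree and a plain train/validation split for clarity; the helper name `oos_split_importance` and the toy data are illustrative assumptions. In the toy setup, feature 1 is pure noise but continuous, so it offers many candidate split points: in-sample MDI awards it spurious importance, while the held-out version drives it toward, or below, zero.

```python
# Sketch: split-improvement importance re-evaluated on held-out data.
# Assumptions: squared-error impurity, a single scikit-learn regression
# tree, and a simple train/validation split (illustrative, not the
# paper's exact construction).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def oos_split_importance(tree, X_val, y_val):
    """Sum, per feature, the decrease in squared-error impurity that
    each split achieves when re-evaluated on held-out data."""
    t = tree.tree_
    importance = np.zeros(tree.n_features_in_)
    # Boolean indicator: which validation samples pass through which node.
    reaches = tree.decision_path(X_val).toarray().astype(bool)
    for node in range(t.node_count):
        left, right = t.children_left[node], t.children_right[node]
        if left == -1:  # leaf: no split to evaluate
            continue
        y_node = y_val[reaches[:, node]]
        y_left = y_val[reaches[:, left]]
        y_right = y_val[reaches[:, right]]
        if y_left.size == 0 or y_right.size == 0:
            continue
        # n * variance = total squared error around the node mean, so this
        # is the weighted impurity decrease of the split on held-out data.
        gain = (y_node.size * y_node.var()
                - y_left.size * y_left.var()
                - y_right.size * y_right.var())
        importance[t.feature[node]] += gain
    return importance  # negative totals flag splits that fail to generalize

# Toy data: feature 0 is an informative binary variable; feature 1 is
# pure noise but continuous, hence has many more potential splits.
rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([rng.integers(0, 2, n).astype(float), rng.random(n)])
y = X[:, 0] + 0.5 * rng.standard_normal(n)

X_tr, y_tr = X[: n // 2], y[: n // 2]
X_val, y_val = X[n // 2 :], y[n // 2 :]

tree = DecisionTreeRegressor(max_depth=6, random_state=0).fit(X_tr, y_tr)
print("in-sample MDI:", tree.feature_importances_)
print("held-out split-improvement:", oos_split_importance(tree, X_val, y_val))
```

The held-out scores are left unnormalized so that negative totals remain visible: a feature whose splits increase validation error is exactly the overfitting artifact the correction is meant to expose.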
