Nonparametric Feature Impact and Importance

06/08/2020
by Terence Parr, et al.

Practitioners use feature importance to rank and eliminate weak predictors during model development in an effort to simplify models and improve generality. Unfortunately, they also routinely conflate such feature importance measures with feature impact, the isolated effect of an explanatory variable on the response variable. This can lead to real-world consequences when importance is inappropriately interpreted as impact for business or medical insight purposes. The dominant approach for computing importances is through interrogation of a fitted model, which works well for feature selection, but gives distorted measures of feature impact. The same method applied to the same data set can yield different feature importances, depending on the model, leading us to conclude that impact should be computed directly from the data. While there are nonparametric feature selection algorithms, they typically provide feature rankings, rather than measures of impact or importance. They also typically focus on single-variable associations with the response. In this paper, we give mathematical definitions of feature impact and importance, derived from partial dependence curves, that operate directly on the data. To assess quality, we show that features ranked by these definitions are competitive with existing feature selection techniques using three real data sets for predictive tasks.
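The abstract's central idea is that feature impact can be measured from partial-dependence-style curves computed directly on the data, without first fitting a model. Below is a minimal illustrative sketch of that general idea (not the paper's actual definitions): it approximates a dependence curve for a feature by binning the feature and averaging the response within each bin, then scores impact as the curve's mean absolute deviation from its overall mean. The function names and the binning/deviation choices here are assumptions for illustration only.

```python
import numpy as np

def pd_curve_from_data(x, y, n_bins=10):
    """Approximate a partial-dependence-style curve for feature x directly
    from the data: average the response y within quantile bins of x.
    (Illustrative sketch only; the paper's definitions differ in detail.)"""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    centers, means = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (x >= lo) & (x <= hi)
        if mask.any():
            centers.append(x[mask].mean())
            means.append(y[mask].mean())
    return np.array(centers), np.array(means)

def impact(x, y, n_bins=10):
    """Impact proxy: mean absolute deviation of the curve from its mean.
    A flat curve (feature has no isolated effect) scores near zero."""
    _, curve = pd_curve_from_data(x, y, n_bins)
    return float(np.abs(curve - curve.mean()).mean())

# Synthetic check: y depends on x1 but not x2, so x1 should score higher.
rng = np.random.default_rng(0)
x1 = rng.uniform(0, 1, 5000)
x2 = rng.uniform(0, 1, 5000)
y = 3 * x1 + rng.normal(0, 0.1, 5000)
print(impact(x1, y) > impact(x2, y))
```

Because the curve is estimated from the data alone, the score does not change with the choice of fitted model, which is the property the abstract argues importance-from-model methods lack.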


research
06/11/2021

Feature Selection Tutorial with Python Examples

In Machine Learning, feature selection entails selecting a subset of the...
research
07/10/2022

FIB: A Method for Evaluation of Feature Impact Balance in Multi-Dimensional Data

Errors might not have the same consequences depending on the task at han...
research
05/11/2021

Comparing interpretability and explainability for feature selection

A common approach for feature selection is to examine the variable impor...
research
06/18/2020

Leveraging Model Inherent Variable Importance for Stable Online Feature Selection

Feature selection can be a crucial factor in obtaining robust and accura...
research
08/28/2023

Causality-Based Feature Importance Quantifying Methods: PN-FI, PS-FI and PNS-FI

In the current ML field, models are getting larger and more complex, data we ...
research
09/06/2021

Bringing a Ruler Into the Black Box: Uncovering Feature Impact from Individual Conditional Expectation Plots

As machine learning systems become more ubiquitous, methods for understa...
research
02/04/2022

The impact of feature importance methods on the interpretation of defect classifiers

Classifier specific (CS) and classifier agnostic (CA) feature importance...
