The Berkelmans-Pries Feature Importance Method: A Generic Measure of Informativeness of Features

01/11/2023
by   Joris Pries, et al.

Over the past few years, machine learning models have emerged as a generic and powerful means of prediction. At the same time, there is a growing demand for interpretability of prediction models. To determine which features of a dataset are important for predicting a target variable Y, a Feature Importance (FI) method can be used. By quantifying how important each feature is for predicting Y, irrelevant features can be identified and removed, which could increase the speed and accuracy of a model. Moreover, important features can be discovered, which could lead to valuable insights. A major problem with evaluating FI methods is that the ground truth FI is often unknown. As a consequence, existing FI methods do not give the exact correct FI values. This is one of the many reasons why it can be hard to properly interpret the results of an FI method. Motivated by this, we introduce a new global approach named the Berkelmans-Pries FI method, which is based on a combination of Shapley values and the Berkelmans-Pries dependency function. We prove that our method has many useful properties, and that it accurately predicts the correct FI values for several cases where the ground truth FI can be derived in an exact manner. We show experimentally, for a large collection of 468 FI methods, that existing methods do not have the same useful properties. This shows that the Berkelmans-Pries FI method is a highly valuable tool for analyzing datasets with complex interdependencies.
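To make the combination of Shapley values and a dependency function concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes discrete features and an empirical, total-variation-style dependency of the form UD(X, Y) / UD(Y, Y), with UD(X, Y) = sum_x p(x) sum_y |p(y|x) - p(y)|. That formula is an assumption about the Berkelmans-Pries dependency function and is not stated in this abstract. The function names `dependency` and `shapley_importance` are hypothetical, and returning 0 for a constant target is a convenience choice of this sketch.

```python
from collections import Counter
from itertools import combinations
from math import factorial


def dependency(rows, target, subset):
    """Empirical dependency of `target` on the features indexed by `subset`.

    Assumed form (not taken from the abstract): UD(X, Y) / UD(Y, Y), where
    UD(X, Y) = sum_x p(x) * sum_y |p(y | x) - p(y)|.
    """
    n = len(rows)
    p_y = Counter(target)
    # Project every row onto the chosen feature subset.
    keys = [tuple(row[i] for i in subset) for row in rows]
    p_x = Counter(keys)
    p_xy = Counter(zip(keys, target))
    ud_xy = sum(
        (cx / n) * sum(abs(p_xy[(x, y)] / cx - cy / n) for y, cy in p_y.items())
        for x, cx in p_x.items()
    )
    # UD(Y, Y) simplifies to sum_y p(y) * 2 * (1 - p(y)).
    ud_yy = sum((cy / n) * 2 * (1 - cy / n) for cy in p_y.values())
    # A constant target has UD(Y, Y) = 0; this sketch returns 0 in that case.
    return ud_xy / ud_yy if ud_yy > 0 else 0.0


def shapley_importance(rows, target):
    """Exact Shapley value of each feature, using `dependency` as the value function."""
    n_feat = len(rows[0])
    phi = [0.0] * n_feat
    for i in range(n_feat):
        others = [j for j in range(n_feat) if j != i]
        for k in range(len(others) + 1):
            # Standard Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n_feat - k - 1) / factorial(n_feat)
            for s in combinations(others, k):
                gain = dependency(rows, target, s + (i,)) - dependency(rows, target, s)
                phi[i] += weight * gain
    return phi


# Toy dataset with a known answer: Y = X1 XOR X2 over all four input combinations.
xor_rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_target = [a ^ b for a, b in xor_rows]
print(shapley_importance(xor_rows, xor_target))
```

On the XOR toy data neither feature is informative on its own, but the two together determine Y, so under these assumptions the importance is split equally between them. The enumeration over all feature subsets is exponential in the number of features, so this sketch is only practical for small feature sets.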


