Comparing Explanation Methods for Traditional Machine Learning Models Part 1: An Overview of Current Methods and Quantifying Their Disagreement

11/16/2022
by Montgomery Flora, et al.

With increasing interest in explaining machine learning (ML) models, the first part of this two-part study synthesizes recent research on methods for explaining global and local aspects of ML models. This study distinguishes explainability from interpretability, local from global explainability, and feature importance from feature relevance. We demonstrate and visualize different explanation methods, show how to interpret them, and provide a complete Python package (scikit-explain) so that future researchers can explore these products. We also highlight the frequent disagreement between explanation methods for feature rankings and feature effects and provide practical advice for dealing with these disagreements. We used ML models developed for severe weather prediction and sub-freezing road surface temperature prediction to generalize the behavior of the different explanation methods. For feature rankings, there is substantially more agreement on the set of top features (e.g., on average, two methods agree on 6 of the top 10 features) than on specific rankings (on average, two methods agree on the ranks of only 2-3 features in the set of top 10 features). On the other hand, two feature effect curves from different methods are in high agreement as long as the phase space is well sampled. Finally, a lesser-known method, tree interpreter, was found to be comparable to SHAP for feature effects. Given the widespread use of random forests in the geosciences and the computational ease of tree interpreter, we recommend that it be explored in future research.
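
The two kinds of ranking agreement quoted in the abstract (overlap in the set of top features versus agreement on specific ranks) can be illustrated with a minimal sketch. The function names and the feature names below are hypothetical and are not part of scikit-explain, and the paper's exact metric definitions may differ from these simple counts.

```python
from typing import Sequence


def top_k_feature_agreement(rank_a: Sequence[str], rank_b: Sequence[str], k: int = 10) -> int:
    """Count features that appear in the top-k of both rankings, ignoring order."""
    return len(set(rank_a[:k]) & set(rank_b[:k]))


def top_k_rank_agreement(rank_a: Sequence[str], rank_b: Sequence[str], k: int = 10) -> int:
    """Count top-k positions at which both rankings list the same feature."""
    return sum(a == b for a, b in zip(rank_a[:k], rank_b[:k]))


# Hypothetical rankings from two explanation methods (most important first).
method_a = ["f1", "f2", "f3", "f4", "f5"]
method_b = ["f1", "f3", "f2", "f6", "f5"]

feature_overlap = top_k_feature_agreement(method_a, method_b, k=5)
rank_overlap = top_k_rank_agreement(method_a, method_b, k=5)
print(feature_overlap, rank_overlap)  # prints "4 2"
```

In this toy case the two methods share 4 of their top 5 features but agree on the exact rank of only 2, which mirrors the pattern the abstract reports: set-level agreement is typically higher than rank-level agreement.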


