Disentangled Attribution Curves for Interpreting Random Forests and Boosted Trees

05/18/2019
by Summer Devlin, et al.

Tree ensembles, such as random forests and AdaBoost, are ubiquitous machine learning models known for strong predictive performance across a wide variety of domains. However, this performance comes at the cost of interpretability: users cannot readily understand the relationships a trained random forest has learned or why it makes its predictions. In particular, it is challenging to understand how the contribution of a particular feature, or group of features, varies as its value changes. To address this, we introduce Disentangled Attribution Curves (DAC), a method that provides interpretations of tree ensemble methods in the form of (multivariate) feature importance curves. For a given variable, or group of variables, DAC plots the importance of the variable(s) as their value changes. We validate DAC on real data by showing that including the curves as an additional feature increases the accuracy of logistic regression while maintaining interpretability. In simulation studies, DAC outperforms competing methods in the recovery of conditional expectations. Finally, through a case study on the bike-sharing dataset, we demonstrate the use of DAC to uncover novel insights into a dataset.
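To make the notion of a curve over a feature's values concrete, the sketch below estimates an empirical conditional expectation E[y | x] by binning, which is the kind of quantity the simulation studies use as a recovery target. It is not the DAC algorithm, which the abstract does not specify; the function name `binned_conditional_mean`, the equal-width binning, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def binned_conditional_mean(x, y, n_bins=10):
    """Estimate E[y | x] on equal-width bins over x (illustrative helper, not DAC)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    # Digitizing against the interior edges yields bin indices 0..n_bins-1.
    idx = np.digitize(x, edges[1:-1])
    centers = 0.5 * (edges[:-1] + edges[1:])
    sums = np.bincount(idx, weights=y, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    # Empty bins are reported as NaN rather than dividing by zero.
    means = np.full(n_bins, np.nan)
    nonempty = counts > 0
    means[nonempty] = sums[nonempty] / counts[nonempty]
    return centers, means

# Synthetic data (assumed for illustration): y depends on x1 quadratically
# and on x2 linearly, plus noise.
rng = np.random.default_rng(0)
x1 = rng.uniform(-1, 1, size=5000)
x2 = rng.uniform(-1, 1, size=5000)
y = x1**2 + 0.5 * x2 + rng.normal(scale=0.1, size=5000)

# Curve for x1: roughly U-shaped, since x2 averages out within each bin.
centers, curve = binned_conditional_mean(x1, y, n_bins=10)
```

A curve of this kind, evaluated at each sample's feature value, could in principle be appended as an extra input column to a linear model, which is the spirit of the logistic regression experiment described above; how DAC itself computes its curves from the fitted ensemble is left to the full text.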

research
06/19/2017

Consistent feature attribution for tree ensembles

It is critical in many applications to understand what features are impo...
research
05/01/2023

Interpreting Deep Forest through Feature Contribution and MDI Feature Importance

Deep forest is a non-differentiable deep model which has achieved impres...
research
02/12/2018

Consistent Individualized Feature Attribution for Tree Ensembles

Interpreting predictions from tree ensemble methods such as gradient boo...
research
07/19/2021

Path Integrals for the Attribution of Model Uncertainties

Enabling interpretations of model uncertainties is of key importance in ...
research
02/15/2023

Unboxing Tree Ensembles for interpretability: a hierarchical visualization tool and a multivariate optimal re-built tree

The interpretability of models has become a crucial issue in Machine Lea...
research
05/30/2017

Optimization of Tree Ensembles

Tree ensemble models such as random forests and boosted trees are among ...
research
07/04/2023

Shapley Sets: Feature Attribution via Recursive Function Decomposition

Despite their ubiquitous use, Shapley value feature attributions can be ...
