Metafeatures-based Rule-Extraction for Classifiers on Behavioral and Textual Data

03/10/2020
by Yanou Ramon, et al.

Machine learning models built on behavioral and textual data can be highly accurate, but they are often very difficult to interpret. Linear models require investigating thousands of coefficients, while the opaqueness of nonlinear models makes things even worse. Rule-extraction techniques have been proposed to combine the desired predictive behaviour of complex "black-box" models with explainability. However, rule-extraction in the context of ultra-high-dimensional, sparse data is challenging and has thus far received scant attention. Because of the sparsity and massive dimensionality, rule-extraction techniques may fail in their primary explainability goal: the black-box model may need to be replaced by many rules, leaving the user once again with an incomprehensible model. To address this problem, we develop and test a rule-extraction methodology based on higher-level, less-sparse "metafeatures". We empirically validate the quality of the extracted rules in terms of fidelity, explanation stability, and accuracy over a collection of data sets, and benchmark their performance against rules extracted using the original features. Our analysis points to key trade-offs between explainability, fidelity, accuracy, and stability that machine learning researchers and practitioners need to consider. Results indicate that the proposed metafeatures approach achieves better trade-offs between these criteria and is better able to mimic the black-box model: the loss in fidelity decreases by 18.08% on average when using metafeatures instead of the original fine-grained features, with corresponding statistically significant improvements (at the 5% level) in accuracy and stability. We also introduce a key "cost of explainability", which we define as the loss in fidelity when replacing a black-box with an explainable model.
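The pipeline the abstract describes can be sketched as follows: train a black-box on sparse, high-dimensional data, construct less-sparse metafeatures, fit a small rule-based surrogate on those metafeatures to mimic the black-box's predictions, and measure fidelity as the agreement between the two. This is a minimal illustrative sketch, not the paper's exact procedure; the use of NMF components as metafeatures, the synthetic data, and all parameter choices are assumptions for demonstration only.

```python
# Illustrative sketch of metafeature-based rule extraction via surrogate
# modeling. The NMF-based metafeature construction and all hyperparameters
# are assumptions, not the authors' exact method.
import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.decomposition import NMF
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)

# Sparse, high-dimensional "behavioral" data: 500 instances x 2000 features.
X = sparse_random(500, 2000, density=0.01, random_state=rng, format="csr")
y = rng.randint(0, 2, size=500)

# 1) Train the black-box on the original fine-grained features.
black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
y_bb = black_box.predict(X)  # black-box predictions serve as surrogate labels

# 2) Build higher-level, less-sparse metafeatures (here: NMF components).
meta = NMF(n_components=10, init="random", random_state=0, max_iter=500)
X_meta = meta.fit_transform(X)

# 3) Fit a small rule-based surrogate (a shallow decision tree) on the
#    metafeatures, targeting the black-box's predictions, not the labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_meta, y_bb)

# 4) Fidelity: how often the explainable surrogate agrees with the black-box.
fidelity = np.mean(surrogate.predict(X_meta) == y_bb)
print(f"fidelity: {fidelity:.2f}")
```

A shallow tree yields few, readable rules; the trade-off the paper quantifies is how much fidelity (and downstream accuracy and stability) such a compact surrogate sacrifices relative to the black-box.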

Related research

- Regularizing Black-box Models for Improved Interpretability (02/18/2019)
- Regularizing Black-box Models for Improved Interpretability (HILL 2019 Version) (05/31/2019)
- Fairwashing: the risk of rationalization (01/28/2019)
- Am I Building a White Box Agent or Interpreting a Black Box Agent? (07/02/2020)
- Evolution of Transparent Explainable Rule-sets (04/21/2022)
- Entropic Variable Boosting for Explainability and Interpretability in Machine Learning (10/18/2018)
- Lifting Interpretability-Performance Trade-off via Automated Feature Engineering (02/11/2020)
