Rule induction for global explanation of trained models

08/29/2018
by Madhumita Sushil, et al.

Understanding the behavior of a trained network and finding explanations for its outputs are important for improving the network's performance and generalization ability, and for ensuring trust in automated systems. Several approaches have previously been proposed to identify and visualize the most important features by analyzing a trained network. However, most of these approaches lose the relations between different features and classes. We propose a technique to induce sets of if-then-else rules that capture these relations to globally explain the predictions of a network. We first calculate the importance of the features in the trained network. We then weight the original inputs with these feature importance scores, simplify the transformed input space, and finally fit a rule induction model to explain the model predictions. We find that the output rule-sets can explain the predictions of a neural network trained for 4-class text classification on the 20 newsgroups dataset with a macro-averaged F-score of 0.80. We make the code available at https://github.com/clips/interpret_with_rules.
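To make the pipeline concrete, below is a minimal, hypothetical sketch of the steps the abstract describes. It is not the authors' implementation (that lives in the linked repository): the gradient-based attribution, the percentile cutoff used to simplify the input space, and the use of a shallow decision tree as a stand-in rule learner (the paper induces if-then-else rule sets; tree paths are one way to obtain such rules) are all illustrative assumptions, as are the names `gradient_importance` and `explain_with_rules`.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.tree import DecisionTreeClassifier, export_text

def gradient_importance(model, X):
    """One score per input feature: mean absolute gradient of the
    predicted-class score w.r.t. the inputs (a common attribution
    choice; the paper's own importance scoring may differ)."""
    X_t = torch.tensor(X, dtype=torch.float32, requires_grad=True)
    logits = model(X_t)
    logits.max(dim=1).values.sum().backward()
    return X_t.grad.abs().mean(dim=0).numpy()

def explain_with_rules(model, X, keep_top_pct=10, max_depth=4):
    """Weight inputs by importance, prune weak features, and fit a
    surrogate rule model to the network's own predictions."""
    imp = gradient_importance(model, X)
    X_weighted = X * imp                                    # weight inputs
    keep = imp >= np.percentile(imp, 100 - keep_top_pct)    # simplify space
    with torch.no_grad():
        y_net = model(torch.tensor(X, dtype=torch.float32)).argmax(1).numpy()
    # A shallow decision tree stands in for the paper's rule induction
    # step; each root-to-leaf path reads as an if-then-else rule.
    tree = DecisionTreeClassifier(max_depth=max_depth)
    tree.fit(X_weighted[:, keep], y_net)
    print(export_text(tree))
    return tree

# Toy usage with a random 4-class network and data (illustrative only):
if __name__ == "__main__":
    net = nn.Sequential(nn.Linear(50, 32), nn.ReLU(), nn.Linear(32, 4))
    X = np.random.rand(200, 50).astype(np.float32)
    explain_with_rules(net, X)
```

Because the surrogate is fit to the network's predictions rather than the gold labels, the reported F-score of 0.80 measures how faithfully the rules mimic the model, which is the relevant quantity for a global explanation.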
