In recent years, fast-growing computing power, enormous volumes of consumer and commercial data, and emerging advanced machine learning algorithms have jointly stimulated the prosperity of AI, which has gone from a science-fiction dream to a critical part of our daily lives. Compared to traditional machine learning methods, deep learning has achieved superior performance in perception tasks such as object detection and classification. However, because of their nested non-linear structure, deep learning models usually remain black boxes that are particularly weak at explaining their reasoning process and prediction results. In many real-world mission-critical applications, transparency of deep learning models and explainability of the model outputs are essential for real deployment.
Explainable AI is not only a gateway between AI and society but also a powerful tool for detecting flaws in the model and biases in the data. The development of techniques for explainability and transparency of deep learning models has recently received much attention in the research community. The relevant research roughly falls into two categories: global explainability and local explainability. Global explainability aims at making the reasoning process wholly transparent and comprehensible, while local explainability focuses on extracting input regions that are highly sensitive to the network output in order to explain each individual decision.
An effective way to achieve explainability is to use a light-weight function family to create interpretable models. Local interpretable model-agnostic explanations (LIME) identify an interpretable model over a human-interpretable representation that is locally faithful to the original model. LIME adopts linear regression as its interpretable function, representing the prediction as a linear combination of a few selected features to make the prediction process transparent. However, being so restricted that they usually over-simplify the relationships, linear regression models fail in situations where non-linear associations and interactions exist between the features and the prediction results.
In this paper, we propose a Decision Tree-based Local Interpretable Model-agnostic Explanation (TLIME). The decision tree structure creates good explanations because the data ends up in distinct groups that are often easy to understand. Moreover, the tree structure can capture interactions between features in the data. We perform various experiments explaining two black-box models, a random-forest classifier and Google's pre-trained Inception neural network. The results show that decision tree explanations achieve more reliable performance than the original LIME in terms of understandability, fidelity, and efficiency.
2 Interpretable models
Using a subset of algorithms from a light-weight function family to create interpretable models is an effective way to achieve interpretability. In this section, we analyze two representative interpretable models: the linear regression model and the decision tree model. Table 1 shows the properties of the two interpretable models. Linear regression displays the prediction as a linear combination of features, while the decision tree represents the reasoning process in a hierarchical structure, which is suitable for capturing non-linear associations between features and predictions. The monotonicity constraint shown in both models is necessary to ensure consistency between a feature and the target outcome. Moreover, the decision tree model can automatically capture the diverse interactions between features when predicting the target outcome, and it is applicable to both classification and regression tasks.
| Model | Linear | Monotone | Interaction | Task |
|---|---|---|---|---|
| Linear regression | Yes | Yes | No | Regression |
| Decision trees | No | Some | Yes | Classification, Regression |
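To illustrate the interaction property in Table 1, the following sketch (our own illustration, not from the paper, using scikit-learn and synthetic data) fits a linear model and a shallow tree to a pure XOR-style interaction, which a linear combination of features cannot represent:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Synthetic data with a pure feature interaction: y = x0 XOR x1.
# A linear model cannot represent it; a depth-2 tree can.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(400, 2)).astype(float)
y = np.logical_xor(X[:, 0], X[:, 1]).astype(float)

linear = LinearRegression().fit(X, y)
tree = DecisionTreeRegressor(max_depth=2).fit(X, y)

print(linear.score(X, y))  # near 0: the interaction is invisible to a linear fit
print(tree.score(X, y))    # 1.0: the tree splits on x0, then on x1
```

The tree recovers the interaction with only two levels of splits, which is the behavior TLIME exploits.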
Depending on the splitting criterion, various algorithms are capable of constructing a decision tree. CART is the most popular algorithm and can handle both classification and regression tasks. In this paper, we mainly construct regression decision trees to explain the prediction probability of an image classifier. Figure 1 illustrates a simple regression tree that explains an image classification prediction made by Google's Inception neural network. The highlighted superpixels give intuition as to why the model would choose the predicted top class. The decision tree shows that if the three selected superpixel features exist, then the prediction probability equals the mean value of the instances in that node. Moreover, the importance of each of the three features shows its contribution to reducing the variance.
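A regression tree of this kind can be sketched with scikit-learn's CART implementation. The data below is a hypothetical stand-in: each row is a binary vector saying which of five superpixels are present, and `y` plays the role of the black-box class probability for that perturbed image:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

# Hypothetical superpixel data: rows are presence/absence vectors,
# y is a made-up class probability returned by the black box.
rng = np.random.default_rng(1)
Z = rng.integers(0, 2, size=(200, 5)).astype(float)
y = 0.6 * Z[:, 0] + 0.3 * Z[:, 2] * Z[:, 4] + 0.05 * rng.random(200)

tree = DecisionTreeRegressor(max_depth=3).fit(Z, y)  # CART regression tree
print(export_text(tree, feature_names=[f"sp{i}" for i in range(5)]))
print(tree.feature_importances_)  # normalized variance reduction per feature
```

`feature_importances_` is exactly the variance-reduction measure described above: it sums to one, and the dominant superpixel receives the largest share.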
3 The TLIME Approach
3.1 Characteristics of TLIME
Although the amount of research in explainable AI is growing actively, there is no universal consensus on the exact definition of interpretability or its measurement criteria. Ruping first noted that interpretability comprises three goals: accuracy, understandability, and efficiency. We argue that fidelity is a better term than accuracy, since accuracy is easily confused with the performance evaluation criteria of the original black-box model. These three goals are inextricably intertwined and compete with each other, as shown in Figure 2. An explainable model with good interpretability should be faithful to the data and the original model, understandable to the observer, and graspable in a short time, so that end-users can make wise decisions.
TLIME has many appealing characteristics: it is interpretable, locally faithful, and model-agnostic. It provides a qualitative understanding of features and predictions. It is challenging, if not impossible, to be utterly faithful to the black-box model on a global scale; TLIME takes a feasible approach by approximating the model in the vicinity of the instance being predicted. Besides, as a model-agnostic interpretation, TLIME shows excellent flexibility and can explain any underlying machine learning model.
3.2 Explanation System of TLIME
Considering the poor interpretability and high computational complexity of a pixel-based image representation, we adopt a superpixel-based explanation system. Each superpixel, as the basic processing unit, is a group of connected pixels with similar colors or gray levels. Figure 3 shows the pixel-based image, the superpixel image, and the superpixel-based explanation. The interpretable representation of an image $x \in \mathbb{R}^d$ consisting of $d$ pixels and $d'$ superpixels is a binary vector $x' \in \{0,1\}^{d'}$, where $1$ indicates the presence of the original superpixel and $0$ indicates its absence.
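As a minimal sketch of this representation (our own numpy illustration with an assumed 3×3 image and a hypothetical segment map), a binary vector over superpixels maps back to a perturbed image by greying out the absent superpixels:

```python
import numpy as np

# `segments` assigns each pixel to one of 4 superpixels (assumed layout);
# `z_prime` marks which superpixels are kept (superpixel 1 switched off).
segments = np.array([[0, 0, 1],
                     [2, 2, 1],
                     [2, 3, 3]])
image = np.arange(9, dtype=float).reshape(3, 3)
z_prime = np.array([1, 0, 1, 1])

# Recover the perturbed image: absent superpixels become the mean gray level.
mask = z_prime[segments].astype(bool)
z = np.where(mask, image, image.mean())
print(z)  # pixels of superpixel 1 are replaced by the image mean (4.0)
```

In practice the segment map would come from a standard segmentation method rather than being written by hand.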
We denote the original image classification model being explained as $f$, the interpretable decision tree model as $g$, and the locality-fidelity loss as $\mathcal{L}(f, g, \pi_x)$, which is calculated by the locally weighted square loss:

$$\mathcal{L}(f, g, \pi_x) = \sum_{z, z' \in \mathcal{Z}} \pi_x(z)\,\big(f(z) - g(z')\big)^2 \quad (1)$$
The database $\mathcal{Z}$ is composed of perturbed samples, which are sampled around $x'$ by drawing nonzero elements of $x'$ at random. Given a perturbed sample $z' \in \{0,1\}^{d'}$, we recover the sample in the original representation $z \in \mathbb{R}^d$ and obtain $f(z)$. Moreover, $\pi_x(z) = \exp\!\big(-D(x, z)^2 / \sigma^2\big)$, where the distance function $D$ is the $L_2$ distance between images and $\sigma$ is used to capture locality.
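The locality kernel above can be sketched directly (our own illustration; the kernel width and the binary perturbation vectors are assumed, not taken from the paper):

```python
import numpy as np

# Exponential locality kernel pi_x(z) = exp(-D(x, z)^2 / sigma^2),
# with D the L2 distance (assumed form, following the text).
def pi_x(x, z, sigma=1.0):
    d = np.linalg.norm(x - z)
    return np.exp(-(d ** 2) / sigma ** 2)

x = np.ones(10)                        # original instance: all superpixels on
rng = np.random.default_rng(2)
Z = rng.integers(0, 2, size=(5, 10))   # perturbed samples around x
weights = [pi_x(x, z, sigma=np.sqrt(10)) for z in Z]
print(weights)  # samples closer to x receive weights nearer 1
```

These weights are exactly the $\pi_x(z)$ factors that enter the locally weighted square loss.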
We denote the decision tree explanation produced by TLIME as:

$$\xi(x) = \operatorname*{argmin}_{g \in G,\; \mathrm{depth}(g) \le d_{\max}} \mathcal{L}(f, g, \pi_x) \quad (2)$$

The depth of the decision tree, $\mathrm{depth}(g)$, is a measure of model complexity; a smaller depth indicates stronger understandability of the model $g$. In order to ensure both local fidelity and understandability, formula (2) minimizes the locality-fidelity loss $\mathcal{L}$ while holding $\mathrm{depth}(g)$ low enough. Algorithm 1 shows a simplified workflow of TLIME. First, TLIME obtains the superpixel image by using a standard segmentation method. Then the database $\mathcal{Z}$ is constructed by running multiple iterations of the perturbed sampling operation. Finally, within the allowable range of prediction error, TLIME finds the minimum-depth decision tree using the CART method.
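The final step, finding the minimum-depth tree within an error budget, can be sketched as follows. This is our own reading of Algorithm 1, not the authors' code: `preds` stands in for the black-box outputs $f(z)$, `weights` for the kernel values $\pi_x(z)$, and the tolerance `tol` is an assumed parameter:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_tlime_tree(Z, preds, weights, tol=1e-3, max_depth=10):
    """Grow CART trees of increasing depth; return the shallowest one
    whose weighted squared error stays within the tolerance."""
    for depth in range(1, max_depth + 1):
        tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
        tree.fit(Z, preds, sample_weight=weights)
        err = np.average((tree.predict(Z) - preds) ** 2, weights=weights)
        if err <= tol:
            return tree, depth
    return tree, max_depth

# Toy run: black-box probabilities depend on superpixels 1 and 3 only.
rng = np.random.default_rng(3)
Z = rng.integers(0, 2, size=(300, 6)).astype(float)
preds = 0.7 * Z[:, 1] + 0.2 * Z[:, 3]                # stand-in for f(z)
weights = np.exp(-((6 - Z.sum(axis=1)) ** 2) / 6.0)  # locality kernel values
tree, depth = fit_tlime_tree(Z, preds, weights)
print(depth)  # smallest depth meeting the tolerance
```

Growing the depth one level at a time guarantees that the returned tree is the least complex explanation that is still locally faithful.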
4 Experimental Results
In this section, TLIME and LIME explain the predictions of a RandomForest classifier and Google's pre-trained Inception neural network. We compare the experimental results of the two algorithms in terms of understandability, fidelity, and efficiency.
4.1 RandomForest Classifier on the MNIST database
The MNIST database is one of the most common databases used for image classification. It consists of small grayscale images of handwritten digits. In this experiment, the image data is split into a training set and a test set. Table 2 shows the performance of the random forest classifier. For the chosen instance, the predicted top class has probability p = 1.0. Figure 5 shows the decision tree explanation produced by TLIME.
Compared with LIME, which can only provide a one-shot explanation, the decision tree structure of TLIME provides a more intuitive explanation. Figure 5 shows that when the two selected features exist, the prediction probability equals the mean value of the instances in that node. Moreover, the tree structure can capture the interaction between features in the data, and the feature importance values tell us which feature makes the more significant contribution to predicting the outcome. The prediction error is calculated to measure local fidelity; the prediction error of TLIME is lower than that of LIME, showing better fidelity.
Efficiency is highly related to the time a user needs to grasp the explanation. The runtime of TLIME is shorter than that of LIME. Note that the runtime does not include the perturbed sampling operation, which takes the same time for LIME and TLIME.
| Inception prob | pred prob | pred error |
4.2 Google’s Inception neural network on the ImageNet database
We explain the prediction of Google’s pre-trained Inception neural network on the image shown in Figure 4a, for which the top predicted classes are listed. Figures 4b, 4c, and 4d show the superpixel explanations for the top predicted classes. The explanation provides reasonable insight into what the neural network picks up on for each of the classes, which enhances trust in the classifier. Moreover, Table 3 lists the prediction errors of TLIME and LIME, and Table 4 lists their runtimes. We can conclude from these results that, in less time, TLIME not only achieves better understandability but also higher fidelity than LIME.
5 Conclusion
We propose a decision tree-based local interpretable model-agnostic explanation (TLIME) for improving explainable AI. The goal of TLIME is to construct an interpretable decision tree model over the interpretable representation that is locally faithful to the original classifier. We compare TLIME and LIME in explaining the predictions of a RandomForest classifier and Google's pre-trained Inception neural network. Experimental results have shown that TLIME exhibits better understandability and higher fidelity than LIME while using less processing time, covering the ingredients of an ideal explainable AI model: understandability, fidelity, and efficiency.
References
- (2015) Understanding deep features with computer-generated imagery. In Proc. ICCV 2015, pp. 2875–2883.
- (2017) Interpretable explanations of black boxes by meaningful perturbation. In Proc. ICCV 2017, pp. 3449–3457.
- (2015) 2015 IEEE International Conference on Computer Vision (ICCV), pp. 1440–1448.
- (2016) Harnessing deep neural networks with logic rules. In Proc. ACL 2016.
- (2009) The elements of statistical learning: data mining, inference and prediction. https://web.stanford.edu/~hastie/ElemStatLearn/.
- (2017) Identifying unknown unknowns in the open world: representations and policies for guided exploration. In Proc. AAAI 2017, pp. 2124–2132.
- (2016) Unsupervised learning on neural network outputs: with application in zero-shot learning. In Proc. IJCAI 2016, pp. 3432–3438.
- (2019) Interpretable machine learning: a guide for making black box models explainable. Lulu, 1st edition, March 24, 2019; eBook.
- (2017) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 39(6), pp. 1137–1149.
- (2016) Model-agnostic interpretability of machine learning. CoRR abs/1606.05386.
- (2016) “Why should I trust you?”: explaining the predictions of any classifier. In Proc. KDD 2016, pp. 1135–1144.
- (2006) Learning interpretable models (PhD thesis). Technical University of Dortmund.
- (2017) Grad-CAM: visual explanations from deep networks via gradient-based localization. In Proc. ICCV 2017, pp. 618–626.
- (2015) Going deeper with convolutions. In Proc. CVPR 2015, pp. 1–9.
- (2018) Interpreting CNN knowledge via an explanatory graph. In Proc. AAAI 2018, pp. 4454–4463.
- (2018) Interpretable convolutional neural networks. In Proc. CVPR 2018, pp. 8827–8836.