Explaining the Predictions of Any Image Classifier via Decision Trees

11/04/2019 · by Sheng Shi, et al.

Despite outstanding contributions to the significant progress of Artificial Intelligence (AI), deep learning models remain mostly black boxes, which are extremely weak in explaining their reasoning process and prediction results. Explainability is not only a gateway between AI and society but also a powerful tool to detect flaws in the model and biases in the data. Local Interpretable Model-agnostic Explanation (LIME) is a recent approach that uses a linear regression model to form a local explanation for an individual prediction. However, being so restricted and usually oversimplifying the relationships, linear models fail in situations where nonlinear associations and interactions exist among features and prediction results. This paper proposes an extended Decision Tree-based LIME (TLIME) approach, which uses a decision tree model to form an interpretable representation that is locally faithful to the original model. The new approach captures nonlinear interactions among features in the data and produces plausible explanations. Various experiments show that the TLIME explanations of multiple black-box models achieve more reliable performance in terms of understandability, fidelity, and efficiency.


1 Introduction

In recent years, fast-growing computing power, enormous consumer and commercial data, and emerging advanced machine learning algorithms have jointly stimulated the prosperity of AI [3][9], which has gone from a science-fiction dream to a critical part of our daily life. Compared with traditional machine learning methods, deep learning has achieved superior performance in perception tasks such as object detection and classification. However, because of their nested non-linear structure, deep learning models usually remain black boxes that are particularly weak in explaining their reasoning process and prediction results. In many real-world mission-critical applications, transparency of deep learning models and explainability of the model outputs are essential for their deployment.

Explainable AI is not only a gateway between AI and society but also a powerful tool to detect flaws in the model and biases in the data. The development of techniques for explainability and transparency of deep learning models has recently received much attention in the research community [6][4][16][15]. The relevant research roughly falls into two categories: global explainability and local explainability. Global explainability aims at making the reasoning process wholly transparent and comprehensible [7][1], while local explainability focuses on extracting input regions that are highly sensitive to the network output in order to explain each individual decision [11][10][2][13].

An effective way to achieve explainability is to use a light-weight function family to create interpretable models. Local Interpretable Model-agnostic Explanations (LIME) identifies an interpretable model over a human-interpretable representation that is locally faithful to the original model [11]. LIME adopts linear regression as its interpretable function, representing the prediction as a linear combination of a few selected features to make the prediction process transparent. However, being so restricted and usually over-simplifying the relationships, linear regression models fail in situations where non-linear associations and interactions exist among features and prediction results.

In this paper, we propose a Decision Tree-based Local Interpretable Model-agnostic Explanation (TLIME). The decision tree structure creates good explanations because the data ends up in distinct groups that are often easy to understand. Moreover, the tree structure can capture interactions between features in the data. We perform various experiments on explaining two black-box models, a random forest classifier and Google's pre-trained Inception neural network [14]. The results show that decision tree explanations achieve more reliable performance than the original LIME in terms of understandability, fidelity, and efficiency.

2 Interpretable models

Using a subset of algorithms from a light-weight function family to create interpretable models is an effective way to achieve interpretability. In this section, we analyze two representative interpretable models: the linear regression model and the decision tree model. Table 1 summarizes their properties. Linear regression displays the prediction as a linear combination of features, while the decision tree represents the reasoning process in a hierarchical structure, which is suitable for capturing nonlinear associations between features and predictions. The monotonicity that both models provide (fully for linear regression, partially for decision trees) helps ensure consistency between a feature and the target outcome. Moreover, the decision tree model can automatically capture diverse interactions between features when predicting the target outcome, and it is applicable to both classification and regression tasks.

Models               Linearity   Monotonicity   Feature Interaction   Task
Linear regression    Yes         Yes            No                    Regression
Decision trees       No          Some           Yes                   Classification, Regression
Table 1: The properties of the linear regression model and the decision tree model
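To make the distinction in Table 1 concrete, the following sketch (our illustration, not part of the paper's experiments) fits both interpretable model families to a target driven purely by a feature interaction; the linear model cannot represent the product term, while a shallow regression tree approximates it well.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(1000, 2))
y = X[:, 0] * X[:, 1]                      # target driven purely by an interaction

linear = LinearRegression().fit(X, y)
tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, y)

print("linear R^2:", round(linear.score(X, y), 3))   # close to 0: no interaction term available
print("tree   R^2:", round(tree.score(X, y), 3))     # clearly higher: splits capture the interaction
```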

Depending on the splitting criterion, various algorithms can construct a decision tree. CART [5] is the most popular algorithm and can handle both classification and regression tasks. In this paper, we mainly construct regression decision trees to explain the prediction probability of an image classifier. Figure 1 illustrates a simple regression tree explaining an image classification prediction made by Google's Inception neural network. The highlighted superpixels give intuition as to why the model would choose the predicted top class. The decision tree shows that when the three features on the highlighted path are present, the prediction probability equals the mean value of the instances in that leaf node. Moreover, the importances of the three features show their contributions to reducing the variance.

Figure 1: A simple regression decision tree explains the image classification result, together with the top output class label and its prediction probability, given by Google's Inception neural network.
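A surrogate tree of the kind shown in Figure 1 can be fitted with scikit-learn's CART implementation. The sketch below is illustrative only: the binary matrix Z of superpixel on/off patterns and the vector of top-class probabilities are assumed to be available, and placeholder data stands in for them here.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
Z = rng.integers(0, 2, size=(200, 10))          # placeholder superpixel on/off patterns
probs = 0.5 * Z[:, 0] + 0.3 * Z[:, 3] + 0.1     # placeholder top-class probabilities

# Fit a shallow CART regression tree as the local surrogate and inspect it.
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0).fit(Z, probs)
print(export_text(surrogate, feature_names=[f"sp_{i}" for i in range(10)]))
print("feature importances:", surrogate.feature_importances_)
```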

3 The TLIME Approach

3.1 Characteristics of TLIME

Although research on explainable AI is growing rapidly, there is no universal consensus on the exact definition of interpretability or how to measure it [8]. Ruping first noted that interpretability comprises three goals: accuracy, understandability, and efficiency [12]. We argue that fidelity is a better term than accuracy, since accuracy is easily confused with the performance evaluation criteria of the original black-box model. These three goals are inextricably intertwined and compete with each other, as shown in Figure 2. An explainable model with good interpretability should be faithful to the data and the original model, understandable to the observer, and graspable in a short time, so that end users can make wise decisions.

Figure 2: The three goals of Interpretability

TLIME has several appealing characteristics: it is interpretable, locally faithful, and model-agnostic. It provides a qualitative understanding of the relation between features and predictions. Being utterly faithful to the black-box model on a global scale is challenging, if not impossible; TLIME therefore takes a feasible approach and approximates the model in the vicinity of the instance being predicted. Moreover, as a model-agnostic interpretation, TLIME offers excellent flexibility and can explain any underlying machine learning model.

3.2 Explanation System of TLIME

Considering the poor interpretability and high computational complexity of a pixel-based image representation, we adopt a superpixel-based explanation system. Each superpixel, as the basic processing unit, is a group of connected pixels with similar colors or gray levels. Figure 3 shows the pixel-based image, the superpixel image, and the superpixel-based explanation. The interpretable representation of an image whose pixels are grouped into d' superpixels is a binary vector x' ∈ {0, 1}^{d'}, where 1 indicates the presence of the corresponding original superpixel and 0 indicates its absence.

(a) Pixel-based image (b) Superpixel image (c) Superpixel-based explanation
Figure 3: Pixel-based image, superpixel image, and superpixel-based explanation
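As a rough sketch of this superpixel-based representation, one possible implementation uses scikit-image for segmentation. The paper only states that a standard segmentation method is used, so the choice of quickshift and its parameters below are assumptions; the second helper recovers a perturbed image z from a binary vector z' by greying out the absent superpixels.

```python
import numpy as np
from skimage.segmentation import quickshift

def interpretable_representation(image):
    """Segment an RGB image and return (segment map, all-ones binary vector x')."""
    segments = quickshift(image, kernel_size=4, max_dist=200, ratio=0.2)  # assumed parameters
    n_superpixels = np.unique(segments).shape[0]
    x_prime = np.ones(n_superpixels, dtype=int)   # 1 = superpixel present
    return segments, x_prime

def mask_image(image, segments, z_prime, background=0):
    """Recover an image z from a binary vector z' by blanking absent superpixels."""
    out = image.copy()
    for idx, label in enumerate(np.unique(segments)):
        if not z_prime[idx]:
            out[segments == label] = background
    return out
```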

We denote the original image classification model being explained as f, the interpretable decision tree model as g, and the locality-fidelity loss as \mathcal{L}(f, g, \pi_x), which is calculated by the locally weighted square loss:

\mathcal{L}(f, g, \pi_x) = \sum_{z, z' \in \mathcal{Z}} \pi_x(z) \, \left( f(z) - g(z') \right)^2 \qquad (1)

The database \mathcal{Z} is composed of perturbed samples z', which are sampled around x' by drawing nonzero elements of x' at random. Given a perturbed sample z', we recover the sample z in the original image representation and obtain f(z). Moreover, \pi_x(z) = \exp(-D(x, z)^2 / \sigma^2), where the distance function D is the distance between images and the exponential kernel with width \sigma is used to capture locality.
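Under the reconstruction above, the locality-fidelity loss of Eq. (1) can be computed directly; the kernel width sigma below is an assumed hyperparameter.

```python
import numpy as np

def pi_x(distances, sigma=0.25):
    """Exponential locality kernel: pi_x(z) = exp(-D(x, z)^2 / sigma^2)."""
    return np.exp(-(distances ** 2) / sigma ** 2)

def locality_fidelity_loss(f_z, g_z_prime, weights):
    """L(f, g, pi_x) = sum over Z of pi_x(z) * (f(z) - g(z'))^2, as in Eq. (1)."""
    return float(np.sum(weights * (f_z - g_z_prime) ** 2))
```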

We denote the decision tree explanation produced by TLIME as below:

\xi(x) = \operatorname*{arg\,min}_{g \in G} \; \mathcal{L}(f, g, \pi_x) \quad \text{subject to} \quad \mathrm{depth}(g) \le K \qquad (2)

The depth of the decision tree is a measure of model complexity: a smaller depth indicates stronger understandability of the model g. To ensure both local fidelity and understandability, formula (2) minimizes the locality-fidelity loss while keeping the tree depth low enough. Algorithm 1 shows a simplified workflow of TLIME. First, TLIME obtains the superpixel image using a standard segmentation method. Then the database \mathcal{Z} is constructed by running multiple iterations of the perturbed sampling operation. Finally, within the allowable range of prediction error, TLIME selects the minimum-depth decision tree fitted with the CART method.

Require: classifier f; number of samples N; instance x with interpretable representation x'; max tree depth K; error tolerance ε
Ensure: decision tree g; runtime and prediction error of TLIME
1:  obtain the superpixel image of x with a standard segmentation method
2:  initialize Z ← {}
3:  for i = 1, …, N do
4:      obtain z'_i by sampling around x'
5:      recover the sample z_i in the original representation from z'_i
6:      obtain f(z_i) by querying the classifier f
7:      Z ← Z ∪ {(z'_i, f(z_i), π_x(z_i))}
8:  end for
9:  for k = 1, …, K, while the prediction error exceeds ε, do
10:     fit a decision tree g of depth k on Z with CART
11:     compute the weighted prediction error of g on Z
12: end for
13: output the decision tree g, the runtime, and the prediction error
Algorithm 1: Decision Tree-based Local Interpretable Model-agnostic Explanation (TLIME)
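For concreteness, the following Python sketch mirrors Algorithm 1 under the assumptions already stated: classifier_fn returns class probabilities for a batch of images, mask_image and pi_x are the helpers sketched in Section 3.2, and the depth search stops at the first tree whose weighted prediction error drops below a tolerance tol. It is an illustration of the workflow, not the authors' implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def tlime_explain(image, classifier_fn, segments, n_samples=1000,
                  max_depth=5, tol=0.05, rng=None):
    """Sketch of Algorithm 1: fit the shallowest acceptable surrogate tree."""
    rng = rng or np.random.default_rng(0)
    n_sp = np.unique(segments).shape[0]

    # 1. Perturbed database Z: random on/off patterns over superpixels;
    #    row 0 keeps every superpixel, i.e. the unperturbed instance x'.
    Z_prime = rng.integers(0, 2, size=(n_samples, n_sp))
    Z_prime[0, :] = 1
    images = np.stack([mask_image(image, segments, z) for z in Z_prime])

    # 2. Query the black-box model and keep the probability of its top class.
    probs = classifier_fn(images)
    top_class = int(np.argmax(probs[0]))
    f_z = probs[:, top_class]

    # 3. Locality weights from the distance to the unperturbed instance.
    distances = np.linalg.norm(Z_prime - Z_prime[0], axis=1) / np.sqrt(n_sp)
    weights = pi_x(distances)

    # 4. Grow the minimum-depth CART tree whose weighted error is acceptable.
    for depth in range(1, max_depth + 1):
        tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
        tree.fit(Z_prime, f_z, sample_weight=weights)
        error = np.average((tree.predict(Z_prime) - f_z) ** 2, weights=weights)
        if error <= tol:
            break
    return tree, error
```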

4 Experimental Results

In this section, TLIME and LIME explain the predictions of a random forest classifier and Google's pre-trained Inception neural network. We compare the experimental results of the two algorithms in terms of understandability, fidelity, and efficiency.

4.1 Random forest classifier on the MNIST database

The MNIST database is one of the most common databases used for image classification. It consists of small grayscale images of handwritten digits. In this experiment, the images are split into a training set and a test set. Table 2 shows the performance of the random forest classifier. For the instance under study, the top class is predicted with probability p = 1.0. Figure 5 shows the decision tree explanation produced by TLIME.

               precision   recall   f1-score   support
weighted avg      0.95       0.95      0.95      21000
Table 2: The performance of the random forest classifier on the MNIST database
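The black-box model in this experiment can be reproduced approximately with scikit-learn; the paper does not report its exact split or hyperparameters, so the 70/30 split (which matches the test support of 21,000 in Table 2) and 100 trees below are assumptions.

```python
from sklearn.datasets import fetch_openml
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# MNIST: 70,000 small grayscale digit images; a 30% test split gives 21,000 test images.
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

rf = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=0)
rf.fit(X_train, y_train)
print(classification_report(y_test, rf.predict(X_test)))  # compare with Table 2
```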

Compared with LIME, which can only provide a one-shot explanation, the decision tree structure produced by TLIME gives a more intuitive explanation. Figure 5 shows that when the two selected features are both present, the prediction probability equals the mean value of the instances in the corresponding leaf node. Moreover, the tree structure captures the interaction between the features, and the feature importances reveal which feature makes the larger contribution to the predicted outcome. The prediction error is calculated to measure local fidelity; TLIME attains a lower prediction error than LIME, indicating better fidelity.

Efficiency is closely related to the time a user needs to grasp the explanation. The runtime of TLIME is shorter than that of LIME. Note that the reported runtime does not include the perturbed sampling operation, which takes the same time for LIME and TLIME.
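The fidelity and efficiency comparison can be measured in the way sketched below (our illustration, reusing the Z_prime, f_z, and weights arrays from the Algorithm 1 sketch); a ridge regressor is used here only as a stand-in for LIME's linear surrogate, and sampling time is excluded, as in the text.

```python
import time
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor

def fit_and_score(model, Z_prime, f_z, weights):
    """Return (fit time in seconds, weighted squared prediction error)."""
    start = time.perf_counter()
    model.fit(Z_prime, f_z, sample_weight=weights)
    elapsed = time.perf_counter() - start
    error = np.average((model.predict(Z_prime) - f_z) ** 2, weights=weights)
    return elapsed, error

# Usage, assuming Z_prime, f_z, weights come from the tlime_explain sketch:
# fit_and_score(DecisionTreeRegressor(max_depth=3), Z_prime, f_z, weights)  # TLIME-style surrogate
# fit_and_score(Ridge(alpha=1.0), Z_prime, f_z, weights)                    # LIME-style surrogate
```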


Figure 4: Explaining an image classification prediction made by Google's Inception neural network for its three top predicted classes.

(a) Original image and prediction probabilities. (b) Explanation of the first top class. (c) Explanation of the second top class. (d) Explanation of the third top class.

Figure 5: Explaining an image classification prediction made by the random forest classifier for its top predicted class
          pred. prob   pred. error
TLIME       0.4309       0.0476
LIME        0.6264       0.1479
TLIME       0.4682       0.0479
LIME        0.6814       0.2611
TLIME       0.0191       0.0036
LIME        0.0168       0.0059
Table 3: The prediction probabilities and prediction errors of TLIME and LIME on Google's Inception neural network (one TLIME/LIME pair per explained top class)
TLIME    0.0060 s    0.0030 s    0.0090 s
LIME     0.0289 s    0.0150 s    0.0120 s
Table 4: The runtime of TLIME and LIME on the Inception neural network (one column per explained top class)

4.2 Google's Inception neural network on the ImageNet database

We explain the prediction of Google's pre-trained Inception neural network on the image shown in Figure 4a. Figures 4b, 4c, and 4d show the superpixel explanations for the three top predicted classes. The explanations provide reasonable insight into what the neural network picks up on for each class, and this kind of explanation enhances trust in the classifier. Moreover, Table 3 lists the prediction errors of TLIME and LIME, and Table 4 lists their runtimes. We conclude from these results that, in less time, TLIME achieves both better understandability and higher fidelity than LIME.

5 Conclusion

We propose a decision tree-based local interpretable model-agnostic explanation (TLIME) approach for improving explainable AI. The goal of TLIME is to construct an interpretable decision tree model over the interpretable representation that is locally faithful to the original classifier. We compare TLIME and LIME in explaining the predictions of a random forest classifier and Google's pre-trained Inception neural network. Experimental results show that TLIME exhibits better understandability and higher fidelity than LIME while using less processing time, covering the ingredients of an ideal explainable AI model: understandability, fidelity, and efficiency.

References

  • [1] M. Aubry and B. C. Russell (2015) Understanding deep features with computer-generated imagery. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 2875-2883. Cited by: §1.
  • [2] R. C. Fong and A. Vedaldi (2017) Interpretable explanations of black boxes by meaningful perturbation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 3449-3457. Cited by: §1.
  • [3] R. Girshick (2015) Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 1440-1448. Cited by: §1.
  • [4] Z. Hu, X. Ma, Z. Liu, E. H. Hovy, and E. P. Xing (2016) Harnessing deep neural networks with logic rules. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL). Cited by: §1.
  • [5] T. Hastie, R. Tibshirani, and J. Friedman (2009) The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer. Cited by: §2.
  • [6] H. Lakkaraju, E. Kamar, R. Caruana, and E. Horvitz (2017) Identifying unknown unknowns in the open world: representations and policies for guided exploration. In Proceedings of the AAAI Conference on Artificial Intelligence, pp. 2124-2132. Cited by: §1.
  • [7] Y. Lu (2016) Unsupervised learning on neural network outputs: with application in zero-shot learning. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pp. 3432-3438. Cited by: §1.
  • [8] C. Molnar (2019) Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. Lulu, 1st edition. Cited by: §3.1.
  • [9] S. Ren, K. He, R. Girshick, and J. Sun (2017) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 39(6), pp. 1137-1149. Cited by: §1.
  • [10] M. T. Ribeiro, S. Singh, and C. Guestrin (2016) Model-agnostic interpretability of machine learning. arXiv:1606.05386. Cited by: §1.
  • [11] M. T. Ribeiro, S. Singh, and C. Guestrin (2016) "Why should I trust you?": explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135-1144. Cited by: §1.
  • [12] S. Ruping (2006) Learning Interpretable Models. PhD thesis, Technical University of Dortmund. Cited by: §3.1.
  • [13] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra (2017) Grad-CAM: visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 618-626. Cited by: §1.
  • [14] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich (2015) Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1-9. Cited by: §1.
  • [15] Q. Zhang, R. Cao, F. Shi, Y. N. Wu, and S. Zhu (2018) Interpreting CNN knowledge via an explanatory graph. In Proceedings of the AAAI Conference on Artificial Intelligence, pp. 4454-4463. Cited by: §1.
  • [16] Q. Zhang, Y. N. Wu, and S. Zhu (2018) Interpretable convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8827-8836. Cited by: §1.