Melody: Generating and Visualizing Machine Learning Model Summary to Understand Data and Classifiers Together

07/21/2020 · by Gromit Yeuk-Yin Chan, et al. · Universidade de São Paulo · Capital One

With the increasing sophistication of machine learning models, there is a growing trend of developing model explanation techniques that focus on only one instance (local explanation) to ensure faithfulness to the original model. While these techniques provide accurate model interpretability on various data primitives (e.g., tabular, image, or text), a holistic Explainable Artificial Intelligence (XAI) experience also requires a global explanation of the model and dataset to enable sensemaking at different granularities. Thus, there is vast potential in synergizing model explanation and visual analytics approaches. In this paper, we present MELODY, an interactive algorithm that constructs an optimal global overview of model and data behavior by summarizing the local explanations using information theory. The result (i.e., an explanation summary) requires no additional learning models, no restrictions on data primitives, and no machine learning knowledge from the users. We also design MELODY UI, an interactive visual analytics system that demonstrates how the explanation summary connects the dots in various XAI tasks from a global overview to local inspections. We present three usage scenarios regarding tabular, image, and text classification to illustrate how to generalize model interpretability across different data. Our experiments show that our approach (1) provides a better explanation summary than a straightforward information-theoretic summarization and (2) achieves a significant speedup in the end-to-end data modeling pipeline.


1 Introduction

“All models are wrong, but some are useful.” The application of machine learning (ML) models, including deep learning neural networks, is prevalent in all aspects of human activity and is nowadays the main driving force of technological advances such as self-driving cars, personal assistants, and medical diagnoses. While more new models and architectures are proposed to improve accuracy on different tasks, the reason for such popularity also implies that there is no silver bullet for creating the best model. Hence, the creation of an ML model is a human-centric activity that involves a lot of reasoning, brainstorming, and, most importantly, understanding. Understanding how a model works on one's own data improves task performance in a holistic scope, covering not only the model design but also the data preprocessing and feature engineering steps. Yet, ML models nowadays introduce a challenging interpretability problem. They have become so complex to understand that using them as black boxes could adversely affect people's safety, financial, or legal status [voigt2017eu].

Thus, explainable Artificial Intelligence (XAI) has become an emerging research field where much effort is devoted to extracting the logic behind how models make decisions. Overall, these logical models focus on decision trees, rules, and instance-level feature importance to mimic or customize the behavior of the ML models [guidotti2018survey] so that people can understand how a model works through decision paths or scoring systems. In particular, instance-level feature importance explanations have become more popular for explaining sophisticated models. They produce accurate local explanations because they focus on only a single instance. Such customization can even allow the explainer to be embedded in the ML model [bien2011prototype, chen2019looks, kim2014bayesian, li2018deep]. Thus, local explanations have been readily proposed not only for explaining tabular data classification [lundberg2017unified, ribeiro2016should] but also for complex deep learning tasks in natural language processing [poerner2018evaluating] and computer vision [chen2019looks].

Of course, the ability to customize does not come as a free lunch. As the explanation is tailored to an individual instance, the explanation model loses the ability to provide aggregated explanations that generalize over the whole dataset. This limits its usage to providing simple textual information or visualizations describing how the model works. Such a limit, however, is where visualization techniques come in handy. We observe that most feature importance based explanations in the current literature can be summarized [chandola2007summarization] into an explanation summary. The goal of summarization (Figure 1) is to find a compact description of the dataset with a minimum cost of information loss (i.e., information theoretic). In other words, it finds an explanation summary of the ML model: the groups of instances with similar explanations (colored regions) and the groups of features that are used to explain similar sets of instances (dashed lines). Thus, the summary is a compact global explanation that enables effective visualization to communicate a model's general behavior. To fill the gaps of local explanation techniques on XAI tasks, we propose a scalable data summarization technique that only takes the generic form of the explanation information into account, so that we can leverage existing explanation techniques in different domains to provide useful visual data summaries. With MELODY UI, we show that our implementation helps establish a holistic workflow for the XAI experience concerning tabular data, text, or images. In short, our contributions are as follows:


  • MELODY, a scalable algorithm that generates a compact data summary for an ML model and its input data. It takes any generic feature importance based explanations from the model and works for both structured and unstructured data. The algorithm consists of (1) an information-theoretic model to determine the best data summary and (2) an efficient sketching technique to speed up the computation. In Section 7.2, we show that MELODY produces meaningful results and scales to large data.

  • MELODY UI, an interactive system for scalable interpretation and exploration of the input data and the ML model together. By leveraging our algorithm to group similar instances and explanations, we enable a seamless workflow that connects the different needs regarding global, local, and class explanations in current XAI systems [liao2020questioning].

  • Three use cases covering ML model interpretation on tabular, image, and text data. We demonstrate that our algorithm and system enhance the XAI user experience of model interpretability for three mainstream types of data analysis.

Figure 2: A workflow illustration of how reducing the granularity of the global explanation and increasing the granularity of the local explanation together support the analysis of the machine learning model and the dataset. An explanation summary that groups similar features and instances opens the opportunity of addressing different tasks in explainable machine learning.

2 Related Work

To facilitate human understanding of complex models through visualization, research mainly focuses on three aspects: model internals, logic induced from the models, and instance-level feature vectors that describe the behavior of the model.

2.1 Visualization of Model Internals

Visualization has been applied readily to understand and interact with deep learning neural networks. In fact, a survey on deep learning visual analytics by Hohman et al. [hohman2018visual] has listed more than 40 representative works in this area in the last five years. We encourage readers to consult the survey for a deeper investigation of the subject.

The simplest form of a neural network can be represented by a node-link diagram in which each node represents a neuron and each link represents a connection weight between two neurons [tzeng2005opening]. As the ways neurons are connected have become more sophisticated and opaque, various visual analytics approaches have been developed to understand different properties of the networks. RNNVis [ming2017understanding] and LSTMVis [strobelt2017lstmvis] address the understanding of recurrent neural networks (RNNs) by visualizing, respectively, the bipartite relationship between hidden memories and input data, and hidden memory dynamics with parallel coordinates. Seq2SeqVis addresses autoencoder models with a bipartite graph visualization of the attention relationships between an input and its possible translations to enable model debugging [strobelt2018s]. Another popular type of model for image classification is the Convolutional Neural Network (CNN). CNNVis [liu2016towards], Blocks [bilal2017convolutional], AEVis [liu2018analyzing], and Summit [hohman2019s] are graph visualizations that aggregate similar neurons, connections, and similarly activated image patches to convey the visual representations learned by the model.

Besides visualizing the structures of a neural network, there are visual analytics systems that assist model development processes in industry. ActiVis [kahng2017cti] is a visual analytics system used by Facebook to explore industrial deep learning models. Google has developed the TensorFlow Graph visualizer [wongsuphasawat2017visualizing] and the What-If Tool [wexler2019if] to help developers understand and test the behavior of different ML models.

The work in this category mainly addresses visual analysis for model developers who have sufficient knowledge of the methodologies behind their models. However, a more general AI tool requires the assessment and involvement of end-users, decision-makers, and domain experts. Addressing the needs of a broader XAI user experience, our work focuses on providing general explanations of ML models to users without requiring them to know the architectures.

2.2 Visualization of Logical Models

Logical models like decision trees [craven1996extracting] or rules [martens2008decompositional, yang2017scalable] can address the interpretation of complex models by inferring an approximate model from any ML model. Given a set of test data, the original model gives the predictions, and the logical model uses them as labels to train another classifier. The resulting classifier can be used to mimic the behavior of the original model while providing good interpretability to the users. Through visualizing logical models, users gain knowledge of the model's capability.

RuleMatrix [ming2018rulematrix] builds and visualizes a surrogate rule list so that the model's behavior can be understood by interacting with the rules. Gamut [hohman2019gamut] uses generalized additive models to construct a set of linear functions for each feature in the dataset so that models can be understood through line charts. TreePOD [muhlbacher2017treepod] and BaobabView [van2011baobabview] visualize decision trees with different metrics incorporated for model understanding. iForest [zhao2018iforest] visualizes random forests with data flow diagrams to interact with multiple decision paths.

For complex models, using logical models as explanations raises the issue of fidelity – the accuracy of the explanation with respect to the original model – which creates an additional layer of performance concerns. Therefore, local explanation methods were proposed to explore the possibility of providing accurate explanations or even being embedded in the original model training process. Yet, because they only return results for a single instance and do not consider a global explanation of the whole dataset, our work addresses the challenge of visually constructing a global view from local explanations.

2.3 Feature Vector Visualization

Local explanation models give feature scores for each instance. The features can be the features from original data [lundberg2017unified, ribeiro2016should, shrikumar2017learning] or a set of external information like concepts [kim2018interpretability] or training data [chen2019looks, li2018deep, ming2019interpretable]. Visual analysis can be directly applied to interact with the features [ming2019protosteer] or the feature vectors can be visualized as a matrix where rows represent the instances and columns represent the features [sawada2019model].

Besides, the data coming out of a deep neural network can appear as embeddings, such that linear distances between vectors represent their similarities according to the model's rationale. The main visualization technique to understand these feature vectors is projection [grover2016node2vec, li2018embeddingvis, liu2017visual, pezzotti2017deepeyes, rauber2016visualizing, xiang2020interactive]. Treating the embedding as high-dimensional data, projection techniques such as t-SNE, MDS, or PCA are applied to discover semantic groups inside the dataset from the resulting scatterplot. Users can assess the ML model and refine the original data through brushing and filtering interactions in the projection.

Our technique addresses the scalability and usability challenges in these existing visualizations. The projection technique mainly suffers from clutter and from the lack of feature information in the visualization, which is crucial for a comprehensive explanation. MELODY aims at providing compact representations of both data and features so that the visual information is more precise. Also, we address the need for explanation exploration at different granularities through the proposed analytic workflow illustrated in Figure 2, providing new ways to extend powerful local explanations to scalable visual analytics.

3 Tasks Analysis of XAI Systems

Before we propose our methodology to generate an explanation summary for feature importance based explanations, we first review the taxonomy of XAI tasks to establish how a visual explanation summary of an ML model can help. By understanding the tasks, we can consolidate the design considerations needed to expand our technique into an effective user interface. To explain the tasks and the use of a data summary systematically, we use a simple XAI workflow (Figure 2) to connect the essential relationships among three main XAI tasks [liao2020questioning]: Global, Local, and Class explanations.

T.1 Global Explanation. The goal is to understand the overall weights of the features used by the model to explain how the AI makes decisions on the dataset in general. For example, imagine we have a model that tells which animal an image contains. To understand the model, the first question a user might ask is: what features does the model use to make a prediction? An XAI technique might give us a sorted list of features based on their influence on the model (Figure 2.1) – it tells us that "skin color" is the most important factor.

T.2 Class Explanation. Understanding how and whether the model works on each class allows users to understand the decision boundaries at a finer granularity and develop insights. For example, from a global explanation, "eyes" are used to predict many cats. "Eyes" and "cat" are the key pieces of information for understanding the model's rationale in a local region.

T.3 Local Explanation. For verification and inspection in full detail, users need to inspect all explanatory features of a predicted instance (Figure 2.3). For example, why is a cat predicted as "dog"? The differences between instances' behavior reveal important decision boundaries. By inspecting a single image, we can learn that the cat in the image has white skin, which is a "dog" feature.

Usefulness of an Explanation Summary. First, it can provide a global explanation at a better granularity (Figure 2.2). Instead of aggregating the whole dataset to rank the features, it tells us directly that "eyes" are used on many cats while "ears" are used on many dogs. This answer avoids the mirage of aggregating features over different subsets. Also, clustering instances allows users to go from local to global explanation. For example, after browsing a cat image and learning that it has a wrong prediction due to its white skin, we might want to know whether all the cat images with the "white skin" feature will be predicted as "dog". By inspecting all the cat images, or all images that have "white skin" (Figure 2.4), we return to the inspection of a group of images again.

In detail, the tasks in the above workflow generalize the studies that consolidate the key user requirements of model explainability [adadi2018peeking, carvalho2019machine, gilpin2018explaining, guidotti2018survey, mohseni2018survey, ras2018explanation]. Furthermore, there are plenty of empirical studies on the requirements of XAI from industry practitioners [amershi2019guidelines, boukhelifa2017data, hohman2019gamut, holstein2019improving, liao2020questioning, muller2019data, rule2018exploration]. They provide good empirical evidence from real experts to outline the guidelines for designing XAI systems. The details of T.1-3 derived from these studies are provided in Appendix 9.2.

4 Machine Learning Model Explanation Summary

In this section, we describe the definition of a model explanation summary as well as the algorithms to compute it from the local explanations.

4.1 Generic Representation of Local Explanation

The most generic form of a local explanation is a feature vector of length d, the total number of explanatory features used in explaining the whole dataset. Each value e_ij in the feature vector is the explanation importance of feature j on instance i (e.g., "skin color" on a cat image). For more details of local explanation techniques, we redirect readers to Appendix 9.1. All n instances' explanations from the whole dataset can thus be expressed as a real-valued matrix E of size n × d. One important property of the matrix is that it is sparse, i.e., nnz(E) ≪ nd, where nnz(E) is the number of nonzeros in E. Sparsity ensures that the explanation uses a small number of features to explain the model behavior, so that the decision logic does not overwhelm the user. Also, to simplify the discussion, we assume 0 ≤ e_ij ≤ 1 (all matrices can satisfy these properties with a min-max scaler, applied to the absolute values if the sign does not matter).
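To make the assumed representation concrete, below is a minimal sketch of assembling such a sparse explanation matrix from per-instance importance scores; the dictionary input format is an illustrative assumption, not a prescribed API.

```python
# Hedged sketch: build a sparse n x d explanation matrix E from local
# explanations and scale its values into [0, 1]. The dict-of-importances
# input format is an assumption for illustration only.
import numpy as np
from scipy import sparse

def to_explanation_matrix(local_explanations, n_features):
    """local_explanations: one dict {feature_index: importance} per instance."""
    E = sparse.lil_matrix((len(local_explanations), n_features))
    for i, expl in enumerate(local_explanations):
        for j, value in expl.items():
            E[i, j] = abs(value)              # keep importance magnitudes only
    E = E.tocsr()
    if E.max() > 0:
        E = E / E.max()                       # min-max style scaling (min assumed 0)
    return E

# Toy usage: three instances, four explanatory features.
E = to_explanation_matrix([{0: 0.9, 1: 0.4}, {1: 0.7}, {2: 0.5, 3: 0.2}], 4)
print(E.toarray())
```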

Data Type | Explanatory Features
Tabular | A set of logics (ranges). e.g., "It will rain since …"
Image | A set of common visual representations. e.g., "It is a pig 🐷 since I notice its nose 🐽."
Text | A set of topics. e.g., "This review is positive since there are words like excellent / fantastic / amazing."
Table 1: Intrinsic nature of explanations among tabular, image, and text data. It affects how we construct the explanation matrices.

4.2 Explaining Tabular, Image, and Text Instances

Given a generic form of instance explanation, we now drill down to an in-depth discussion of how these feature vectors can be applied to explain tabular, image, and text instances. Although all of them result in explanation matrices, their intrinsic nature, shown in Table 1, affects how we modify our modeling pipelines to construct the features (i.e., columns) to acquire meaningful explanations. We provide end-to-end data modeling examples in Appendix 9.3.

Tabular Data. Logical models like decision trees and random forests discretize the attributes in the dataset into a set of logics. Similarly, the explanatory features should be not only the original attributes but also their different ranges, for better diversity. For example, all cities (instances) rain based on their precipitation (attribute), but how each city rains under different precipitation ranges reveals different climates.

Image Data. Users normally classify images with the common visual features among the same entity (e.g., stripes in zebras). Similarly, an image in an ML model can be explained with representative image patches collected from the original data that unify the reasoning process with a limited set of features instead of pixels in each image.

Text Data. Multiple documents are usually explained with common topics instead of single phrases because data with similar meanings can use totally different words. For example, "good" and "great" represent similar sentiments. Thus, the explanatory features should be a set of topics instead of words to avoid an overly sparse matrix.
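As an illustration of the tabular case in Table 1, the sketch below expands an attribute into logic-range explanatory features; the column name and bin edges are illustrative only.

```python
# Hedged sketch: expand a tabular attribute into logic-range features,
# one binary column per range. Bin edges and column names are
# illustrative, not from the paper's preprocessing.
import pandas as pd

def range_features(df, column, bins):
    binned = pd.cut(df[column], bins=bins)                 # e.g. (10, 50]
    return pd.get_dummies(binned, prefix=column, prefix_sep="=").astype(int)

df = pd.DataFrame({"precipitation": [5, 20, 80, 45]})
print(range_features(df, "precipitation", bins=[0, 10, 50, 100]))
# columns look like "precipitation=(0, 10]", "precipitation=(10, 50]", ...
```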

4.3 Problem Definition

The information-theoretic goal for summarizing the explanation matrix E is to group similar instances and explanatory features simultaneously while balancing compactness and information loss. Let X and Y be the sets of row (instance) and column (feature) indices of E, respectively, and normalize E so that it is equivalent to a joint distribution p(X, Y) between X and Y. Our goal is to find the optimal row clusters X̂ and column clusters Ŷ that present the explanation summary in Figure 1. Therefore, the first question is: how should we measure the information loss? For example, consider a small synthetic explanation matrix with an obvious block structure, where the rows fall into two clusters and the columns into two clusters. The information-theoretic definitions of the resulting compression and the approximation matrix q recovered from the compression are as follows [dhillon2003information]:

Each entry of the approximation matrix q is calculated as follows:

q(x, y) = p(X̂, Ŷ) · p(x | X̂) · p(y | Ŷ),    (1)

where X̂ and Ŷ are the row and column clusters containing x and y, respectively.

Thus, the compression loss can be expressed with metrics such as the Kullback-Leibler (KL) divergence of q from p:

D_KL(p ‖ q) = Σ_{x, y} p(x, y) · log( p(x, y) / q(x, y) ).    (2)

Yet, we observe a shortcoming of directly using KL divergence on the whole matrix: in Equation 2, each entry's contribution to the result is not independent of the clusters that it does not belong to. Therefore, we propose a loss function L such that each entry's loss is marginal to its row and column cluster:

(3)

Such a marginalization prevents entries with high values from dominating the calculation result; we demonstrate the resulting improvement in Section 7.2.

Once we quantify the information loss, the next challenge is how to choose the number of row and column clusters. If we do not cluster any rows or columns at all, the loss L equals zero, whereas if we only have one cluster, the loss will be huge. Yet neither is a summary of the data, as the former merely reproduces the original matrix and the latter is a summary of poor quality. To automatically determine the optimal partitions, the idea is to use the Minimum Description Length (MDL) principle, which states that the best model is the one that minimizes the total description length: the model description (i.e., the numbers of row and column clusters |X̂| and |Ŷ|) plus the data description (i.e., the information loss L). Putting them all together, we can now write the total cost function as:

Cost(X̂, Ŷ) = β_r · |X̂| + β_c · |Ŷ| + L(X̂, Ŷ),    (4)

which we minimize over all row and column partitions. β_r and β_c are user-defined parameters that penalize a large number of clusters; users can increase their values to produce fewer clusters.
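To make these definitions concrete, here is a minimal sketch of the approximation matrix (Equation 1), the KL loss (Equation 2), and an MDL-style total cost. It is illustrative only: it uses the plain KL loss rather than the marginalized loss of Equation 3, and the β_r / β_c penalty form reflects our reading of Equation 4.

```python
# Minimal sketch of the information-theoretic objective in Section 4.3.
# It uses the plain KL loss of Equation 2 (not the marginalized loss of
# Equation 3) and an MDL-style cost with penalties beta_r, beta_c.
import numpy as np

def approximation(P, row_labels, col_labels):
    """Co-cluster approximation q(x, y) of a joint distribution P
    (Dhillon et al.-style: q = p(R, C) * p(x | R) * p(y | C))."""
    R, C = np.asarray(row_labels), np.asarray(col_labels)
    p_x, p_y = P.sum(axis=1), P.sum(axis=0)          # row / column marginals
    Q = np.zeros_like(P)
    for r in np.unique(R):
        for c in np.unique(C):
            rows, cols = (R == r), (C == c)
            p_rc = P[np.ix_(rows, cols)].sum()       # co-cluster mass p(R, C)
            p_r, p_c = p_x[rows].sum(), p_y[cols].sum()
            if p_r == 0 or p_c == 0:
                continue
            Q[np.ix_(rows, cols)] = p_rc * np.outer(p_x[rows] / p_r, p_y[cols] / p_c)
    return Q

def kl_loss(P, Q, eps=1e-12):
    """KL divergence of the approximation Q from the original P (Equation 2)."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / (Q[mask] + eps))))

def total_cost(P, row_labels, col_labels, beta_r=1.0, beta_c=1.0):
    """MDL-style cost: cluster-count penalty plus information loss."""
    Q = approximation(P, row_labels, col_labels)
    return beta_r * len(set(row_labels)) + beta_c * len(set(col_labels)) + kl_loss(P, Q)

# Toy usage: a 4x4 explanation matrix with an obvious 2x2 block structure.
E = np.array([[.9, .8, 0, 0], [.8, .9, 0, 0], [0, 0, .7, .9], [0, 0, .9, .7]])
P = E / E.sum()                                      # normalize to a joint distribution
print(total_cost(P, [0, 0, 1, 1], [0, 0, 1, 1]))
```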

4.4 The Algorithm

We now present our MELODY (MachinE Learning MODel SummarY) algorithm. In the previous section we established our goal: find the row and column clusters that minimize the cost function in Equation 4 among all possible numbers of clusters and all possible row and column combinations. Yet, the equation itself does not tell us how to reach the solution efficiently. Since the matrix can be considered a bipartite graph where each entry is a weighted edge between a row node and a column node, we can use a graph summarization approach [navlakha2008graph] to provide a baseline solution (Algorithm 1). The overall idea is as follows:

  1. Each row and column starts in its own cluster. Then, we put the row and column clusters into two separate lists (lines 1-2).

  2. We first fix the column cluster assignment. From the row clusters in the list, we randomly select a row cluster (line 5).

  3. We compare the selected row cluster with the remaining row clusters in the list as merge candidates (lines 7-12): we try merging the selected cluster with each remaining cluster and calculate the cost reduction with Equation 4 (line 8). We choose the candidate that produces the least cost.

  4. If merging the selected cluster and its best candidate reduces the total cost, we merge the two clusters in the list (lines 14-15). Otherwise, we remove the selected cluster from the list (line 17). Either way, the list will have one fewer item.

  5. We repeat steps 2-4, but fixing the row clusters and merging the column clusters instead. The whole algorithm stops when no clusters remain in either list.

Input:  E -- instances and explanatory features
        β_r, β_c -- regularization terms
Output: X̂, Ŷ -- row and column clusters
 1: R ← one singleton cluster per row of E          /* initialize row clusters */
 2: C ← one singleton cluster per column of E       /* initialize column clusters */
 3: initialize the total cost with Equation 4       /* initialize loss function */
 4: while R ≠ ∅ and C ≠ ∅ do
 5:     v ← randomly extract a cluster from R       /* randomly extract a cluster */
 6:     bestΔ ← 0; best ← undefined
 7:     for r ∈ R do
 8:         Δ ← cost reduction of merging v and r (Equation 4)
 9:         if Δ > bestΔ then
10:             bestΔ ← Δ; best ← r
11:         end if
12:     end for
13:     if best is defined then
14:         remove best from R                      /* merge two clusters */
15:         R ← R ∪ {v ∪ best}
16:     else
17:         X̂.push(v)                               /* push the cluster to the final result */
18:     end if
19:     repeat lines 5-18 with C and Ŷ              /* same procedure for the columns */
20: end while
Algorithm 1: MELODY (MachinE Learning MODel SummarY)

Overall, in every iteration, a row (column) cluster needs to measure the cost reductions against all remaining candidates in the list, which has a maximum size of n (d). Therefore, the time complexity of the basic algorithm is quadratic in the number of rows and columns. As a quadratic algorithm is infeasible for any moderately sized data in exploratory visual analysis, we now propose a speed-up strategy to make our algorithm suitable for interactive performance.
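For concreteness, a simplified sketch of the row-merging half of this bottom-up procedure is shown below; it treats the cost function as a black box (e.g., the total_cost sketch above) and is not the exact MELODY implementation, which alternates between rows and columns.

```python
# Simplified sketch of the bottom-up merging idea in Algorithm 1 (rows
# only). cost_fn plays the role of Equation 4, e.g. the total_cost sketch
# above; the full algorithm alternates between rows and columns.
import random
import numpy as np

def greedy_merge_rows(n_rows, col_labels, cost_fn, seed=0):
    rng = random.Random(seed)
    active = [[i] for i in range(n_rows)]      # every row starts as its own cluster
    finished = []

    def row_labels(groups):                    # flatten groups into a label vector
        lab = np.empty(n_rows, dtype=int)
        for k, members in enumerate(groups):
            lab[members] = k
        return lab

    while active:
        picked = active.pop(rng.randrange(len(active)))        # randomly extract a cluster
        current = cost_fn(row_labels(active + finished + [picked]), col_labels)
        best_j, best_cost = None, current
        for j, cand in enumerate(active):                      # compare with every candidate
            merged = active[:j] + active[j + 1:] + finished + [picked + cand]
            cost = cost_fn(row_labels(merged), col_labels)
            if cost < best_cost:
                best_j, best_cost = j, cost
        if best_j is not None:                                 # merging reduces the total cost
            active.append(picked + active.pop(best_j))
        else:                                                  # otherwise finalize the cluster
            finished.append(picked)
    return finished

# Usage with the toy matrix and total_cost from the previous sketch:
# clusters = greedy_merge_rows(P.shape[0], [0, 0, 1, 1],
#                              lambda r, c: total_cost(P, r, c))
```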

4.4.1 Speed Up Strategies With Data Sketches

While a randomized bottom-up algorithm scales linearly in the number of iterations, Algorithm 1 is time-consuming, as it needs an extra loop to compare all possible row or column clusters (line 7) in every iteration. However, if we look at the example matrix in Section 4.3, it is obvious that the first two rows (columns) are completely different from the last two rows (columns). Comparing candidates that are this different is of no use, since they are unlikely to reduce the total cost. Thus, to speed up the algorithm, we propose a k-nearest neighbor query strategy with a novel use of a locality sensitive hashing (LSH) [charikar2002similarity] scheme to encode a row or column cluster. LSH defines a family of hash functions (i.e., sketches) for a vector so that the probability of hash collisions between two vectors grows with their similarity (i.e., decreases with their Euclidean distance). Vectors with similar values thus tend to be stored in the same buckets of an LSH table. Furthermore, we can extend this property to retrieve similar row (column) clusters: if two clusters have many similar vectors, the number of hash collisions between them will be high. Therefore, the top-k clusters from the query will likely be similar neighbors.

The query algorithm is illustrated in Algorithm 2. First, an LSH table is built for the rows and the columns, respectively. Then, when a neighbor query is performed, we use the hash keys of the query cluster's vectors to perform a table look-up and retrieve all entries that collide with the entries in the cluster (the lookup subroutine in line 2 of Algorithm 2). We count the average number of collisions between the entries from the query cluster and those of each candidate cluster (lines 3-11) and return the top-k clusters with the highest counts. This drastically reduces the number of comparisons and the running time when the matrices are large (Section 7.2).
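The following is a simplified sketch of this candidate retrieval, using signed random projections as the hashing scheme; the bit width and the collision scoring are illustrative assumptions rather than the exact scheme used in MELODY.

```python
# Hedged sketch of LSH-based candidate retrieval for cluster merging,
# using signed random projections: similar vectors share the same bit
# pattern with high probability, so bucket collisions approximate similarity.
import numpy as np
from collections import defaultdict

class LSHIndex:
    def __init__(self, dim, n_bits=16, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_bits, dim))  # random hyperplanes
        self.buckets = defaultdict(set)                   # hash key -> vector ids

    def key(self, v):
        return tuple((self.planes @ v > 0).astype(int))   # sign pattern as the hash key

    def add(self, idx, v):
        self.buckets[self.key(v)].add(idx)

def top_k_neighbor_clusters(query_members, clusters, index, vectors, k=3):
    """Rank candidate clusters by how often their members collide with the
    query cluster's members in the LSH table (clusters: {id: [vector ids]})."""
    collided = set()
    for i in query_members:
        collided |= index.buckets[index.key(vectors[i])]  # table look-up
    scores = [(len(collided & set(m)) / len(m), cid) for cid, m in clusters.items()]
    return [cid for _, cid in sorted(scores, reverse=True)[:k]]

# Usage sketch: index all row vectors of a dense matrix X, then query.
# index = LSHIndex(dim=X.shape[1])
# for i in range(X.shape[0]): index.add(i, X[i])
# print(top_k_neighbor_clusters([0, 1], {2: [2, 3], 4: [4, 5]}, index, X))
```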

4.4.2 Strategies Addressing Skewness and Sparsity

Empirically, we observe two challenges when computing results on real datasets, for which we provide the following heuristics; we demonstrate their effectiveness in Section 7.2:

Smoothing the explanation values: When an explanation model assigns values to important features of an instance, the values can be very high (e.g., for extremely sensitive features). This could distort the calculation of the loss function (Equation 3) and prevent instances with similarly activated features from being grouped. Therefore, to obtain a more even data distribution in the explanation matrix, we cap the values at the knee point of the overall value distribution in the matrix, found with a knee finding algorithm [satopaa2011finding].
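A minimal stand-in for this smoothing step is sketched below; it uses a simple maximum-distance-to-chord knee finder on a dense matrix instead of the cited kneedle implementation.

```python
# Minimal stand-in for the value-smoothing heuristic: cap explanation
# values at the knee of their sorted distribution. A simple
# max-distance-to-chord knee finder replaces the cited kneedle algorithm.
import numpy as np

def knee_clip(E):
    vals = np.sort(E[E > 0])[::-1]                  # nonzero values, descending
    if len(vals) < 3:
        return E
    x = np.linspace(0.0, 1.0, len(vals))
    y = (vals - vals.min()) / (vals.ptp() + 1e-12)  # normalize the decreasing curve
    chord = y[0] + (y[-1] - y[0]) * x               # straight line between endpoints
    knee_value = vals[np.argmax(chord - y)]         # largest gap below the chord
    return np.minimum(E, knee_value)                # cap extreme values
```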

Pre-clustering for a cold start in a sparse environment: Given a sparse explanation matrix, the bottom-up approach might face difficulties in clustering entries when the cost function is stuck in a local minimum. Also, as the matrix is sparse, it is hard for the algorithm to know at the beginning whether there are cluster structures at all. These issues adversely affect the formation of significant clusters. To address this cold start problem, we borrow from spectral graph partitioning [dhillon2001co] to create relatively small initial partitions of rows and columns using their singular vectors from an SVD decomposition. Then, we use our information-theoretic objective function to compress the matrix further.
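As a stand-in for this pre-clustering step, one could use scikit-learn's SpectralCoclustering, which implements the same spectral bipartite partitioning idea [dhillon2001co]; the number of initial partitions below is an arbitrary illustrative choice.

```python
# Hedged sketch: spectral co-clustering as a cold-start partition of the
# explanation matrix, to be refined by the information-theoretic merging.
import numpy as np
from sklearn.cluster import SpectralCoclustering

def pre_cluster(E, n_partitions=20, seed=0):
    model = SpectralCoclustering(n_clusters=n_partitions, random_state=seed)
    model.fit(np.asarray(E) + 1e-9)       # tiny offset avoids all-zero rows/columns
    return model.row_labels_, model.column_labels_
```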

/* Build the LSH tables for rows and columns after line 2 in Algo. 1 */
/* Replace R with knn in line 7 of Algo. 1 */
Input:  v -- query cluster
        V, H -- remaining clusters and the LSH table
        k -- number of neighbors
Output: knn -- top k nearest neighbors
 1: count[u] ← 0 for every u ∈ V                 /* initialize counter */
 2: S ← lookup(H, v)                             /* get entries that collide with v */
 3: for s ∈ S do
 4:     for u ∈ V do
 5:         if s ∈ u then                        /* collision between the clusters */
 6:             count[u] ← count[u] + 1
 7:             break
 8:         end if
 9:     end for
10: end for
11: knn ← the k clusters u with the highest count[u] / |u|
Algorithm 2: Top-k Nearest Neighbor Query
Figure 3: Explanation Summary Visualization Design. A: The data flow view shows how data flows from each class to different instance clusters, providing a sense of the overall data distribution among different model decisions. B: The summary adjacency list presents each instance cluster as a row and each explanatory feature cluster as a color. The explanation values inside each co-cluster are visualized as a gradient generated in (1). C: The legends show the mappings between the features and the color encodings of their respective feature clusters.

5 Design Considerations

Based on Section 3 and Section 4, we distill the main design considerations for an interactive visualization interface addressing a holistic XAI workflow and the summary's characteristics. Considerations marked pink address the tasks in Section 3 and those marked blue address the data perspective in Section 4.


  • C.1 Visual Summary (pink): Synthesize instance and feature summaries. Clusters of instances and clusters of features should be displayed together to understand the decision boundaries (knowledge of a model) on different subsets (the influenced population).

  • C.2 Sparse Summary (blue): Scalable visualization for large sparse data. As the local explanations are highly customized and independent, the explanation summary will also be a sparse matrix. The visualization needs to highlight small but significant co-clusters.

  • C.3 Prediction Outcome (pink): Display instances' outcomes from the model. Knowing when and where the model can fail is essential to understanding its capability. Thus, the prediction outcome should be embedded in the visual summary and explanations.

  • C.4 Filtering (pink): Filter the data summary by classes or features. When a user collects insights from local or class explanations, the insights need to be verified on a larger population. Thus, filtering by classes or features acts as a query that carries a local analysis back to refine the explanation summary in the global view.

  • C.5 Level-of-detail (blue): Different level-of-detail presentations for tabular, image, and text data. Levels of detail may come in different forms for different data primitives. Although the explanation summaries are the same, when users drill down into details, the presentations should be different.

  • C.6 Explanation in a Loop (pink): Connect local, global, and class explanations as a loop. The three main themes of XAI should be connected for a complete ML model explanation (Figure 2). The different views related to different scopes of explanation should be tightly integrated.

6 MELODY UI

Based on our design considerations in Section 5 and our algorithm in Section 4, we present MELODY UI, an interactive system for helping users understand an ML model's decisions on an input dataset (the system can be accessed at http://128.238.182.13:5004/). The interface consists of (A) a main explanation summary visualization, (B) an original data subset view for a selected summary, and (C) an instance view. In the following discussion, we focus on the main summary visualization and how the XAI workflow in Figure 2 is established in a visual analytics fashion.

6.1 Explanation Summary Visualization Design

The explanation summary visualization (Figure 3) contains three visual components: the data flow, the adjacency list, and the legends. The dataflow shows how instances from different classes flow to different instance clusters through a Sankey diagram. The adjacency list displays the data summary from the local explanations. The legends display the features and their corresponding color encodings in the adjacency list.

6.1.1 Adjacency List

The main visual component of MELODY UI is the adjacency list of the explanation summary. The explanation summary is a matrix over two sets: instance clusters and feature clusters. The intersection between an instance cluster and a feature cluster is a real-valued submatrix of original explanations. Therefore, the simplest way to present the explanation summary is to directly show the original explanation matrix with rows and columns ordered according to their cluster memberships. However, we found the co-clusters hard to observe when the matrix is sparse (C.2). Since an ML model's decisions are usually diverse on different subsets of the input data, the clustering will also result in many different row and column clusters, making it difficult for users to notice small clusters. Also, we found the information obtained from the matrix hard to memorize when users perform multiple visual inspections and interactions on different widgets at the same time. For example, when a feature cluster is selected, users inspect the features inside it in a separate view; after the inspection, it becomes difficult to recall which feature cluster they selected inside the matrix. These problems related to sparsity and stimulus have been identified and thoroughly studied in previous literature [ghoniem2005readability, hlawatsch2014visual, okoe2018node].

To explore relevant instances and features in a large sparse matrix (C.1-2), we design an adjacency list visualization (inspired by [hlawatsch2014visual]) to present the explanation summary (Figure 3B). Each row in the adjacency list represents an instance cluster, and each color texture represents a feature cluster. The size of an instance cluster is encoded with text and height. For a feature cluster, the size is encoded with width. Thus, each intersection between an instance and a feature cluster forms a block (i.e., a cell in the summary matrix). The blocks in each row are sorted by their values. In this arrangement, we fix the instances' positions so that users can locate a subset of data easily. Also, the features are color encoded so that users can reference an explanatory feature easily by its color, which helps navigate the features across different widgets (C.6). Furthermore, as the column position restriction is removed, the adjacency list becomes more compact. We acknowledge that a categorical color scheme might impose a scalability issue; thus we combine colors with textures to increase the number of available encodings.

Visualizing Local Explanation Values. Each block is a co-cluster between a group of instances and a group of features; thus it is also a sub-matrix of the original explanation matrix. As a sub-matrix contains a distribution of positive real numbers, we display this information as a histogram encoded by a color gradient (Figure 3(1)). The values in the sub-matrix are sorted from high to low and then encoded by a sequential color scheme. The sorting provides better clarity on the quality of co-clusters under the sparse matrix clustering condition (C.2).

Figure 4: Information loss of three datasets’ explanation summaries after applying the heuristics from left to right (with final loss reduction shown).

6.1.2 Data Flow

To provide a picture of how data and predictions are arranged in the summary, a Sankey diagram is displayed (Figure 3A) to the left of the adjacency list. A vertical rectangle is shown for each class, with its height encoding the number of instances in the dataset. The filled portion of each rectangle is proportional to the number of instances in the currently shown summary. The horizontal flows represent the portion of data falling into a designated instance cluster. Different colors in the flows represent the amounts of data that are either correctly predicted (grey) or incorrectly predicted (red). This helps users assess the capability of the ML model: its performance on each class and the accuracy of its different decision boundaries (C.3).

6.1.3 Legends

The legends (Figure 3C) show the color and texture encodings of the clusters of explanatory features. The features are sorted based on their presence in the current summary. When we click on a feature, the distribution of its explanation values in the dataset is shown as a histogram (Figure 3(2)) to allow the inspection of its global importance (C.1).

6.1.4 Interactions

The explanation summary can be filtered through various mechanisms. Besides explicitly selecting classes and explanatory features for filtering in the dropdown menus, statistical properties such as cluster sizes and the average explanation value of a co-cluster allow the summary to be filtered by sliding different thresholds. Also, when clusters are selected, the values in the subset are shown in a parallel coordinates view, and important instances from a sparse cluster (C.2) can be exported to the subset view by brushing the axes.

6.2 Visual Analytics Workflow of ML Model Explanation

We now describe how to leverage the explanation summary to complete a visual analytics workflow. The interactions between different explanations in Figure 2 are consistent with MELODY UI's views. While the adjacency list acts as a global overview of the ML model explanation, the components in the list can be selected and exported to a more focused class and instance inspection. In return, the adjacency list can use the findings from local explanations for verification or further insights. Thus, the workflow in the system forms a finite state transition among global, subset (class), and instance explanations. The discussion below mainly focuses on how the system helps circulate among the different XAI tasks.

6.2.1 Global → Subset (Class) Explanation

After exploring the adjacency list, users can proceed to a subset of the clusters by clicking on a row cluster or a co-cluster (zoom and filter). After selecting an explanation subset and extracting the instances with significant values, users can proceed to understand the local decision logic from the behavior of the instances inside. To provide contextual explanations for tabular, image, and text data, we propose three different ways to visualize the subsets (C.5).

Tabular. The system visualizes the tabular data in multiple sets of parallel coordinates (Figure 5D). Each set of parallel coordinates represents one class, and the lines inside represent the instances. The axes show the features in the original dataset, and the selected features are positioned at the front. The lines are colored based on whether their predictions are correct. There are also two histograms on each axis that represent the distributions of the correctly and incorrectly predicted instances.

Image. For image data, the system shows the similarly explained instances in one column and their corresponding common visual representations in another column (teaser figure, B). The instances shown are also grouped by their classes and are surrounded by colored frames that indicate the predictions. All instances' and features' images are displayed to give a visual impression of similar images and explanations.

Text. The system shows the number of selected instances as bar charts grouped by class and prediction outcome on the left column, and the topics and words that are used to explain the instances on the right column (Figure 6C). Users can understand what kinds of words are combined to make decisions on each prediction and further select individual words inside each topic to filter the bar charts. When a bar is clicked, the documents can be exported to the local explanation view.

6.2.2 Subset (Class) → Local Explanation

After a subset of instances and features are inspected, users can drill down to inspect an instance with full details for insights or hypotheses (detail on demand). Similarly, different arrangements are provided to inspect instances from tabular, image, and text data (C.5).

Tabular Instances. The instances are selected by brushing the parallel coordinates in the subset view and rendered in the data table with original features to browse the exact numerical and categorical values. The color of each cell represents the prediction outcome.

Single Image. An image and its top influencing features (image patches with the highest similarity scores) are displayed. The instance and features also have bounding boxes of neuron activations, allowing inspection of the relationships between different patches.

Text Documents. The full documents selected from the bar charts are shown. The words that are explanatory features in the document are highlighted by a sequential color map with their explanation values.

Figure 5: Use case of understanding a neural network of credit risk classification trained on tabular data. A, A1, A2: The explanation summary of the whole data, counterfactual of the query, and similar neighbors from the query, respectively. B: Explanatory features with value distributions to understand the popularity among the dataset. C: Explanation details for filtering and zooming significant explanations. D, D1, D2: Subset views of the selected subsets from the summary.

6.2.3 Local → Global Explanation

Users might formulate insights and hypotheses throughout the top-down inspection. For testing the hypotheses, the explanatory features and instances in the local explanation panel are clicked, and their values become the conditions for filtering the adjacency list (Query) (C.4). For tabular data, when a cell in the data table is clicked, the logic that includes the cell’s value will be included (e.g., when a categorical cell valued “education” in column “purpose” is clicked, the explanatory feature “purpose = education” will be selected). For image and text data, the user clicks on the image or document for class queries and the image patches or words for feature queries. Overall, users can filter the explanation summary by class, prediction outcome, and explanatory features. As a result, a new and refined overview summary is available to perform global explanation tasks again, which completes the loop.

7 Evaluation

To evaluate the scalability and the quality of MELODY, we perform quantitative experiments and present use case scenarios on a variety of datasets.

7.1 Experimental Setup

The implementation is written in NumPy, and the experiments are run on a MacBook Pro with a 2.4 GHz 8-core Intel Core i9 CPU and 32 GB RAM. We use the following real-world datasets and ML models to conduct our experiments and use cases:

Caltech-UCSD Birds-200-2011 Images. The dataset includes 11,788 images of 200 species of birds. We use a Convolutional Neural Network (CNN) with a prototype layer [chen2019looks], trained to its best test accuracy. The explanation matrix is extracted from the prototype layer, which has 1330 visual explanatory features.

Home Equity Line of Credit (HELOC). It contains binary classifications of risk performance (i.e., good or bad) on 10,459 samples with an even class distribution. We train a two-layer neural network to its best test accuracy. We extract 167 logics and use SHAP [lundberg2017unified] to construct our explanation matrix.

US Consumer Finance Complaints. The dataset contains 22,200 documents with ten classes (e.g., debt, credit card, and mortgage). We train an LSTM neural network model to its best test accuracy. We use IntGrad [sundararajan2017axiomatic] to generate explanations for the words in each document. We further combine the words by clustering their embeddings to generate 500 topics as the explanatory features.
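As an illustration of how such a tabular explanation matrix could be produced, the sketch below feeds a generic prediction function to SHAP's KernelExplainer; the model, background sample, and scaling are placeholders, and the paper's actual pipeline additionally expands the attributes into 167 logic ranges before constructing the matrix.

```python
# Hedged sketch: local explanations for a tabular classifier with SHAP,
# collected into an explanation matrix of the form MELODY consumes.
# Assumes a single-output predict function; not the paper's exact pipeline.
import numpy as np
import shap

def shap_explanation_matrix(predict_fn, X_background, X):
    explainer = shap.KernelExplainer(predict_fn, X_background)
    shap_values = explainer.shap_values(X)    # one importance per feature per instance
    E = np.abs(np.asarray(shap_values))       # keep magnitudes
    return E / (E.max() + 1e-12)              # scale into [0, 1]
```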

Algorithm | Tabular | Image | Text
Baseline | 33 mins | 21 mins | hours
Baseline + LSH | 5 s | 13 s | 9 s
Table 2: Run time on different datasets.

7.2 Quantitative Evaluation

End-to-end quality evaluation. To evaluate how each of our heuristics improves the quality of the summarization results, we report the quality (information loss) of the baseline implementation (i.e., straightforward minimization of Equation 2) as well as the effects of applying marginalization (Equation 3), smoothing, and pre-clustering from Section 4.4.2. Overall, the heuristics significantly improve the quality of the result (Figure 4). The final reductions of information loss range from 78% to 99%. To visually convey the quality of the summarization results, we provide visual outcomes of the explanation summaries in Figures 8-10 in the Appendix.

Effect of data sketches on run time performance. We report the run times on the three datasets with and without the speedup strategy (Algorithm 2) in Table 2. The results clearly show that replacing the quadratic computation in the baseline approach (Algorithm 1) makes it possible to produce results in interactive time. We also observe that the calculation of the information loss is not linear in runtime, since many data slicing operations are needed to compute the approximation matrix. The results highlight the importance of limiting the number of candidate comparisons in the bottom-up process.

Figure 6: Use case of understanding a neural network of document classification trained on text data. A: The explanation summary for customer loan complaints after filtering by class in 1⃝. B: By selecting a row cluster in 2⃝, the details of explanation values are shown for users to select significant instances and features by brushing in 3⃝. C: The subset view displaying the distributions of documents in each class related to the explanatory topics. D: Clicking a blue bar in 4⃝ shows all the correctly classified documents that are explained by the selected topics. 5⃝ Clicking the words formulates a query that extracts all documents predicted by the words in the model.

7.3 Use Cases

We present a usage scenario of understanding deep neural networks for image recognition and two use cases regarding tabular and text classification of financial data. Our goal is to demonstrate that our technique generalizes to different ML model interpretation challenges in understanding models and datasets.

7.3.1 Usage Scenario: Understanding an Image Classifier

We first describe a hypothetical walkthrough of understanding what a deep learning model has learned from a set of images (teaser figure). We use images as examples because the visual presentations are intuitive to understand. Imagine Chris, an ornithologist, wants to study how birds' appearances distinguish their species. He downloads the data and runs the ML model to understand how the machine learns the visual features.

Understand the summary. Chris uses MELODY to generate an explanation summary consisting of 37 instance clusters and 49 feature clusters. He imports the result into MELODY UI. After filtering out small clusters and clusters with low explanation values, Chris discovers three broad groups of birds with similar prediction logics (teaser figure, A). Each group contains different visual explanatory features (i.e., color blocks), so he decides to go through the instance groups one by one.

Inspect an interesting subset. Chris clicks the text box on a row to select all the instances and features from that instance cluster for a detailed inspection. From the subset view (teaser figure, B), he realizes that the neural network learns to group birds with similar colors (yellowish birds) (teaser figure, B(1)) for a coarse level of decision making. The images are then further classified by more detailed image patches such as the bird's head and belly. Chris notices that some classes, such as the yellow-throated vireo, have many wrong predictions (images with orange frames) in this subset. Therefore, he clicks on some of the images to examine an image and its classification logic in full detail.

Develop hypotheses by inspecting an instance. Chris checks an image by clicking it in the subset view. The image and its top explanatory features are then shown in the instance inspection view (teaser figure, C). He sees that a yellow-throated vireo is wrongly predicted as a blue-winged warbler because the feathers on its neck look similar to those of a blue-winged warbler. Chris finds the whole process enjoyable, since he quickly identifies the reasoning processes of the model on hundreds of images within a simple visual analysis journey.

7.3.2 Tabular Use Case: Understanding the Data Capability

We now present a use case about approaching the limit of predictability in training a dataset. Understanding how the current features help to make predictions allows the financial worker to make improvements to the current credit system.

Understand the summary. After filtering by value threshold and the number of instances in the clusters, the analyst obtains a visual explanation summary (Figure 5A). It shows that blue-colored blocks, consisting mainly of items related to delinquency (Figure 5B), occupy most of the rows. When he clicks and inspects the subsets and filters out low explanation values by brushing the parallel coordinates (Figure 5C), the subsets show very similar behavior: for customers who have no history of delinquency, the model labels them as "good".

Discovering more detailed logics in the model. The analyst sees that such logic yields a good approximate accuracy on more than half of the population. To understand how a "bad" outcome can be correctly predicted despite a good delinquency record, he refines the summary by class. The summary shows another logic that influences the model outcome (Figure 5A(1)): the pink blocks represent features related to a low external risk estimate, which means that a customer would still be graded as "bad" if the external risk estimate is low (Figure 5D(1)).

Verifying insights. From the data flow, the analyst sees that combining delinquency and the risk estimate yields a good prediction result. To verify this hypothesis, he filters the summary to show only the wrong predictions under the same condition. By adjusting the value threshold to a low extent, he finds that the wrong predictions are mainly attributed to not having a low risk estimate (i.e., the pink blocks related to the risk estimate are missing where blue blocks are present) (Figure 5A(2)). Clicking the rows with missing pink blocks also reveals that the model fails to identify bad risk when the customer has a good delinquency record and high external risk estimates (Figure 5D(2)). Throughout the visual analysis at different granularities, the analyst acquires an overview of the model: it mainly decides by the history of delinquency on the first level of reasoning, then further screens out the bad risks by low external risk estimates. The query panel shows that the rows explained by either of these two logics cover a large portion of the whole dataset.

7.3.3 Text Use Case: Predicting Customer Complaints

We present a use case of exploring a text classification model to understand different types of customer complaints. Understanding how customers complain can improve the call center's services. Our financial analyst first uses MELODY to acquire 28 instance clusters and 23 feature clusters. Then, as he is from the loan division of the company, he filters the explanation summary by the "customer loans" class to explore customers' inquiries related to loans (Figure 6(1)).

Identify useful subsets. The analyst first discovers that the explanation summary is very sparse (Figure 6A). Therefore, he clicks on the block to examine a more detailed view of the explanation subset (Figure 6(1)). The details of value distributions in the selected block are shown in the explanation parallel coordinates (Figure 6B). The analyst discovers that the sparsity mainly comes from the low usage of many topics in the feature clusters. Thus, he brushes the axes of topics that have high values to acquire the subset of instances and topics that heavily correlate to each other. The result of the brushing is shown in the subset view (Figure 6C).

Discover interesting topics in the subset. From the subset view, the analyst discovers that many complaints are related to words such as ”auto”, ”bmw”, and ”ford”. These words belong to automobiles and vehicles. Given the longer length of correctly labeled (blue) bars in the bar chart, these words contribute significantly to the correct prediction of customer loan complaints to the model. Therefore, by clicking the blue bar, the analyst inspects the raw documents of these automobile-related instances (Figure 6D).

Insights from instances. By browsing the documents, the analyst confirms that the loan payment complaints are related to vehicle purchases. By clicking words like "vehicles" and "cars", he queries the global explanation summary to verify his findings (Figure 6(5)). The queried results show that there are more than 120 complaints about customer loans containing such phrases, with a correctness of 93%. Thus, he concludes that automobile purchases are a popular topic when customers approach the financial institution, and the company should include these topics in its call center training sessions.

8 Conclusion and Future Work

In this work, we present MELODY, an interactive algorithm to construct an explanation summary of an ML model from the local explanations of a dataset. The summary allows users to understand the decision rationale and data characteristics together for a holistic XAI experience. Along with the algorithm, we also present MELODY UI, an interactive visual analytics system to connect different granularities of XAI tasks. The versatility of our algorithm and system enables scalable visual exploration of generic ML model interpretations on tabular, image, and text data. Our future work includes:

Embed summarization into model training processes. Instead of generating a summary after the training, we plan to integrate the summary as a layer in the deep neural network to increase the global explanation capability of the model.

User study. Since many model developers use visualizations such as partial dependence plots or projections to understand their models, we plan to conduct a user study to see whether providing explanatory features and similarity among data at the same time improves productivity in practice. Also, we plan to conduct a longitudinal evaluation of MELODY UI with ML researchers to investigate how the system affects model design, data engineering, and model debugging.

Application domains. Apart from tabular, image, and text classification, there are other data primitives, such as time series and graph classification tasks. We plan to explore visual analytics approaches for applying our algorithm to explain ML tasks in these domains.

References

9 Appendix

9.1 Background of Local Explanation Models

We provide a background on the mainstream models that generate local explanations of a machine learning model's decisions on a dataset. Local explanation methods (as opposed to logical models such as decision trees or rules) are popular because they provide an independent and highly customized explanation for each instance. When explanations do not need to aggregate into general decisions or rules, they become more faithful to the original model.

In general, to generate a local explanation for an instance, explanation algorithms usually seek one of the following approaches:

  1. Local Linear Models: The algorithm searches the neighbors of an instance and then fits the subset to a linear model such that the higher the gradient of a feature in the linear model, the more important the feature is to the prediction of the selected instance. SHAP [lundberg2017unified] and LIME [ribeiro2016should] are examples that use neighbors to evaluate an instance (see the sketch after this list).

  2. Perturbation: Instead of using other instances to generate explanations, one can perturb the values of an instance's attributes and observe whether the output changes significantly after removing, masking, or altering them. A high sensitivity of a feature implies that its value lies near the decision boundary of the machine learning model; thus, a feature that is sensitive to perturbation has high influence on the instance. This method has been applied to Convolutional Neural Networks (CNNs) for image classification [zeiler2014visualizing].

  3. Prototype: The intuition is to use representative original training data (i.e., prototypes) to explain a prediction; the prototypes can be selected by clustering the latent representations in the ML model. Given the class labels of the prototypes and their similarities with the input data, the prediction and reasoning process becomes a scoring system where the class with the highest score (i.e., the sum of similarities of the prototypes belonging to that class) is returned as the result. This technique has been incorporated in deep neural networks for image, text, and sequence predictions [chen2019looks, li2018deep, ming2019interpretable].

  4. Backpropagation: Since complex models like neural networks produce predictions through a series of weight propagations from the input to the output neurons, one can invert the process and backpropagate the neurons with large gradients from the output back to the input to locate the portion of the original data that causes the output activations. Such a portion highlights the meaningful features that explain the model's decision. Saliency Maps [simonyan2014deep], DeepLIFT [shrikumar2017learning], and Intgrad [sundararajan2017axiomatic] are examples of such methods.
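
To make approach (1) concrete, the following is a minimal sketch of fitting a local linear surrogate around a single instance with LIME. The dataset, classifier, and parameter values are illustrative assumptions rather than the setup used in this paper.

    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from lime.lime_tabular import LimeTabularExplainer

    # Placeholder data and black-box model; any classifier with predict_proba works.
    data = load_breast_cancer()
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(data.data, data.target)

    explainer = LimeTabularExplainer(data.data,
                                     feature_names=list(data.feature_names),
                                     class_names=list(data.target_names),
                                     mode="classification")

    # LIME perturbs neighbors of the instance, fits a weighted linear model on them,
    # and reports each feature's local weight as its importance.
    exp = explainer.explain_instance(data.data[0], model.predict_proba, num_features=5)
    print(exp.as_list())

Running such an explainer over every instance yields one importance vector per row, i.e., the kind of explanation matrix that our summarization takes as input.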

9.2 Tasks Breakdown of Explainable AI

In general, ML model explainability is achieved through three types of tasks (T): Global (the general behavior of a model), Local (the behavior of a model on an instance), and Class (the behavior of a model on a class) [guidotti2018survey, liao2020questioning]. Liao et al. [liao2020questioning] additionally provide actionable suggestions (A) for each task. For each task and action, we identify opportunities (O) where an explanation summary can help achieve the task.

  • Global Explanation. The goal is to understand the overall weights of features used by the model to explain how AI makes decisions on the dataset in general.

    • Users select the important features that affect the whole dataset's outcome to uncover data-related issues in data collection, bias, and privacy.

    • They may also evaluate the limits and capabilities of the model. By inspecting the main features, users can develop a mental model to interact with or improve the system.

    • An explanation summary can define an appropriate level of detail to explain the model without losing too much detail or overwhelming the users. Grouping relevant features and instances allows users to interactively prioritize the information shown at a time.

    • Subsetting the information by similarity also decreases the complexity of the explanation, since the instances and features shown share many common properties. This makes the visualized global explanations more representative.

  • Local Explanation. The goal here is to inspect the model’s behavior on a specific instance and understand how the instance’s properties influence the outcome.

    • A popular activity is to explore different what-if scenarios. Users observe how the outcome changes when some feature values are altered, which helps them explore more scenarios of applying the model and gain insights into the model's capability.

    • Another action is to understand why the instance receives a particular prediction and why it does not result in other outcomes. This helps to discover the local decision boundaries of the model.

    • Presenting the original input/data gives the system a more holistic capability to explain a particular decision and accommodates users' understanding and interactions.

    • Grouping similar instances provides neighbors that the model explains similarly, which increases the number of instances available to support users' insights and findings. Can we conclude that “ears” are important in the prediction of “cats” from what we see in a single image? We can only know that if many cat images exhibit similar characteristics. An explanation summary thus allows a large set of instances to be analyzed to avoid spurious conclusions [wu2019errudite].

    • Similarly, grouping features allows users to prioritize the important features that explain an instance and its neighbors, which reduces the cognitive workload of deriving an understanding of the model's decision logic.

  • Class Explanation (Counterfactual). How a prediction (class) works in the model is also an important emphasis. It is similar to a global explanation but at a finer granularity, focusing on a specific class. Yet, the actions to understand a class are more similar to instance explanations, which focus on the sensitivity of features to each prediction.

    • Testing the sensitivity of features towards a prediction is equivalent to testing different what-if scenarios. By testing different ranges of feature values, users can understand the decision boundaries of a prediction class.

    • Besides interaction, exploring the relevant features of a prediction also helps users understand the why and why-not cases of a prediction and gain insights into the decision logic.

    • With groups of similar instances and features, users can apply different levels of detail to acquire more precise subsets.

    • Extending the findings on an instance to its similar neighbors inside a class increases the confidence in the insights.

9.3 End-to-End Explanation Modeling Pipeline

In this section, we describe example pipelines for handling tabular, image, and text data that result in explanation matrices with the explanatory features in Table 1. We explicitly divide each pipeline into preprocessing, ML modeling, and explanation modeling stages. Notice that these are not the only ways to achieve the objectives of data engineering; the explanation models can be interchanged as well. In addition, we provide a synthetic example in Figure 7 to illustrate how the whole explanation process works, as well as our goal of summarizing the whole explanation data.

9.3.1 Tabular Data

Preprocessing.

To enable logics as the explanatory features for tabular data, we preprocess the original data into one-hot encodings of logics under each attribute. For numerical and ordinal data, the attributes are first discretized into quantiles. Then, one-hot encoding transforms the quantiles into separate columns, where 0 indicates the data does not fall into the range and 1 indicates it does. Discretizing attributes can be as straightforward as choosing a fixed number of equal intervals or can leverage statistical properties such as entropy. In our use case, we use Sturges' rule to determine the number of quantiles, and the quantile ranges are determined by the training data. One-hot encoding can also be applied directly to categorical attributes.
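
As a rough illustration, the sketch below discretizes numerical attributes into quantile bins (with the bin count from Sturges' rule) and one-hot encodes the resulting logics. The column names and toy data frame are placeholders, not the HELOC schema.

    import numpy as np
    import pandas as pd

    def to_logic_encoding(df, categorical_cols=()):
        n_bins = int(np.ceil(1 + np.log2(len(df))))  # Sturges' rule
        encoded = []
        for col in df.columns:
            if col in categorical_cols:
                encoded.append(pd.get_dummies(df[col], prefix=col))
            else:
                # Each resulting column is a logic such as "age in (30.0, 42.5]":
                # 1 if the value falls into the quantile range, 0 otherwise.
                bins = pd.qcut(df[col], q=n_bins, duplicates="drop")
                encoded.append(pd.get_dummies(bins, prefix=col))
        return pd.concat(encoded, axis=1).astype(int)

    # Toy usage; in practice the quantile ranges come from the training split.
    df = pd.DataFrame({"age": np.random.randint(20, 70, 200),
                       "income": np.random.lognormal(10, 1, 200)})
    X_logic = to_logic_encoding(df)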

ML modeling. The transformed data is then used to train a neural network so that the logics are the input features. This allows the logics to be evaluated by the explanation methods.

Explanation Modeling. Since the input features of the ML model are a set of logics, methods such as LIME and SHAP can be applied directly to the model and dataset to generate feature vectors composed of logics.
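
Continuing the preprocessing sketch above (reusing the placeholder df and X_logic), a hedged sketch of the modeling and explanation stages might look as follows; the MLP architecture, background sample, and the choice of SHAP's KernelExplainer are illustrative assumptions, not the paper's exact configuration.

    import numpy as np
    import shap
    from sklearn.neural_network import MLPClassifier

    X = X_logic.values
    y = (df["income"] > df["income"].median()).astype(int)  # toy target

    model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
    model.fit(X, y)

    # KernelExplainer is model-agnostic and works on the logic-encoded inputs.
    explainer = shap.KernelExplainer(model.predict_proba, X[:50])
    sv = explainer.shap_values(X[:10])  # a few rows for brevity

    # Depending on the SHAP version, per-class attributions come back as a list of
    # arrays or a single stacked array; either way, each row is a local explanation
    # over the logics, i.e., one row of the explanation matrix to be summarized.
    explanation_matrix = sv[1] if isinstance(sv, list) else sv[..., 1]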

9.3.2 Images

Preprocessing. For images, we do not need much feature engineering, as the explanatory features are the pixels themselves. We only apply standard image augmentation techniques (i.e., replicating training images with scaling, rotation, and mirroring) to increase the training data size for better model accuracy.
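
A minimal sketch of such an augmentation pipeline, assuming a PyTorch/torchvision setup (the specific transforms and parameters are illustrative, not the exact configuration used here):

    from torchvision import transforms

    # Random crops, flips, and rotations replicate training images with
    # geometric variations before they are fed to the CNN.
    augment = transforms.Compose([
        transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),  # scaling
        transforms.RandomHorizontalFlip(),                    # mirroring
        transforms.RandomRotation(degrees=15),                # rotation
        transforms.ToTensor(),
    ])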

ML modeling. We apply prototype learning inside a Convolutional Neural Network [chen2019looks], which adds a prototype layer on top of the last layer of the original neural network. The training process selects a fixed number of image patches from the training data as prototypes, which are then used to reason about the predictions of new data.

Explanation Modeling. As the explanation model is already incorporated as a layer in the ML model, when new data comes in, an n × m explanation matrix can be constructed, where n is the number of tested instances and m is the number of prototypes.
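
As a rough sketch of how such an n × m matrix could be assembled, the snippet below scores each instance's latent representation against every prototype using a ProtoPNet-style log-distance similarity; the latent codes and prototype vectors here are random placeholders rather than outputs of a trained network.

    import numpy as np
    from scipy.spatial.distance import cdist

    n, m, k = 256, 20, 128              # instances, prototypes, latent dimensions
    latent = np.random.randn(n, k)      # latent codes from the CNN backbone (placeholder)
    prototypes = np.random.randn(m, k)  # learned prototype vectors (placeholder)

    # Higher similarity means the instance is closer to the prototype.
    dist = cdist(latent, prototypes, metric="sqeuclidean")
    explanation_matrix = np.log((dist + 1.0) / (dist + 1e-4))  # shape (n, m)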

9.3.3 Text

Preprocessing. Similar to images, the explanation of text comes from the words inside the documents themselves. Thus, we only need to apply standard text preprocessing steps, such as removing stopwords and infrequent words, to make sure the explanation models do not return explanations with meaningless topics.
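
For instance, a minimal sketch of this filtering with scikit-learn (the stopword list, threshold, and toy corpus are arbitrary assumptions):

    from sklearn.feature_extraction.text import CountVectorizer

    docs = ["I was charged twice for my auto loan",
            "The bank lost my payment"]  # placeholder corpus

    # Drop English stopwords; on a real corpus a larger min_df would also
    # remove infrequent words (min_df=1 is used here only so the toy corpus runs).
    vectorizer = CountVectorizer(stop_words="english", min_df=1)
    counts = vectorizer.fit_transform(docs)
    vocabulary = vectorizer.get_feature_names_out()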

ML modeling. We can use common text models such as RNNs and LSTMs to generate predictions. Notice that the first layer of these models is usually a word embedding over the whole vocabulary. We can leverage these word embeddings to extract topics in the dataset by clustering the words based on them.

Explanation Modeling. After training the ML model, we can examine each word's importance to the prediction with gradient-based explanation models such as DeepLIFT and Intgrad. This results in an extremely sparse matrix where each feature is a word that appears in at least one document. Also, words with similar meanings such as “good” and “excellent” are treated as different features. To densify the explanation matrix so that similar words are grouped and more significant hidden structures can be revealed, we transform the local explanation from a feature vector of words into a feature vector of topics. The explanation importance of a topic for an instance is the maximum explanation importance among the words in the topic. This allows words with similar semantics to be grouped before the matrix is summarized.
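
A hedged sketch of this densification step: words are clustered into topics by their embeddings, and each topic's importance for a document is the maximum importance of its member words. All arrays, sizes, and the use of KMeans are illustrative placeholders.

    import numpy as np
    from sklearn.cluster import KMeans

    n_docs, vocab, emb_dim, n_topics = 500, 2000, 100, 50
    word_importance = np.abs(np.random.randn(n_docs, vocab))  # e.g., from DeepLIFT/Intgrad (placeholder)
    word_embeddings = np.random.randn(vocab, emb_dim)         # first-layer embeddings (placeholder)

    # Cluster words into topics using their embeddings.
    topic_of_word = KMeans(n_clusters=n_topics, n_init=10, random_state=0).fit_predict(word_embeddings)

    # Topic importance = max importance over the topic's member words, per document.
    topic_importance = np.zeros((n_docs, n_topics))
    for t in range(n_topics):
        members = topic_of_word == t
        topic_importance[:, t] = word_importance[:, members].max(axis=1)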

Figure 8: Explanation summary matrix of the HELOC dataset’s explanation matrix. Rows represent the instances and columns represent the explanatory features. Vertical lines represent row clusters and horizontal lines represent column clusters. The color reflects the explanation values.
Figure 9: Explanation summary matrix of the Caltech-UCSD Birds-200-2011 Images’ explanation matrix. Rows represent the instances and columns represent the explanatory features. Vertical lines represent row clusters and horizontal lines represent column clusters. The color reflects the explanation values.
Figure 10: Explanation summary matrix of the US Consumer Finance Complaints’ explanation matrix. Rows represent the instances and columns represent the explanatory features. Vertical lines represent row clusters and horizontal lines represent column clusters. The color reflects the explanation values.