Multi-objective Explanations of GNN Predictions

by   Yifei Liu, et al.
Lehigh University

Graph neural networks (GNNs) have achieved state-of-the-art performance in various high-stakes prediction tasks, but multiple layers of aggregation on graphs with irregular structures make GNNs less interpretable. Prior methods use simpler subgraphs to simulate the full model, or counterfactuals to identify the causes of a prediction. The two families of approaches aim at two distinct objectives, "simulatability" and "counterfactual relevance", but it is not clear how the objectives jointly influence the human understanding of an explanation. We design a user study to investigate such joint effects and use the findings to design a multi-objective optimization (MOO) algorithm to find Pareto optimal explanations that are well balanced in simulatability and counterfactual relevance. Since the target model can be any GNN variant and may not be accessible due to privacy concerns, we design a search algorithm using zeroth-order information without accessing the architecture and parameters of the target model. Quantitative experiments on nine graphs from four applications demonstrate that the Pareto efficient explanations dominate single-objective baselines that use first-order continuous optimization or discrete combinatorial search. The explanations are further evaluated in robustness and sensitivity to show their capability of revealing convincing causes while being cautious about possible confounders. The diverse dominating counterfactuals can certify the feasibility of algorithmic recourse, which can potentially promote algorithmic fairness where humans participate in decision-making using GNNs.




I Introduction

Graphs represent relations between entities and have been used to model social networks  [tang2009relational], biological networks [Marinka2017], and online reviews [rayana2015collective]. On prediction tasks on graphs, such as node classification, link prediction, and graph classification [kipf2017gcn, chen2018fastgcn, velivckovic2018gat, hamilton2017graphsage], GNN exploits the relations to aggregate information in a neighborhood of each node to achieve state-of-the-art predictive performance. However, the aggregations over many nodes multi-hops away make the GNN predictions too opaque to be understood and trusted by humans. Explanations of the GNN predictions try to simplify the computation to deliver societal merits, such as justifying the predictions, fulfilling legal regulation [Goodman2017], and algorithmic recourse [Ustun2019, Russell2019, Barocas2020fat]. For example, when warning an online shopper about frauds detected using GNN on a review graph [rayana2015collective], the user may ask “why I am a victim of frauds” and expect an explanation such as “the website you’re viewing has connections with certain suspicious IP addresses”. We focus on explaining GNN predictions made on graph nodes.











TABLE I: Prior explanation methods vs. the proposed method.

According to [Lewis1986], “To explain an event is to provide some information about its causal history” and the explanation of a prediction can be defined in two ways. First, an explanation can be a causal chain consisting of a forward mapping from inputs and model parameters (the “causes”) to model prediction (the “outcome”) via steps of computations. An explanation with good simulatability would allow humans to more easily forward simulate the causal chain (possibly a simplified version). Second, an explanation can be in the form of counterfactuals [Miller2019]: one event is said to have caused another if, in the counterfactual where the first event did not happen, the second would not have happened. Counterfactuals allow humans to see the impact of the cause on the outcome, and prior works show that humans do counterfactual reasoning in their day-to-day life [Miller2019, Binns2018chi]. We define counterfactual relevance as the amount of change in the probability that the outcome happens when the cause is altered. There are additional desiderata. To give humans a better sense of the causal relationship, an explanation of an outcome should be robust to perturbations irrelevant to the cause but sensitive to changes in the cause. Diverse counterfactuals for algorithmic recourse with minimal changes in the decision subjects, e.g., a human on a dating site, allow human agency in the decision-making [Ustun2019].

Explaining GNN is gaining more attention, and yet there is no study of the interactions between the two metrics, simulatability and counterfactual relevance, from the human and computation perspectives. Gradient-based methods [baldassarre2019explainability, pope2019explainability] use magnitudes of gradients to highlight important edges or node features. Such methods aim at counterfactuals since the gradients indicate how fast the prediction (the “outcome”) changes with respect to small perturbations in the highlighted input (the “cause”). Learning-based explanation methods, including GNNExplainer [ying2019gnn] and GNNLIME [graphlime], extract a simpler surrogate model to faithfully approximate a GNN prediction and thus promote simulatability, without concerning counterfactual relevance. Explanation methods based on gradients [ying2019gnn] need to access the target model as a whitebox and may break privacy and security constraints. See Table I and related work for comparisons.

Inspired by the two modes of human thinking studied in psychology [kahneman2011thinking], we hypothesize that human perception of an explanation is a function of both metrics. We conjecture a cognitive process where humans first intuitively make sense of the outcome in a lightweight forward simulation using an explanation (System 1), and then perform more effortful counterfactual reasoning (System 2) to figure out a cause of the outcome. If the explanation is rejected due to low simulatability in the first phase, humans will be less willing to seek the causes. Fig. 1 shows a pictorial representation of the hypothesis, with four categories of explanations. Gradient-based explanations are in the high counterfactual relevance, low simulatability category (region A), while GNNExplainer aims to achieve high simulatability but has no guarantee of high counterfactual relevance (region D). Table III in Section V shows the quantitative evaluation of these methods.

Fig. 1: Simulatability and counterfactual relevance interact.

To test the above hypotheses on GNN, we adopt simple (small, acyclic, and connected) subgraphs as explanations for forward simulation of a GNN prediction on a node. Each explanation is associated with a counterfactual explanation that has some elements removed from the explanation to flip the prediction. We generate explanations and counterfactuals in the four categories shown in Fig. 1 and measure how simulatability and counterfactual relevance interact to influence human perception of the explanations. Statistical analyses show that: 1) a low simulatability can, but not always, prevent the adoption of an explanation, making counterfactual reasoning less relevant. 2) conditioned on a high simulatability, high counterfactual relevance improves human acceptance of the explanation.

Despite the joint effect of the two metrics on humans, current methods do not jointly maximize simulatability and counterfactual relevance. Since the two metrics can be competing and trade-offs are necessary, we define Pareto efficient explanations and formulate a multi-objective optimization problem to model the trade-offs. Since the target model is a blackbox, we design a depth-first search algorithm that accesses only the zeroth-order information of the model, i.e., the predictions, to identify Pareto efficient subgraph explanations. Explanation search algorithms, such as those based on (mixed) integer programming [Ustun2019, Russell2019] and subgraph enumeration [Yoshida2019kdd], employ similar searches, and yet they perform single-objective optimization. Though less expensive, gradient-based approaches [ying2018graph] are white-box methods and only find node/edge importance, while the generation of connected graphs still requires exhaustive search. Further, we provide an analysis of the lack of robustness of gradient-based GNN explanations. In contrast, we empirically verify the robustness and sensitivity of the optimal explaining subgraphs found by the proposed algorithm. Although we strive for causal explanations, we are cautious and formulate GNN using a Structural Equation Model (SEM) to prove that confounders can exist in a subgraph explanation, so users must be cautioned that the found counterfactuals are not “the” causes of the predictions. Lastly, we extensively verify that the proposed algorithm dominates single-objective baselines in both metrics on 9 datasets.

II Problem Formulation

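Assume that we have a GNN $f$ of $T$ layers trained to predict class distributions of the nodes on a graph $G = (V, E)$, where $V$ is the set of nodes and $E$ is the set of edges connecting the nodes. Let $\mathcal{N}(v_i)$ be the set of neighbors of $v_i$. On layer $t$, $t = 1, \dots, T$, and for any node $v_i \in V$, the GNN computes $h_i^{(t)}$ using messages sent from $v_j \in \mathcal{N}(v_i)$ to $v_i$, by the following operations:

$$m_{ji}^{(t)} = \mathrm{MSG}\big(h_j^{(t-1)}, h_i^{(t-1)}\big), \tag{1}$$
$$M_i^{(t)} = \mathrm{AGG}\big(\{m_{ji}^{(t)} : v_j \in \mathcal{N}(v_i)\}\big), \tag{2}$$
$$h_i^{(t)} = \mathrm{UPDATE}\big(M_i^{(t)}; \theta^{(t)}\big). \tag{3}$$

The MSG function computes the message vector sent from $v_j$ to $v_i$ (e.g., $m_{ji}^{(t)} = h_j^{(t-1)}$). The AGG function aggregates the messages sent from all $v_j \in \mathcal{N}(v_i)$ to $v_i$ and can be the element-wise sum, average, or maximum of the messages. The UPDATE function uses parameter $\theta^{(t)}$ to map $M_i^{(t)}$ to $h_i^{(t)}$. One example is $h_i^{(t)} = \theta^{(t)} M_i^{(t)}$, followed by some non-linear mapping such as ReLU. The input node feature vector $x_i$ for $v_i$ is regarded as $h_i^{(0)}$. The output of the GNN on node $v_i$ is $h_i^{(T)}$, which can be softmaxed to the node class distribution $P_i(G)$ (a vector of class probabilities). The parameters of the GNN, $\theta^{(t)}$, $t = 1, \dots, T$, are trained end-to-end on labeled nodes on $G$. We define an explanation of the prediction $P_i(G)$ to be a subgraph $G_s$ of $G$ that contains the target node $v_i$ [ying2019gnn]. Besides being agnostic to the above details of architecture and parameters, we desire the following properties of the explanations.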
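As a minimal sketch of the MSG, AGG, and UPDATE operations described above, the following pure-Python layer uses the sender's previous representation as the message, an element-wise sum, mean, or max for AGG, and a linear map followed by ReLU for UPDATE. The function name `gnn_layer`, the dictionary-based graph encoding, and the self-loop that lets a node keep its own representation are assumptions made for illustration; they are not details of the target model.

```python
def gnn_layer(h, neighbors, weight, agg="sum"):
    """One message-passing layer (MSG -> AGG -> UPDATE) on a small graph.

    h         : dict node -> list of floats, previous-layer representations
    neighbors : dict node -> list of neighbor nodes
    weight    : list of rows, a (d_out x d_in) matrix used by UPDATE
    agg       : "sum", "mean", or "max"
    """
    out = {}
    for i in h:
        # MSG: the message from j to i is j's previous representation.
        # A self-loop (assumption) lets node i keep its own information.
        msgs = [h[j] for j in neighbors[i]] + [h[i]]
        columns = list(zip(*msgs))  # one tuple per feature dimension
        # AGG: element-wise reduction over all incoming messages.
        if agg == "sum":
            pooled = [sum(col) for col in columns]
        elif agg == "mean":
            pooled = [sum(col) / len(col) for col in columns]
        elif agg == "max":
            pooled = [max(col) for col in columns]
        else:
            raise ValueError(f"unknown aggregator: {agg}")
        # UPDATE: linear map followed by ReLU.
        out[i] = [max(0.0, sum(w * x for w, x in zip(row, pooled)))
                  for row in weight]
    return out


# Toy star graph: node 0 is connected to nodes 1 and 2.
h0 = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
nbrs = {0: [1, 2], 1: [0], 2: [0]}
identity = [[1.0, 0.0], [0.0, 1.0]]
h1 = gnn_layer(h0, nbrs, identity)
```

Stacking several such calls gives a multi-layer model in which the representation of a node depends only on nodes within as many hops as there are layers.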

Simulatability. A comprehensible explanation should be simulatable, defined by the following two aspects. The simplicity of an explanation is related to the limit of human cognitive bandwidth [Miller1956], and sparsity is used as a proxy of simplicity [Du2019, Guidotti2018, ying2019gnn]. We say that the explaining subgraph $G_s$ is $k$-sparse if $G_s$ contains no more than $k$ nodes. Due to the sparsity, $G_s$ does not allow the full computation taken on the full graph $G$, and the faithfulness of $G_s$ measures how well $G_s$ can reproduce the prediction $P_i(G)$ generated on $G$. Similar to [Suermondt1992], we measure faithfulness using the negative symmetric KL-divergence between the prediction $P_i(G)$ on $G$ and the prediction $P_i(G_s)$ on $G_s$ (the larger, the better):

$$F(G_s) = -\Big[\mathrm{KL}\big(P_i(G)\,\|\,P_i(G_s)\big) + \mathrm{KL}\big(P_i(G_s)\,\|\,P_i(G)\big)\Big]. \tag{4}$$
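The faithfulness measure can be sketched in a few lines of Python. This is an illustration under assumptions: the score is taken as the negated sum of the two KL directions (so that larger is better, consistent with the text), a small `eps` guards against zero probabilities, and the names `kl` and `faithfulness` are hypothetical.

```python
import math


def kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def faithfulness(p_full, p_sub):
    """Negative symmetric KL divergence; 0 is the best possible value."""
    return -(kl(p_full, p_sub) + kl(p_sub, p_full))


# Class distributions of the target node on the full graph and on two subgraphs.
p_full = [0.70, 0.20, 0.10]
close = faithfulness(p_full, [0.65, 0.25, 0.10])
far = faithfulness(p_full, [0.10, 0.20, 0.70])
```

A subgraph whose prediction stays close to the full-graph prediction scores near zero, while one that changes the prediction substantially scores strongly negative.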


Counterfactual relevance. Let the above-defined subgraph $G_s$ be a “fact”. A counterfactual $G_s'$ of $G_s$ is a perturbation of $G_s$. We restrict the counterfactual to be a strict subgraph of $G_s$. Let the difference between $G_s$ and $G_s'$ be denoted by $\Delta$, the size of which is represented by $|\Delta|$, so that $G_s = G_s' \cup \Delta$ means adding $\Delta$ to $G_s'$ reconstructs $G_s$. The class distributions of $v_i$ generated by the target GNN model on $G_s$ and $G_s'$ are denoted by $P_i(G_s)$ and $P_i(G_s')$, respectively. We define the counterfactual relevance [Miller2019] of the tuple $(G_s, G_s')$ when explaining $P_i(G)$ as

$$R(G_s, G_s') = \frac{\big|F(G_s) - F(G_s')\big|}{|\Delta|}, \tag{5}$$

where $F(\cdot)$ is the faithfulness measured by Eq. (4).

The difference $F(G_s) - F(G_s')$ can be positive, negative or zero. Because $F$ represents the faithfulness, the absolute difference measures the change in the class distribution of $v_i$ approximated by the fact $G_s$ and the counterfactual $G_s'$. When $R(G_s, G_s')$ is large, the portion $\Delta$ removed from $G_s$ is likely to be the cause of $P_i(G_s)$ [guo2020survey]. The normalizer $|\Delta|$ makes sure that the same difference caused by a small $\Delta$ will be more desirable than that caused by a larger $\Delta$. It also penalizes extreme counterfactuals that remove all nodes except the target $v_i$. These quantities are demonstrated in Fig. 2.
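The normalized change described above can be written as a small helper. This is a sketch assuming that counterfactual relevance is the absolute difference in faithfulness between the fact and the counterfactual, divided by the size of the removed portion; the function name and argument names are hypothetical.

```python
def counterfactual_relevance(f_fact, f_counterfactual, removed_size):
    """Absolute change in faithfulness per removed element.

    f_fact           : faithfulness of the explanation subgraph (the "fact")
    f_counterfactual : faithfulness of its strict subgraph (the counterfactual)
    removed_size     : number of elements removed to obtain the counterfactual
    """
    if removed_size <= 0:
        raise ValueError("a counterfactual must remove at least one element")
    return abs(f_fact - f_counterfactual) / removed_size


# The same change in faithfulness is more relevant when fewer elements are removed.
small_removal = counterfactual_relevance(-0.1, -0.7, removed_size=1)
large_removal = counterfactual_relevance(-0.1, -0.7, removed_size=3)
```

Because of the normalizer, removing almost everything around the target node is penalized even if it changes the prediction a lot.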

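Fig. 2: Two explanation metrics. The figure shows the GNN prediction of the target node on the full graph, on the explanation subgraph, and on the counterfactual subgraph. Faithfulness is measured by Eq. (4): the smaller the symmetric KL-divergence, the more faithful the explanation. The portion removed to obtain the counterfactual is circled by the dashed line. The larger the value of Eq. (5), the higher the counterfactual relevance.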

III How Humans Perceive Explanations

“System 1 operates automatically and quickly …
System 2 allocates attention to the effortful mental activities …”

Daniel Kahneman, Nobel laureate

We conducted a human subject study to find the roles of the two metrics in the human perception of explanations. The two modes of thinking, System 1 and System 2, are extensively studied in psychology, as quoted above. We conjecture that forward simulations help humans quickly screen an explanation using System 1, while reasoning using the counterfactual is a more deliberate process that requires System 2, so that humans will conduct counterfactual reasoning only after the explanation has passed System 1 screening. Simulatability and counterfactual relevance measure how well an explanation and an associated counterfactual are received by the two Systems.

According to Fig. 1, on the Cora dataset, we sample five target nodes, and for each node we generate subgraphs with low and high simulatability. This leads to ten explaining subgraphs for each subject to evaluate the simulatability. For each explaining subgraph, we further generate two counterfactuals that are subgraphs of the explaining subgraph, with different counterfactual relevance.

Fig. 3: Sample explanations in the human study. Each node is a paper on Cora. Left: the large graph containing the target node, whose prediction is to be explained. Predicted distributions over 7 classes are shown in histograms. Right: subgraphs explaining the prediction of the target node, along with the class distributions predicted on the individual subgraphs (top/middle: explanation/counterfactual found by GNN-MOExp; bottom: a counterfactual with a small counterfactual relevance). Counterfactuals are constructed by removing the dashed edges.

Fig. 3 shows one sample test case. For each of the five nodes, a subject will see the original graph on which the GNN produced the prediction to be explained, the explanation subgraph with the prediction computed on it, and two counterfactuals with their two predictions. The full graph is considered to be too complicated for interpretation, while the explanation subgraph is more intelligible. The two counterfactuals allow a subject to evaluate if a removed part is a plausible cause of the prediction. For each graph, we color the nodes based on the GNN’s prediction, so that a subject can relate a prediction to the neighbors. We show the predicted class distributions in histograms, so that the predictions across the (sub)graphs can be compared conveniently. The subjects were not told about the two metrics of the explanations but needed to understand, analyze, and then rate the explanations.

(a) Histograms of responses on a 5-point Likert scale under four conditions, represented in the quadrants as those in Fig. 1. The top right quadrant has the highest acceptance.
(b) Regardless of counterfactual relevance, a high (low) simulatability leads to a higher (lower) acceptance rate. A low simulatability can lead to low acceptance, though it does not prohibit responses of 4-5 points.
(c) Regardless of simulatability, a high counterfactual relevance makes the found cause (the portion removed from an explanation) more convincing to human subjects.
Fig. 4: User study. Fig. 4(a) and 4(b) show that simulatability is necessary but not sufficient for explanation acceptance.

To avoid bias, we framed the survey as an evaluation of a graph-based search engine and recruited subjects with search experience using Google Scholar. The authors of this paper were excluded. The two counterfactuals were randomly ordered. Each subject was further trained on two additional sample cases. During the test phase, we asked subjects the following questions after each test case and collected feedback on a 5-point Likert scale (1-very little (won’t accept), 2-little, 3-not sure, 4-a little, 5-very well):

  a. Simulatability: How well do you think the second subgraph is reproducing the prediction computed in the first graph?

  b. Counterfactual-1: How much do you think the removed component in the third subgraph is an important factor leading to the histogram for the second subgraph, had it not been removed?

  c. Counterfactual-2: Same as above, but with the third subgraph replaced by the fourth subgraph.

  d. Explanation acceptance: How much will you accept the probabilities, if they were computed on the second subgraph rather than the first?

III-A Analysis of human feedback

The questions quantitatively reveal the human perception of the two explanation metrics. The response to question a measures the subject’s perceived simulatability of the explanation. The difference between the responses to questions b and c measures the preference of a subject between the two alternative counterfactuals. The response to question d measures the subject’s overall acceptance of the subgraph as an explanation, based on its simulatability and the plausibility of the causes found using the counterfactuals. After filtering out an obvious outlier (the responses to all questions are the same), we have 10 subjects’ responses to 10 test cases, leading to 100 scores for each of the four questions. We draw the following conclusions based on statistical analyses.

High simulatability helps acceptance that can be boosted by high counterfactual relevance. Using the collected responses, a two-way analysis of variance (ANOVA) shows that the two metrics interact strongly ($p$-value ). Fig. 4(a) and 4(b) confirm that a high simulatability is a prerequisite of explanation acceptance, with high counterfactual relevance being the second condition. A low simulatability leads to more mixed acceptance, regardless of counterfactual relevance. Some explanations with low simulatability are still accepted, because the subjects’ in-depth analysis of those cases leads to a final acceptance.

Simulatability can predict acceptance of explanations. We conducted a $t$-test on the responses from two groups: one has cases with low simulatability and the other has cases with high simulatability. The $p$-value is almost zero, indicating that the degree of acceptance differs significantly between the groups. The $t$-statistic is . After taking into account the within-group variances and the sample size, we conclude that the acceptance of a less simulatable explanation is lower than that of a more simulatable explanation. Fig. 4(b) further confirms this conclusion.
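To illustrate the kind of two-group comparison described above, the sketch below computes a two-sample $t$-statistic with the standard library. Two things are assumptions: the Likert scores are made-up illustrative numbers, not the study's responses, and Welch's unequal-variance form is used because the text does not say which variant of the $t$-test was applied.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical acceptance scores (1-5) for the two groups of test cases.
low_simulatability = [1, 2, 2, 3, 1, 2, 4, 2]
high_simulatability = [4, 5, 4, 3, 5, 4, 5, 4]

t_stat = welch_t(low_simulatability, high_simulatability)
```

A negative statistic with a large magnitude indicates that the first group's mean acceptance is lower than the second's relative to the within-group variability.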

A higher counterfactual relevance makes a reason more likely to be perceived as “the cause”. While there can be several factors that jointly lead to the GNN prediction, humans tend to accept the one with high counterfactual relevance as “the cause”, compared to those with low counterfactual relevance. We conducted a $t$-test between the responses to the two counterfactual questions. The tests show that a higher counterfactual relevance is more convincing (all $p$-value ), regardless of simulatability (see Fig. 4(c)). However, when simulatability is low, the presented “cause” is less convincing (see the bottom two subfigures of Fig. 4(a)). Caution: “the cause” presented by a counterfactual may not be the only or the true cause of the prediction, due to confounders. See Section IV-B.

IV Multi-objective explanations of GNN

Fig. 5: The workflow of finding Pareto optimal GNN explanations with high simulatability and counterfactual relevance. The subgraphs are enumerated by DFS and the two objectives are computed on the enumerated explanations and their counterfactuals for each node to be explained. Pareto optimal explanations that are high (but not necessarily the highest) in both metrics are selected.

Given the human study results, we aim to solve the following multi-objective optimization problem.

Definition 1.

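Given a graph $G$ and a GNN model $f$, on any target node $v_i$, extract an explanation subgraph $G_s$ and a counterfactual subgraph $G_s'$, where $G_s' \subset G_s$, $G_s$ contains no more than $k$ nodes and is acyclic, so that the faithfulness $F(G_s)$ of Eq. (4) and the counterfactual relevance $R(G_s, G_s')$ of Eq. (5) are maximized:

$$\max_{G_s,\, G_s'} \; \big(F(G_s),\; R(G_s, G_s')\big) \quad \text{s.t.} \quad v_i \in G_s' \subset G_s \subseteq G, \;\; |G_s| \le k, \;\; G_s \text{ is acyclic}. \tag{6}$$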


For simplicity of the explanation, we restrict the explanation subgraph $G_s$ to contain no more than $k$ nodes [Miller1956]. The limit to $k$ nodes also reduces the degree, coreness, and centrality of any node in $G_s$, and improves human reaction time when reasoning with $G_s$ [Lynn29407]. We restrict the explanations to be acyclic graphs [Vu2020PGMExplainerPG], since a cycle can lead to self-proof and explanations such as “Alice is a database researcher because she cited a paper of Bob, who is a database researcher since he cited Alice’s paper”.

The optimization is bi-objective: the objective vector function has two scalar objectives. We don’t use a single scalar objective function, such as a weighted combination of the two, not only because the weights can be hard to specify, but also because trading one objective for the other is not desirable according to the human subject study (either low simulatability or low counterfactual relevance suppresses human acceptance of the explanation and the counterfactual). Beyond being multi-objective, the solution space of all possible explanation-counterfactual pairs, defined by the constraints in the above optimization problem, is exponentially large and discrete, and no polynomial-time algorithm is known to search the space. The gradient-based methods in [ying2019gnn, pope2019explainability] and the search-based methods in [Russell2019, Ustun2019, Mothilal2020fat, yuan2020xgnn, Vu2020PGMExplainerPG] can only maximize one of the objective functions and do not guarantee Pareto optimality, i.e., an efficient trade-off between objectives. We follow the search-based explanation generation paradigm, but aim at finding the Pareto front and selecting one particular Pareto efficient explanation with well-balanced objectives.

IV-A Search for Pareto optimal explanations

The algorithm, GNN-MOExp (Graph Neural Network Multi-Objective Explanations), is shown in Fig. 5. We first apply a depth-first search (DFS) to explore the space of subgraphs for the target node $v_i$. Since the prediction on $v_i$ made by a $T$-layer GNN does not depend on nodes that are more than $T$ hops away from $v_i$, the search is restricted to the dependent neighbors. A canonical ordering of the edges is determined by a breadth-first search (BFS) before running the DFS, ensuring that no subgraph will be enumerated more than once. The BFS also canonically numbers the nodes to avoid isomorphism tests during graph lookup: the same graph will be represented by a unique array of edges with canonical node numbering. Starting from the subgraph containing only $v_i$, the DFS expands the subgraph by adding an unvisited edge adjacent to the current subgraph. The constraints in Eq. (6) are used in pruning the search space. After all valid candidate subgraphs containing the edge have been explored, the edge is flagged and will not be visited in the future. The enumeration is completed when all edges within the neighborhood are processed.
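The enumeration described above can be sketched as a recursive DFS that grows a subgraph one adjacent edge at a time and flags each edge after its branch has been explored, so that every subgraph is produced once. This is an illustrative sketch, not the authors' implementation: the name `enumerate_tree_subgraphs` is hypothetical, sorted edge tuples stand in for the BFS-based canonical ordering, and the adjacency dictionary is assumed to be already restricted to the relevant neighborhood of the target.

```python
def enumerate_tree_subgraphs(adj, target, k):
    """Enumerate connected acyclic subgraphs that contain `target`
    and have at most `k` nodes. Returns a list of (nodes, edges) pairs."""
    results = []

    def dfs(nodes, edges, flagged):
        results.append((frozenset(nodes), frozenset(edges)))
        if len(nodes) == k:  # sparsity constraint: stop expanding
            return
        # Edges adjacent to the current subgraph, in a fixed (sorted) order.
        frontier = {(min(u, v), max(u, v)) for u in nodes for v in adj[u]}
        candidates = sorted(frontier - edges - flagged)
        flagged = set(flagged)  # local copy, so siblings do not interfere
        for edge in candidates:
            u, v = edge
            if u in nodes and v in nodes:
                continue  # adding this edge would close a cycle
            dfs(nodes | {u, v}, edges | {edge}, flagged)
            # All subgraphs containing this edge are explored: flag it.
            flagged.add(edge)

    dfs({target}, set(), set())
    return results


# Triangle graph: six trees contain node 0 when up to three nodes are allowed.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
subgraphs = enumerate_tree_subgraphs(triangle, target=0, k=3)
```

Each enumerated subgraph would then be scored by running the blackbox model on it, which only requires the model's predictions.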

The GNN model has to be run on each enumerated subgraph, and the two metrics, faithfulness and counterfactual relevance, are computed by Eq. (4) and Eq. (5). Since each subgraph contains only a small, bounded number of nodes, the cost is low. To avoid repeatedly calculating the faithfulness when calculating the counterfactual relevance, a hash table is used to record the faithfulness of each subgraph. Each subgraph becomes a counterfactual of all subgraphs that are its descendants in the DFS search tree.

After evaluating each subgraph and its counterfactuals, we need to find the optimal explanation so that both metrics are high. However, the two metrics can be competing and it is hard to find an explanation that outperforms all others in both metrics. We aim to find Pareto optimal (efficient) explanations, which are optimal in the sense that they cannot be outperformed by another explanation in both metrics [miettinen1998nonlinear]. We need the following definitions.

Definition 2.

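(Pareto dominance) Let $x, x' \in \mathcal{X}$ be two feasible solutions and let $f(x) = (f_1(x), f_2(x))$ be the objective vector. $x$ Pareto dominates $x'$, denoted as $x \succ x'$, if $f_j(x) \ge f_j(x')$ for all $j \in \{1, 2\}$ and $f_j(x) > f_j(x')$ for at least one $j$.

Definition 3.

(Pareto optimality). $x^*$ is Pareto optimal if and only if $\nexists\, x \in \mathcal{X}$ such that $x \succ x^*$.

Definition 4.

(Pareto optima) The set of all Pareto optimal solutions: $\mathcal{P} = \{x \in \mathcal{X} : \nexists\, x' \in \mathcal{X},\ x' \succ x\}$.

Definition 5.

(Pareto optimal front). The set consists of the function values of the Pareto optimal set: $\mathcal{F} = \{f(x) : x \in \mathcal{P}\}$.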
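The definitions above translate directly into a brute-force filter over score pairs. The sketch below assumes both objectives are to be maximized and uses a quadratic-time comparison for clarity; the names `dominates` and `pareto_front` are hypothetical.

```python
def dominates(a, b):
    """True if score vector `a` Pareto dominates `b` (larger is better):
    at least as good in every objective and strictly better in one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# (simulatability, counterfactual relevance) pairs for four candidates.
candidates = [(-0.1, 0.2), (-0.3, 0.5), (-0.4, 0.1), (-0.2, 0.2)]
front = pareto_front(candidates)
```

Here the third and fourth candidates are dominated by the first, so only the first two remain on the front.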

However, explanations on the Pareto front can be low in one objective while being high in another, and are thus not useful. We design a simple method to find Pareto optimal explanations that are: 1) dominating other explanations, and 2) likely simultaneously optimal in individual metrics (without guarantee). In particular, we sort the explanations and their counterfactuals along the simulatability and counterfactual relevance, independently. Let the ranking positions of an explanation in the two rankings be denoted by $r_1$ and $r_2$ (the smaller the better). We define the comprehensive ranking to be


Finally, we select the explanation with the best comprehensive ranking as the final explanation.
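The ranking-based selection can be sketched as follows. The exact form of the comprehensive ranking is not recoverable here, so this sketch assumes it is the sum of the two rank positions, which is one choice under which a dominated candidate can never win. Tied values share a rank position, and the helper names are hypothetical. The quadratic rank computation is for readability only.

```python
def competition_ranks(values):
    """Rank position of each value (0 is best, larger value is better);
    tied values share the same position."""
    return [sum(1 for other in values if other > v) for v in values]


def select_explanation(scores):
    """scores: list of (simulatability, counterfactual_relevance) pairs.
    Returns the index of the candidate with the best combined rank."""
    rank_sim = competition_ranks([s for s, _ in scores])
    rank_rel = competition_ranks([r for _, r in scores])
    # Assumed comprehensive ranking: the sum of the two rank positions.
    combined = [a + b for a, b in zip(rank_sim, rank_rel)]
    return min(range(len(scores)), key=combined.__getitem__)


scores = [(-0.05, 0.10), (-0.10, 0.40), (-0.50, 0.45), (-0.30, 0.20)]
best = select_explanation(scores)
```

In this toy example the first candidate is best in simulatability and the third is best in counterfactual relevance, but the second is selected because it ranks second in both.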

One possible baseline is to use the so-called preference vector to select a Pareto optimal solution that satisfies some weighted balance between the objectives [mahapatra20a]. We found this method hard to use in our case: the two objectives are of different ranges, which vary across different target nodes. In contrast, the ranking-based approach handles the heterogeneity. We did not present this baseline since it significantly underperforms our method. A more competitive baseline is to find the explanation whose rankings in the two objectives are well balanced. We compare our approach with this baseline in the experiments. Since the Pareto front is non-convex and contains dents that have well-balanced but low objective values, the above baseline may not work well.

The explanation chosen by the comprehensive ranking is in the Pareto front, as shown by the following theorem.

Theorem 6.

The ranking-based method finds a solution that’s on the Pareto front.


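Proof. If the selected explanation is not a Pareto optimal solution, then there is another explanation that dominates it. By definition, the dominating explanation must be ranked higher than the selected one in at least one objective, while in the other objective the two are at least equal. According to the definition of comprehensive ranking, the dominating explanation has a better comprehensive ranking and would have been chosen by the explanation selection algorithm, which is a contradiction. ∎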

Complexity of the Algorithm. Regarding the DFS, in the best case, is on one end of a linear chain and the time complexity is . In the worst case, the number of subgraphs of a complete graph with nodes is exponential, and the complexity is . Many real-world graphs are sparse and the complexity is more likely to be polynomial. The depth of GNN is usually limited () due to the over-smoothing effect of aggregation [Li2018DeeperII] and the number of nodes searched depends on the size of the -hop neighborhood of the target node. We show in Fig. 9 that the running time of the subgraph search is practically low.

It seems that one has to find the Pareto front and then use the comprehensive ranking to find the best explanation. To eliminate all dominated solutions, the time complexity is quadratic in the number of enumerated subgraphs. However, Theorem 6 says that the comprehensive ranking already points to a solution on the Pareto front and the overall time complexity is just linear in the number of enumerated subgraphs, using the heap data structure.

Fig. 6: Confounders in GNN. Left: a structural causal model with being the common cause of both and , and is a confounder that makes . Right: at the bottom, explains the prediction on node and node can be removed from as an intervention to obtain a counterfactual explanation . Above is a computation graph that represents the rollout of the structural equations (1)-(3). Arrows are dependencies among the nodes on the computational graph and dashed lines are not relevant to . The variable is a common cause of and and therefore confounds the effect of on through . There are other confounders, and the effect of the intervention on should be adjusted for all confounders.

IV-B Confounders

Confounders are variables that impact both causes and outcomes [Pearl2009]. Fig. 6 shows the concept of a confounder, which leads to the back-door adjustment for a cause $X$, an outcome $Y$, and a confounder $Z$:

$$P(Y \mid do(X = x)) = \sum_{z} P(Y \mid X = x, Z = z)\, P(Z = z),$$

which is in general not the same as $P(Y \mid X = x)$. For the explanation in the figure, the counterfactual explanation is obtained by the intervention of removing a node from the explanation. Humans may think that the removed node is “the cause” of the output. However, this is not true due to confounders, as shown in Fig. 6.
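A small numeric example shows why the back-door adjusted quantity generally differs from the plain conditional probability. The binary variables and all probabilities below are made up for illustration: `Z` is a confounder that influences both the cause `X` and the outcome `Y`.

```python
# Hypothetical model: Z -> X, Z -> Y, and X -> Y, all binary.
p_z = {0: 0.5, 1: 0.5}                      # P(Z = z)
p_x1_given_z = {0: 0.1, 1: 0.9}             # P(X = 1 | Z = z)
p_y1_given_xz = {(0, 0): 0.1, (1, 0): 0.2,  # P(Y = 1 | X = x, Z = z)
                 (0, 1): 0.7, (1, 1): 0.8}


def p_x_given_z(x, z):
    return p_x1_given_z[z] if x == 1 else 1.0 - p_x1_given_z[z]


def p_y1_do_x(x):
    """Back-door adjustment: average over the marginal of Z."""
    return sum(p_y1_given_xz[(x, z)] * p_z[z] for z in p_z)


def p_y1_given_x(x):
    """Observational conditional: Z is weighted by P(Z | X = x)."""
    p_x = sum(p_x_given_z(x, z) * p_z[z] for z in p_z)
    joint = sum(p_y1_given_xz[(x, z)] * p_x_given_z(x, z) * p_z[z] for z in p_z)
    return joint / p_x


interventional = p_y1_do_x(1)
observational = p_y1_given_x(1)
```

In this toy model the interventional probability is 0.5 while the observational one is 0.74, because observing `X = 1` also makes `Z = 1` more likely.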

IV-C Connection to Shapley values

There is a close relationship between counterfactual explanations and Shapley values [shapley1953value, chen2018shapley]. As an explanation, Shapley values are the importance of the factors contributing to the predictions to be explained. One can consider the portion $\Delta$ removed from a subgraph $G_s$ as a contributor, and by averaging $\Delta$’s contributions over all possible $G_s$ that contain $\Delta$ (denoted by $\mathcal{S}(\Delta)$), we obtain the Shapley value of $\Delta$:

$$\phi(\Delta) = \frac{1}{|\mathcal{S}(\Delta)|} \sum_{G_s \in \mathcal{S}(\Delta)} \big[F(G_s) - F(G_s \setminus \Delta)\big],$$

where $F$ is the faithfulness of Eq. (4). The contribution $F(G_s) - F(G_s \setminus \Delta)$ follows the definition of Shapley values and can be positive, negative, or zero. In contrast, counterfactual relevance is always non-negative and gives the magnitude of the importance of $\Delta$.

IV-D Robustness and sanity check of explanations

An accurate explanation of a prediction should vary according to the underlying mechanism that generates the prediction [Adebayo2018], and should remain the same under irrelevant perturbations [Ghorbani2017].

Fig. 7: Manipulate a GNN explanation. Left: original graph. Center: messages and cause the prediction on , while and are irrelevant. Right: and are rotated to perturb a gradient-based explanation, though the prediction of class remains the same.
Definition 7.

The robustness of a subgraph explanation is the degree of the change in the explanation under perturbations that are irrelevant to the mechanism that generates the prediction.

We assume a one-layer GNN ($T = 1$) with parameter $\theta \in \mathbb{R}^{C \times d}$, where $C$ is the total number of classes to be predicted and $d$ is the number of features of the nodes. We use the graph in Fig. 7 (left) to demonstrate the difference in the robustness of explanations found by GNN-MOExp and prior gradient-based methods. Gradient-based methods [pope2019explainability, ying2019gnn, baldassarre2019explainability] find explanations using the gradient of the following faithfulness loss function with respect to a mask over the adjacency matrix:

where is the -th row of . As we are explaining a GNN prediction, is the predicted class and not necessarily the ground truth class of . is the input feature vectors of the neighbor of . The target GNN model will set all entries of to 1 so that all neighbors of are retained. In Fig. 7 center, the neighbors’ features satisfy so that the relevant neighbors to are just and , with representations and , whose sum is closer to than to for any . The gradient of w.r.t. is


The importance of the edge is the magnitude of the above gradient, essentially determined by the correlation between and . In Figure 7 center, since both and are orthogonal to , gradient-based methods will never have and in their explanations. When and are rotated so that is more similar to than while remains, the gradient-based explanation will include , even the prediction remains the same. The rotations are irrelevant to how leads to the prediction . On the other hand, is closer to than to or if only subgraphs with three nodes () are allowed. As a result, GNN-MOExp still finds the same optimal subgraph containing , and , even after the rotations and is thus more robust.

Another aspect is that an explanation should faithfully reflect how a changing is generated; this differs from simulatability, which focuses on explaining a static mechanism that generates a fixed . Formally,

Definition 8.

A sanity check of an explanation of a GNN model’s prediction verifies if changes when the mechanism that generates changes.

A sanity check is a necessary (but not a sufficient) condition for an explanation to be a faithful surrogate of the full model: not passing the sanity check indicates that an explanation is not reflecting the input-output relationship encoded by the GNN. When debugging a GNN model to identify whether the model or the graph data are manipulated or polluted, passing the sanity check means the explanations can reveal the malicious attacks to the model or data. The prior work [Adebayo2018] proposed a sanity check for deep neural networks on images and does not address sanity checks for GNN on graphs. We conduct sanity checks for GNN-MOExp in Section V-C.

V Experiments

Datasets Classes Nodes Edges Edge/Node Features
Cora 7 2,708 10,556 3.90 1,433
Citeseer 6 3,321 9,196 2.78 3,703
PubMed 3 19,717 44,324 2.24 500
Musae-F 4 22,470 342,004 15.22 4,714
Musae-G 2 37,700 578,006 15.33 4,005
Amazon-C 4 13,752 574,418 41.77 767
Amazon-P 6 7,650 287,326 37.56 745
Coauthor-C 13 18,333 327,576 17.87 6,805
Coauthor-P 2 34,493 991,848 28.76 8,415
TABLE II: Nine networks from four application domains.
Datasets Simulatability () Counterfactual Relevance ()
Cora -0.196 -0.252 -0.530 -0.243 -0.213 -0.272 -0.256 -0.108 -0.049 0.240 0.260 0.330 0.243 0.225 0.217 0.615 0.455 0.467
Citeseer -0.051 -0.054 -0.066 -0.050 -0.056 -0.058 -0.068 -0.044 -0.039 0.114 0.116 0.116 0.115 0.113 0.112 0.178 0.156 0.159
PubMed -0.081 -0.110 -0.365 -0.117 -0.086 -0.125 -0.129 -0.041 -0.010 0.112 0.129 0.200 0.117 0.100 0.099 0.330 0.235 0.248
Musae-F -0.972 -1.035 -0.899 -0.872 -0.911 -0.895 -0.346 -1.313 -0.199 0.613 0.653 0.438 0.546 0.576 0.520 0.696 1.260 0.806
Musae-G -0.118 -0.120 -0.693 -0.110 -0.144 -0.220 -0.030 -0.308 -0.005 0.112 0.119 0.527 0.118 0.126 0.126 0.247 0.366 0.213
Amazon-C -0.129 -0.126 -0.350 -0.134 -0.144 -0.175 -0.049 -0.298 -0.031 0.094 0.095 0.258 0.089 0.087 0.061 0.201 0.312 0.215
Amazon-P -0.163 -0.180 -0.458 -0.175 -0.203 -0.231 -0.058 -0.339 -0.034 0.122 0.132 0.315 0.123 0.111 0.090 0.257 0.377 0.277
Coauthor-C -0.216 -0.243 -0.745 -0.264 -0.245 -0.341 -0.097 -0.411 -0.038 0.183 0.205 0.568 0.214 0.184 0.184 0.268 0.457 0.263
Coauthor-P -0.146 -0.144 -0.720 -0.220 -0.159 -0.295 -0.057 -0.314 -0.035 0.133 0.141 0.534 0.149 0.138 0.167 0.208 0.367 0.206
TABLE III: Overall performance (the higher the simulatability and the counterfactual relevance, the better). The runner-up methods and the best method are marked, where the best method is certified by statistically significant t-tests (pairwise t-test at 5% significance level). The worst performances are underlined and the second-worst performances are under wave lines.

V-A Datasets and Baselines

Datasets and experimental settings. We drew real-world datasets from four applications for the node classification task. The dataset details are provided in the supplement.

  • In citation networks (Citeseer, Cora, PubMed) [kipf2017gcn], each paper has bag-of-words features, and the goal is to predict the research area of each paper.

  • We adopt Musae-Facebook (Musae-F) and Musae-Github (Musae-G) [rozemberczki2019multi] from social networks. Nodes represent official Facebook pages (or Github developers), and edges are mutual likes (or followers) between nodes. Node features are extracted from site descriptions (or developer’s location, repositories starred, employer).

  • Amazon-Computer (Amazon-C) and Amazon-Photo (Amazon-P) [shchur2018pitfalls] are segments of the Amazon co-purchase graph, where nodes represent goods, edges indicate that two goods are frequently bought together, and node features are the bag-of-words representation of product reviews.

  • Coauthor-Computer (Coauthor-C) and Coauthor-Physics (Coauthor-P) are co-authorship graphs based on the Microsoft Academic Graph from the KDD Cup 2016. Authors are represented as nodes, which are connected by an edge if they co-authored a paper [shchur2018pitfalls]. Node features represent paper keywords for each author’s papers.

We randomly divide each graph into three portions with a ratio of training : validation : test = 50 : 20 : 30. The GNN is trained on the training set and all explanation methods are evaluated on the test set.
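A random 50 : 20 : 30 split can be sketched as follows. This is a minimal illustration that assumes the split is taken over node indices; the function name `split_nodes` and the fixed seed are hypothetical and are not taken from the paper.

```python
import random

def split_nodes(num_nodes, ratios=(0.5, 0.2, 0.3), seed=0):
    """Randomly partition node ids into train / validation / test lists."""
    ids = list(range(num_nodes))
    random.Random(seed).shuffle(ids)          # seeded for repeatability (assumption)
    n_train = int(ratios[0] * num_nodes)
    n_val = int(ratios[1] * num_nodes)
    # The test portion takes the remainder so that every node is assigned.
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

# Example with the number of nodes reported for Cora in Table II.
train_ids, val_ids, test_ids = split_nodes(2708)
print(len(train_ids), len(val_ids), len(test_ids))
```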

Baselines. We adopt the following baselines that generate subgraph explanations. Except for the Shapley baseline, all baselines compute the weights of edges in the neighborhood of the target node . The explanation is generated by iteratively adding edges adjacent to the current subgraph until nodes are included in , with higher-weight edges considered first. The counterfactuals of the baselines are generated in the same way as for GNN-MOExp, by trying different enumerated subgraphs. and of for each baseline are calculated by Eq. (4) and Eq. (5). The baselines are as follows:

  • Random (RND) assigns random weights to edges.

  • Embedding (EMB) uses DeepWalk [Bryan2014deepwalk] to embed the nodes, and the weight of an edge is calculated based on the cosine similarity between the embeddings of two nodes.

  • Gradient (Grad) [baldassarre2019explainability] uses the magnitudes of gradients of GNN output w.r.t. edges to find salient subgraphs.

  • GAT [velivckovic2018gat] learns attention weights over neighbors of any node for message aggregation to predict the output of the GNN on , and the attention weights on the edges are extracted as edge weights.

  • GNNExplainer (GNNExp) [ying2019gnn] learns to mask edges so that the masked graph maximally preserves the predictions of , and the mask matrix provides the edge weights.

  • PGExplainer (PGExp) [luo2020parameterized] trains a deep neural network to parameterize the generation of explanations. The subgraphs generated by the explainer are evaluated.

  • Shapley picks and , defined in Eq. (9), with the highest counterfactual relevance, and uses the selected and to generate the counterfactual .

  • GNN-MOExp-b (MOEB) is similar to GNN-MOExp, while the strategy is to select explanations that are most balanced in both metrics.
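The shared step by which the edge-weight baselines turn weights into an explaining subgraph (grow from the target node, preferring higher-weight adjacent edges, until the node budget is reached) can be sketched as below. This is a minimal illustration under assumptions: the function name `grow_explanation` is hypothetical, edges are undirected pairs, and only edges that bring in a new node are considered at each step.

```python
def grow_explanation(target, edge_weights, max_nodes):
    """Greedily grow a connected subgraph around `target`.

    edge_weights: dict mapping an undirected edge (u, v) to its weight.
    Returns the selected node set and the list of selected edges.
    """
    nodes, edges = {target}, []
    while len(nodes) < max_nodes:
        # Frontier: edges with exactly one endpoint in the current subgraph.
        frontier = [(w, u, v) for (u, v), w in edge_weights.items()
                    if (u in nodes) != (v in nodes)]
        if not frontier:
            break  # the neighborhood is exhausted before the budget is reached
        _, u, v = max(frontier, key=lambda t: t[0])  # highest weight first
        edges.append((u, v))
        nodes.update((u, v))
    return nodes, edges

# Toy edge weights (hypothetical), target node 0, budget of three nodes.
weights = {(0, 1): 0.9, (0, 2): 0.2, (1, 3): 0.8, (2, 4): 0.7, (3, 4): 0.1}
sub_nodes, sub_edges = grow_explanation(0, weights, max_nodes=3)
print(sub_nodes, sub_edges)
```

Because each step only adds an edge touching the current subgraph, the result is always connected and contains the target node.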

V-B Quantitative Results

Fig. 8: Parameter sensitivity to maximum subgraph complexity and maximum search distance on Citeseer.

Average simulatability and counterfactual relevance across all test nodes are reported in Table III. We conclude that:

  • Gradient does not perform badly in counterfactual relevance (best on three datasets and second place on two datasets), but it is the worst or the second-worst in simulatability on all datasets except Musae-F. This is because the gradients indicate the most effective perturbations of the edges to change a prediction, whereas these edges do not constitute a graph that maximally preserves the GNN prediction. Based on the human study, Grad should be excluded first.

  • GAT, GNNExplainer, and PGExp are outperformed by GNN-MOExp in both metrics on all datasets. Clearly, these baselines do not explicitly optimize both objectives.

  • MOEB has the worst or second-worst simulatability on the latter 6 datasets, though it is the runner-up on the first three. Based on the human study, MOEB is not guaranteed to generate explanations that will likely be accepted.

  • Shapley has the best counterfactual relevance on the first three datasets, with GNN-MOExp as the runner-up. On the remaining 6 datasets, GNN-MOExp outperforms or is close to Shapley. On simulatability, GNN-MOExp outperforms Shapley on all datasets.

  • GNN-MOExp is the best in simulatability among all methods on all datasets, and frequently outperforms or is competitive with the feasible runners-up (Grad and MOEB are not feasible due to their low simulatability).

Fig. 9: Average running time per node for varying maximum search distance and maximum subgraph complexity on Cora and Citeseer.

Parameters Sensitivity. We search subgraphs of nodes involving vertices that are hops away from the target node (by default , the depth of the target GNN). The sensitivity analyses of these parameters are shown in Fig. 8. We can see that the performance of simulatability becomes better as the parameters increase, while the performance of counterfactual relevance becomes lower. We let since large explaining subgraphs go against explanation simplicity and simulatability. Since is usually small ( in our experiments) to avoid over-smoothing [Li2018DeeperII], we can see the performance level off when .
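To make the search space concrete, the sketch below enumerates all connected, acyclic subgraphs (subtrees) that contain the target node, use at most a given number of nodes, and stay within a given hop distance. It is a minimal illustration under assumptions: the names `enumerate_subtrees`, `max_nodes`, and `max_hops` are hypothetical, and the paper's own enumeration may differ in its details.

```python
from collections import deque

def hop_distances(adj, source):
    """Breadth-first hop distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def enumerate_subtrees(adj, target, max_nodes, max_hops):
    """Return all (node set, edge set) subtrees containing `target`."""
    dist = hop_distances(adj, target)
    allowed = {v for v, d in dist.items() if d <= max_hops}
    start = (frozenset([target]), frozenset())
    seen, stack = {start}, [start]
    while stack:
        nodes, edges = stack.pop()
        if len(nodes) == max_nodes:
            continue  # node budget reached, do not expand further
        for u in nodes:
            for v in adj[u]:
                # Attaching a new node by a single edge keeps the subgraph acyclic.
                if v in allowed and v not in nodes:
                    nxt = (nodes | {v}, edges | {frozenset((u, v))})
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return seen

# Toy graph (hypothetical): a triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
subtrees = enumerate_subtrees(adj, target=0, max_nodes=3, max_hops=2)
print(len(subtrees))
```

The number of such subtrees grows quickly with both parameters, which is consistent with the running-time trend discussed for Fig. 9.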

One bottleneck of applying GNN-MOExp to real-world graphs is its running time [Rudin2019GloballyConsistentRS]. In Fig. 9 we can see that the running time increases as the search space grows with and . However, on average, enumerating and evaluating all acyclic and connected subgraphs of a target node on Cora and Citeseer, which contain some nodes with very high degrees, takes no more than 3 seconds on a commodity computer. With an incremental implementation, a newly added edge only leads to enumerating new subgraphs containing the new edge. Given the reasonable running time, the capability of guaranteeing Pareto optimality and simultaneous high simulatability and counterfactual relevance is a unique advantage that gradient-based methods do not have. Explaining GNN with a quality guarantee is a must-have when GNN is used in user-centric applications, such as graph-based recommendation systems [Ying2018].
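Once every enumerated subgraph has been scored on both objectives, Pareto optimality can be checked by a direct dominance test. The sketch below is a minimal illustration; the function names and the toy scores are hypothetical and do not come from Table III. Both objectives are treated as "higher is better".

```python
def dominates(a, b):
    """True if score tuple `a` is at least as good as `b` everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(scores):
    """scores: dict name -> (simulatability, counterfactual relevance)."""
    return {name for name, s in scores.items()
            if not any(dominates(t, s) for other, t in scores.items() if other != name)}

# Toy candidate subgraphs with made-up scores.
candidates = {
    "S1": (-0.05, 0.40),   # best simulatability
    "S2": (-0.10, 0.60),   # best counterfactual relevance
    "S3": (-0.20, 0.50),   # dominated by S2
    "S4": (-0.30, 0.30),   # dominated by S1, S2 and S3
}
front = pareto_front(candidates)
print(sorted(front))
```

This quadratic check is adequate for the small candidate sets produced by enumerating subgraphs around a single target node.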

V-C Robustness and Sanity check

We design two ways to perturb GNN predictions. We can link an existing vertex to the target node and add a message to Eq. (2) at the last layer of GNN:


where is the perturbed activation. We measure the strength of the perturbation caused by using


where is the predicted class of before the perturbing edge is added. Second, we randomize the GNN parameters of layer , which is the last layer of GNN. We measure the perturbation strength using Euclidean distance between the original parameters and the perturbed parameters


Given a perturbation, we need to measure the change in the explaining subgraph of . Let the explaining subgraphs after the perturbation be denoted by . We measure the average distance between two explaining subgraphs , where is the Jaccard distance between two vertex sets.
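The change measure can be sketched as the Jaccard distance between the vertex sets of the explaining subgraphs before and after a perturbation, averaged over target nodes. This is a minimal illustration; the function names and the dictionary layout (target node mapped to vertex set) are assumptions.

```python
def jaccard_distance(a, b):
    """1 - |a ∩ b| / |a ∪ b| for two vertex sets (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def average_explanation_change(before, after):
    """Average Jaccard distance over target nodes present in both dicts."""
    common = before.keys() & after.keys()
    return sum(jaccard_distance(before[v], after[v]) for v in common) / len(common)

# Toy explanations (hypothetical) for two target nodes, 0 and 5.
before = {0: {0, 1, 2}, 5: {5, 6, 7}}
after = {0: {0, 1, 3}, 5: {5, 6, 7}}
avg_change = average_explanation_change(before, after)
print(avg_change)
```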

Fig. 10: Robustness and sanity check. Left: Jaccard distance changes due to perturbed messages. Right: Jaccard distance changes due to perturbed GNN parameters.

From Fig. 10, we can observe that the subgraph explanations found by our method pass the sanity check. We have the following observations. i) There is no change in the predicted class by the target GNN when the perturbing message is aligned with (high cosine similarity) or the perturbing distance is small, and predictions start to change when the perturbations are sufficiently strong. ii) The Jaccard distance between two optimal explaining subgraphs becomes larger as the predicted class changes, as demonstrated by the red curves lying above the blue curve. iii) Interestingly, on the left, even when there is no change in the predicted class, first increases as cosine similarity decreases to 0 ( is orthogonal to ), and then decreases again when further decreases to negative values ( is in the opposite direction of ). We conjecture that the edge () is added to the explaining graph in the former situation, while some message cancels out the opposite in the latter case (though there may not always be such a canceling message). The explanations are more robust to perturbing as the remains low if predictions remain the same (right figure). The explanations are more sensitive to perturbing incoming messages (left figure). In such cases, on average fewer than two edges are perturbed in the explaining subgraphs.

V-D Reproducibility checklist

We adopt a Graph Convolutional Network (GCN) model [kipf2017gcn] as the explained target model, with two hidden layers (), each with 16 neurons. The dimension of the input layer is the number of input features of the nodes, and the dimension of the output layer is the number of classes. We adopt the cross-entropy loss function and the Adam optimizer for training the GNN, with the learning rate set to 0.01. We set the maximal training iterations to 500, and apply the early-stop strategy when training.
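For orientation, the forward pass of such a target model can be sketched in NumPy. This is a minimal sketch under assumptions: it uses the symmetrically normalized adjacency with self-loops from [kipf2017gcn], and it reads "two hidden layers" as two 16-unit graph convolutions followed by an output convolution, which is only one possible reading. Training (cross-entropy, Adam, early stopping) is omitted, and the weights and graph here are random placeholders.

```python
import numpy as np

def normalize_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2}, where D is the degree matrix of A + I."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(A, X, weights):
    """ReLU graph convolutions for all but the last weight matrix, then softmax."""
    A_hat = normalize_adjacency(A)
    H = X
    for W in weights[:-1]:
        H = np.maximum(A_hat @ H @ W, 0.0)
    logits = A_hat @ H @ weights[-1]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)

# Toy path graph with random features and untrained weights (all hypothetical).
rng = np.random.default_rng(0)
num_nodes, num_features, num_classes, hidden = 5, 8, 3, 16
A = np.zeros((num_nodes, num_nodes))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[u, v] = A[v, u] = 1.0
X = rng.normal(size=(num_nodes, num_features))
shapes = [(num_features, hidden), (hidden, hidden), (hidden, num_classes)]
weights = [rng.normal(scale=0.1, size=s) for s in shapes]
probs = gcn_forward(A, X, weights)
print(probs.shape)
```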

As for the proposed GNN-MOExp, there are two hyper-parameters. We set the maximum search distance , which is equal to the depth of GCN, and we set the maximum subgraph complexity , considering both the effectiveness and the explanation simplicity.

VI Related Work

Explainable ML. The simulatability and counterfactual relevance are two major metrics for evaluating explanations, but their interactions and how humans perceive them are not clear. [Lundberg2017UnifiedApproach] and [Shrikumar2017DEEPLIFT] provide a prediction explanation framework based on Shapley values which encompasses LIME as a special case. Two algorithms with linear complexity for feature importance scoring are developed in [chen2018shapley]. [Ghorbani2019DATAshapley] and [Ancona2019DNNshapley] approximate Shapley values for deep networks via sampling. The methods proposed in [darwiche2003differential, chan2005sensitivity] use gradients to find salient subgraphs to explain the inference on PGM, but not for GNNs [kipf2017gcn, hamilton2017graphsage, velivckovic2018gat]. [ying2019gnn] explains arbitrary graph neural networks using a simplified model. [baldassarre2019explainability] studies the influence of the change of inputs on outputs of GNN models with gradient-based and decomposition-based methods. Stochastic searches for explaining subgraphs have been proposed [yuan2020xgnn, Vu2020PGMExplainerPG, yuan2021explainability] using reinforcement learning and hill-climbing. In [yuan2021explainability], Monte Carlo search is used for exploration.

Causal Inference and Counterfactual Reasoning. [guo2020survey] introduces both traditional and advanced methods in learning causal effect and causal relations. [guo2020learning] discovers unknown confounders from observed data by learning representations of confounders using GNN. In contrast, we identify confounders on the computational graph of GNN.

Robustness and sensitivity. Explanation robustness and sensitivity are two desired properties and have been mostly studied on images [Adebayo2018, Ghorbani2017, Zhang2018, Yeh2019, Pruthi2019] and texts [Pruthi2019], but none on graphs. The differential geometry formulation of manipulability of gradient-based explanations in [Adebayo2018] assumes that the input is a vector (image) that lies on a low-dimensional manifold. For GNN, a decision of a node depends not only on its feature vectors, but also on the messages from neighboring nodes. On graphs, the only relevant study is [wiltschko2020], and the proposed method differs from [wiltschko2020] in explanation generation (subgraph search vs. gradient-based) and evaluation metrics (output explanation changes vs. attribution accuracy changes).

VII Conclusion and future work

We proposed to find multi-objective explanations for Graph Neural Networks, with two objectives, simulatability and counterfactual relevance, to be satisfied. The human study showed that the two explanation objectives can represent the perceived quality of explanations based on two different cognitive processes (quick screening vs. effortful deliberation), and they jointly influence and predict explanation acceptance by humans. We proposed to maximize the two objectives by subgraph enumeration and ranking-based optimization to produce Pareto optimal explanations that fulfill both objectives. We showed that gradient-based GNN explanations are not robust against the rotation of incoming messages to the target nodes, while GNN-MOExp can reliably output quality explanations. Extensive experiments on 9 graph datasets from 4 applications demonstrated superior performance in simulatability, counterfactual relevance, robustness, and sensitivity.


Chao and Sihong were supported in part by the National Science Foundation under Grants NSF IIS-1909879, NSF CNS-1931042, and NSF IIS-2008155. Any opinions, findings, conclusions, or recommendations expressed in this document are those of the author(s) and should not be interpreted as the views of any U.S. Government. Yifei, Yazheng, and Xi were supported by Natural Science Foundation of China (No.61976026) and 111 Project (B18008).