Certifiable Robustness and Robust Training for Graph Convolutional Networks

06/28/2019 ∙ by Daniel Zügner, et al.

Recent works show that Graph Neural Networks (GNNs) are highly non-robust with respect to adversarial attacks on both the graph structure and the node attributes, making their outcomes unreliable. We propose the first method for certifiable (non-)robustness of graph convolutional networks with respect to perturbations of the node attributes. We consider the case of binary node attributes (e.g. bag-of-words) and perturbations that are L_0-bounded. If a node has been certified with our method, it is guaranteed to be robust under any possible perturbation given the attack model. Likewise, we can certify non-robustness. Finally, we propose a robust semi-supervised training procedure that treats the labeled and unlabeled nodes jointly. As shown in our experimental evaluation, our method significantly improves the robustness of the GNN with only minimal effect on the predictive accuracy.




1. Introduction

Graph data is the core for many high impact applications ranging from the analysis of social networks, over gene interaction networks, to interlinked document collections. One of the most frequently applied tasks on graph data is node classification: given a single large (attributed) graph and the class labels of a few nodes, the goal is to predict the labels of the remaining nodes. Applications include the classification of proteins in interaction graphs (Hamilton et al., 2017), prediction of customer types in e-commerce networks (Eswaran et al., 2017), or the assignment of scientific papers from a citation network into topics (Kipf and Welling, 2017). While there exist many classical approaches to node classification (London and Getoor, 2014; Chapelle et al., 2006), recently graph neural networks (GNNs), also called graph convolutional networks, have gained much attention and improved the state of the art in node classification (Kipf and Welling, 2017; Defferrard et al., 2016; Gilmer et al., 2017; Klicpera et al., 2019).

However, there is one big catch: Recently it has been shown that such approaches are vulnerable to adversarial attacks (Zügner et al., 2018; Dai et al., 2018; Zügner and Günnemann, 2019): Even only slight deliberate perturbations of the nodes’ features or the graph structure can lead to completely wrong predictions. Such negative results significantly hinder the applicability of these models. The results become unreliable and such problems open the door for attackers that can exploit these vulnerabilities.

So far, no effective mechanisms are available that (i) prevent small changes to the data from leading to completely different predictions in a GNN, or (ii) verify whether a given GNN is robust w.r.t. a specific perturbation model. This is critical, since especially in domains where graph-based learning is used (e.g. the Web) adversaries are omnipresent, e.g., manipulating online reviews and product websites (Hooi et al., 2016). One of the core challenges is that in a GNN a node’s prediction is also affected when perturbing other nodes in the graph, which makes the space of possible perturbations large. How can we make sure that small changes to the input data do not have a dramatic effect on a GNN?

In this work, we shed light on this problem by proposing the first method for provable robustness of GNNs. More precisely, we focus on graph convolutional networks and potential perturbations of the node attributes, where we provide:

  • Certificates: Given a trained GNN, we can give robustness certificates that state that a node is robust w.r.t. a certain space of perturbations. If the certificate holds, it is guaranteed that no perturbation (in the considered space) exists which will change the node’s prediction. Furthermore, we also provide non-robustness certificates that, when they hold, state that a node is not robust; this is realized by providing an adversarial example.

  • Robust Training: We propose a learning principle that improves the robustness of the GNN (i.e. making it less sensitive to perturbations) while still ensuring high accuracy for node classification. Specifically, we exploit the semi-supervised nature of the GNN learning task, thus, taking also the unlabeled nodes into account.

In contrast to existing works on provable robustness for classical neural networks/robust training (e.g. (Wong and Kolter, 2018; Raghunathan et al., 2018; Hein and Andriushchenko, 2017)), we tackle various additional challenges: Being the first work for graphs, we have to deal with perturbations of multiple instances simultaneously. For this, we introduce a novel space of perturbations where the perturbation budget is constrained locally and globally. Moreover, since the considered data domains are often discrete/binary attributes, we tackle challenging $L_0$ constraints on the perturbations. Lastly, we exploit a crucial aspect of semi-supervised learning by taking also the unlabeled nodes into account for robust training.

The key idea we will exploit in our work is to estimate the worst-case change in the predictions obtained by the GNN under the space of perturbations. If the worst possible change is small, the GNN is robust. Since, however, this worst-case cannot be computed efficiently, we provide bounds on this value, providing conservative estimates. More technically, we derive relaxations of the GNN and the perturbations space, enabling efficient computation.

Besides the two core technical contributions mentioned above, we further perform extensive experiments:

  • Experiments: We show on various graph datasets that GNNs trained in the traditional way are not robust, i.e. only few of the nodes can be certified to be robust, and many are certifiably non-robust even with small perturbation budgets. In contrast, using our robust training we can dramatically improve robustness, in some cases increasing it by a factor of four.

Overall, our method significantly improves the reliability of GNNs, which is highly beneficial when, e.g., using them in real production systems or scientific applications.

2. Related Work

The sensitivity of machine learning models w.r.t. adversarial perturbations has been studied extensively (Goodfellow et al., 2015). Only recently, however, researchers have started to investigate adversarial attacks on graph neural networks (Zügner et al., 2018; Dai et al., 2018; Zügner and Günnemann, 2019) and node embeddings (Bojchevski and Günnemann, 2019). All of these works focus on generating adversarial examples. In contrast, we provide the first work to certify and improve the robustness of GNNs. As shown in (Zügner et al., 2018), both perturbations to the node attributes as well as the graph structure are harmful. In this work, we focus on perturbations of the node attributes and we leave structure perturbations for future work.

For ‘classical’ neural networks various heuristic approaches have been proposed to improve the robustness to adversarial examples (Papernot et al., 2016). However, such heuristics are often broken by new attack methods, leading to an arms race. As an alternative, recent works have considered certifiable robustness (Wong and Kolter, 2018; Raghunathan et al., 2018; Hein and Andriushchenko, 2017; Croce et al., 2018) providing guarantees that no perturbation w.r.t. a specific perturbation space will change an instance’s prediction.

For this work, the class of methods based on convex relaxations is of particular relevance (Wong and Kolter, 2018; Raghunathan et al., 2018). They construct a convex relaxation for computing a lower bound on the worst-case margin achievable over all possible perturbations. This bound serves as a certificate of robustness. Solving such convex optimization problems can often be done efficiently, and exploiting duality even makes it possible to train a robust model (Wong and Kolter, 2018). As already mentioned, our work differs significantly from the existing methods since (i) it considers the novel GNN domain with its relational dependencies, (ii) it handles a discrete/binary data domain, while existing works have only handled continuous data, which also leads to very different constraints on the perturbations, and (iii) we propose a novel robust training procedure which specifically exploits the semi-supervised learning setting of GNNs, i.e. using the unlabeled nodes as well.

3. Preliminaries

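We consider the task of (semi-supervised) node classification in a single large graph having binary node features. Let $G = (A, X)$ be an attributed graph, where $A \in \{0,1\}^{N \times N}$ is the adjacency matrix and $X \in \{0,1\}^{N \times D}$ represents the nodes’ features. W.l.o.g. we assume the node-ids to be $\mathcal{V} = \{1, \dots, N\}$. Given a subset $\mathcal{V}_L \subseteq \mathcal{V}$ of labeled nodes, with class labels from $\mathcal{K} = \{1, \dots, K\}$, the goal of node classification is to learn a function $f: \mathcal{V} \to \mathcal{K}$ which maps each node $v \in \mathcal{V}$ to one class in $\mathcal{K}$. In this work, we focus on node classification employing graph neural networks. In particular, we consider graph convolutional networks where the latent representations $H^{(l)}$ at layer $l$ are of the form

$$H^{(l)} = \sigma^{(l)}\left(\hat{A}^{(l-1)} H^{(l-1)} W^{(l-1)} + b^{(l-1)}\right) \quad \text{for } l = 2, \dots, L, \tag{1}$$

where $H^{(1)} = X$ and with activation functions given by $\sigma^{(L)}(\cdot) = \mathrm{softmax}(\cdot)$ and $\sigma^{(l)}(\cdot) = \mathrm{ReLU}(\cdot)$ for $l = 2, \dots, L-1$.

The output $H^{(L)}_{vc}$ denotes the probability of assigning node $v$ to class $c$. The $\hat{A}^{(l)}$ are the message passing matrices that define how the activations are propagated in the network. In GCN (Kipf and Welling, 2017), for example, $\hat{A}^{(1)} = \dots = \hat{A}^{(L-1)} = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$, where $\tilde{A} = A + I_{N \times N}$ and $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$. The $W^{(\cdot)}$ and $b^{(\cdot)}$ are the trainable weights of the graph neural network, usually simply learned by minimizing the cross-entropy loss on the given labeled training nodes $\mathcal{V}_L$.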

Notations: We denote with $\mathcal{N}_l(t)$ the $l$-hop neighborhood of a node $t$, i.e. all nodes which are reachable with $l$ hops (or less) from node $t$, including the node $t$ itself. Given a matrix $X$, we denote its positive part with $[X]_+ = \max(X, 0)$ where the max is applied entry-wise. Similarly, the negative part is $[X]_- = -\min(X, 0)$; both parts consist of non-negative numbers. All matrix norms $\|X\|_p$ used in the paper are meant to be entry-wise, i.e. flattening $X$ to a vector and applying the corresponding vector norm. We denote with $h^{(l)}$ the dimensionality of the latent space in layer $l$, i.e. $H^{(l)} \in \mathbb{R}^{N \times h^{(l)}}$. $X_{i:}$ denotes the $i$-th row of a matrix $X$ and $X_{:j}$ its $j$-th column.
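The forward pass of Eq. (1) with the GCN message passing matrix can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `normalized_adjacency` and `gcn_forward`, the toy path graph, and the random weights are all assumptions made for the example.

```python
import numpy as np

def normalized_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric 0/1 adjacency matrix."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(A_hat, X, weights, biases):
    """Evaluate Eq. (1): ReLU in the hidden layers, softmax in the last one."""
    H = X.astype(float)
    for i, (W, b) in enumerate(zip(weights, biases)):
        H_hat = A_hat @ H @ W + b
        if i < len(weights) - 1:
            H = np.maximum(H_hat, 0.0)
        else:
            # numerically stable row-wise softmax
            Z = np.exp(H_hat - H_hat.max(axis=1, keepdims=True))
            H = Z / Z.sum(axis=1, keepdims=True)
    return H

# Toy example (hypothetical data): a path graph with 4 nodes, 5 binary features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
X = rng.integers(0, 2, size=(4, 5))
weights = [rng.normal(size=(5, 3)), rng.normal(size=(3, 2))]
biases = [np.zeros(3), np.zeros(2)]
A_hat = normalized_adjacency(A)
probs = gcn_forward(A_hat, X, weights, biases)
```

Each row of `probs` holds the class probabilities of one node; with two weight matrices this corresponds to $L = 3$ in the paper's layer numbering.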

4. Certifying Robustness for Graph Convolutional Networks

Our first goal is to derive an efficient principle for robustness certificates. That is, given an already trained GNN and a specific node $t$ under consideration (called target node), our goal is to provide a certificate which guarantees that the prediction made for node $t$ will not change even if the data gets perturbed (given a specific perturbation budget). That is, if the certificate is provided, the prediction for this node is robust under any admissible perturbations. Unlike existing works, we cannot restrict perturbations to the instance itself due to the relational dependencies.

However, we can exploit one key insight: for a GNN with $L$ layers, the output $H^{(L)}_{t:}$ of node $t$ depends only on the nodes in its $L-1$ hop neighborhood $\mathcal{N}_{L-1}(t)$. Therefore, instead of operating with Eq. (1), we can ‘slice’ the matrices $X$ and $\hat{A}^{(l)}$ at each step to only contain the entries that are required to compute the output for the target node $t$. (Note that the shapes of $W^{(l)}$ and $b^{(l)}$ do not change.) This step drastically improves scalability, reducing not only the size of the neural network but also the potential perturbations we have to consider later on. To avoid clutter in the notation, since our method certifies robustness with respect to a specific node $t$, we omit explicitly mentioning the target node in the following. We define the matrix slices for a given target $t$ as follows:


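$$\dot{A}^{(l)} = \hat{A}^{(l)}_{\mathcal{N}_{L-l-1}(t),\, \mathcal{N}_{L-l}(t)} \quad \text{for } l = 1, \dots, L-1, \qquad \dot{X} = X_{\mathcal{N}_{L-1}(t):} \tag{2}$$

where the set indexing corresponds to slicing the rows and columns of a matrix, i.e. $\hat{A}_{\mathcal{N}_2(t), \mathcal{N}_1(t)}$ contains the rows corresponding to the two-hop neighbors of node $t$ and the columns corresponding to its one-hop neighbors. As it becomes clear, for increasing $l$ (i.e. depth in the network), the slices of $\hat{A}$ become smaller, and at the final step we only need the target node’s one-hop neighbors.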

Overall, we only need to consider the following sliced GNN:


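$$\hat{H}^{(l)} = \dot{A}^{(l-1)} H^{(l-1)} W^{(l-1)} + b^{(l-1)} \quad \text{for } l = 2, \dots, L \tag{3}$$

$$H^{(l)}_{nj} = \max\left\{\hat{H}^{(l)}_{nj},\, 0\right\} \quad \text{for } l = 2, \dots, L-1 \tag{4}$$

and $H^{(1)} = \dot{X}$. Here, we replaced the ReLU activation by its analytical form, and we denoted with $\hat{H}^{(l)}$ the input before applying the ReLU, and with $H^{(l)}$ the corresponding output. Note that the matrices are getting smaller in size, with $\hat{H}^{(L)}$ actually reducing to a vector that represents the predicted log probabilities (logits) for node $t$ only. Note that we also omitted the activation function in the final layer since for the final classification decision it is sufficient to consider the largest value of $\hat{H}^{(L)}$. Overall, we denote the output of this sliced GNN as $f^t_\theta(\dot{X}, \dot{A}) = \hat{H}^{(L)} \in \mathbb{R}^K$. Here $\theta$ is the set of all parameters, i.e. $\theta = \{W^{(\cdot)}, b^{(\cdot)}\}$.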
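The slicing idea can be checked numerically: the logits of the target node computed from the sliced matrices should coincide with the corresponding row of the full computation. The sketch below is an illustration under assumptions (helper names `k_hop`, `sliced_logits`, `full_logits`, a toy path graph, random weights); it is not taken from the paper's code.

```python
import numpy as np

def normalized_adjacency(A):
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def k_hop(mask, t, k):
    """Sorted ids of all nodes within k hops of t (t included)."""
    reach = {t}
    for _ in range(k):
        reach |= {int(m) for n in reach for m in np.flatnonzero(mask[n])}
    return sorted(reach)

def full_logits(A_hat, X, weights, biases):
    """Logits of every node, using the unsliced matrices."""
    H = X.astype(float)
    for W, b in zip(weights, biases):
        H_hat = A_hat @ H @ W + b
        H = np.maximum(H_hat, 0.0)        # not used after the last step
    return H_hat

def sliced_logits(A_hat, X, weights, biases, t):
    """Logits of target t, touching only its (L-1)-hop neighborhood."""
    steps = len(weights)                   # L - 1 propagation steps
    mask = A_hat != 0
    hoods = [k_hop(mask, t, steps - i) for i in range(steps + 1)]
    H = X[hoods[0]].astype(float)          # rows of the (L-1)-hop neighborhood
    for i, (W, b) in enumerate(zip(weights, biases)):
        # rows: nodes still needed after this step, columns: nodes needed before
        A_dot = A_hat[np.ix_(hoods[i + 1], hoods[i])]
        H_hat = A_dot @ H @ W + b
        H = np.maximum(H_hat, 0.0)
    return H_hat[0], hoods[0]

# Toy example (hypothetical data): path graph 0-1-2-3-4-5, target node 0.
rng = np.random.default_rng(1)
N, D = 6, 4
A = np.zeros((N, N))
for i in range(N - 1):
    A[i, i + 1] = A[i + 1, i] = 1
X = rng.integers(0, 2, size=(N, D))
weights = [rng.normal(size=(D, 3)), rng.normal(size=(3, 2))]
biases = [rng.normal(size=3), rng.normal(size=2)]
A_hat = normalized_adjacency(A)
logits_t, hood = sliced_logits(A_hat, X, weights, biases, t=0)
```

For the path graph only nodes 0, 1 and 2 enter the computation for target 0, so both the network and the set of attributes an attacker could usefully perturb shrink accordingly.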

4.1. Robustness Certificates for GNNs

Given this set-up, we are now ready to define our actual task: We aim to verify whether no admissible perturbation changes the prediction of the target node $t$. Formally we aim to solve:

Problem 1.

Given a graph $G$, a target node $t$, and a GNN with parameters $\theta$. Let $y^*$ denote the class of node $t$ (e.g. given by the ground truth or predicted). The worst case margin between classes $y^*$ and $y$ achievable under some set $\mathcal{X}_{q,Q}(\dot{X})$ of admissible perturbations to the node attributes is given by

$$m^t(y^*, y) := \min_{\tilde{X}} \; f^t_\theta(\tilde{X}, \dot{A})_{y^*} - f^t_\theta(\tilde{X}, \dot{A})_{y} \quad \text{subject to } \tilde{X} \in \mathcal{X}_{q,Q}(\dot{X}). \tag{5}$$

If $m^t(y^*, y) > 0$ for all $y \neq y^*$, the GNN is certifiably robust w.r.t. node $t$ and $\mathcal{X}_{q,Q}$.

If the minimum in Eq. (5) is positive, it means that there exists no adversarial example (within our defined admissible perturbations) that leads to the classifier changing its prediction to the other class $y$, i.e. the logit of class $y^*$ is always larger than the one of $y$.

Setting reasonable constraints to adversarial attacks is important to obtain certificates that reflect realistic attacks. Works for classical neural networks have constrained the adversarial examples to lie on a small $\epsilon$-ball around the original sample measured by, e.g., the infinity-norm or L2-norm (Wong and Kolter, 2018; Raghunathan et al., 2018; Croce et al., 2018), often e.g. $\epsilon < 0.1$. This is clearly not practical in our binary setting as an $\epsilon < 1$ would mean that no attribute can be changed. To allow reasonable perturbations in a binary/discrete setting one has to allow much larger changes than the $\epsilon$-balls considered so far.

Therefore, motivated by the existing works on adversarial attacks to graphs (Zügner et al., 2018), we consider a more realistic scenario: We define the set of admissible perturbations by limiting the number of changes to the original attributes, i.e. we assume a perturbation budget $Q \in \mathbb{N}$ and measure the $L_0$ norm in the change to $\dot{X}$. It is important to note that in a graph setting an adversary can attack the target node by also changing the node attributes of its $L-1$ hop neighborhood. Thus, $Q$ acts as a global perturbation budget.

However, since changing many attributes for a single node might not be desired, we also allow to limit the number of perturbations locally, i.e. for each node $n$ in the $L-1$ hop neighborhood we can consider a budget of $q \in \mathbb{N}$. Overall, in this work we consider admissible perturbations of the form:

$$\mathcal{X}_{q,Q}(\dot{X}) = \left\{ \tilde{X} \;\middle|\; \tilde{X}_{nj} \in \{0,1\} \,\wedge\, \|\tilde{X} - \dot{X}\|_0 \le Q \,\wedge\, \|\tilde{X}_{n:} - \dot{X}_{n:}\|_0 \le q \;\; \forall n \in \mathcal{N}_{L-1}(t) \right\} \tag{6}$$


Challenges: There are two major obstacles preventing us from efficiently finding the minimum in Eq. (5). First, our data domain is discrete, making optimization often intractable. Second, our function (i.e. the GNN) is nonconvex due to the nonlinear activation functions in the neural network. But there is hope: As we will show, we can efficiently find lower bounds on the minimum of the original problem by performing specific relaxations of (i) the neural network, and (ii) the data domain. This means that if the lower bound is positive, we are certain that our classifier is robust w.r.t. the set of admissible perturbations. Remarkably, we will even see that our relaxation has an optimal solution which is integral. That is, we obtain an optimal solution (i.e. perturbation) which is binary – thus, we can effectively handle the discrete data domain.
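The admissible set described above, with a local budget per node and a global budget over the whole neighborhood, amounts to a simple membership test. The following sketch assumes the names `is_admissible`, `q` (local budget) and `Q` (global budget) purely for illustration.

```python
import numpy as np

def is_admissible(X_pert, X_orig, q, Q):
    """True if X_pert is binary, changes at most q entries per row (node)
    and at most Q entries overall, relative to X_orig."""
    if not np.isin(X_pert, (0, 1)).all():
        return False
    changed = X_pert != X_orig
    return bool(changed.sum() <= Q and (changed.sum(axis=1) <= q).all())

# Toy example (hypothetical data): 3 nodes, 4 binary attributes.
X_orig = np.zeros((3, 4), dtype=int)
X_pert = X_orig.copy()
X_pert[0, 0] = X_pert[0, 1] = 1    # two flips on node 0
X_pert[1, 2] = 1                   # one flip on node 1
```

Enumerating this set explicitly is only feasible for tiny inputs, which is why the text turns to relaxations next.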

4.2. Convex Relaxations

To make the objective function in Eq. (5) convex, we have to find a convex relaxation of the ReLU activation function. While there are many ways to achieve this, we follow the approach of (Wong and Kolter, 2018) in this work. The core idea is (i) to treat the matrices $H^{(\cdot)}$ and $\hat{H}^{(\cdot)}$ in Eqs. (3,4) no longer as deterministic but as variables one can optimize over (besides optimizing over $\tilde{X}$). In this view, Eqs. (3,4) become constraints the variables have to fulfill. Then, (ii) we relax the non-linear constraint of Eq. (4) by a set of convex ones.

In detail: Consider Eq. (4). Here, $\hat{H}^{(l)}_{nj}$ denotes the input to the ReLU activation function. Let us assume we have given some lower bounds $R^{(l)}_{nj}$ and upper bounds $S^{(l)}_{nj}$ on this input based on the possible perturbations (in Section 4.5 we will discuss how to find these bounds). We denote with $\mathcal{I}^{(l)}$ the set of all tuples $(n,j)$ in layer $l$ for which the lower and upper bounds differ in their sign, i.e. $R^{(l)}_{nj} < 0 < S^{(l)}_{nj}$. We denote with $\mathcal{I}^{(l)}_+$ and $\mathcal{I}^{(l)}_-$ the tuples where both bounds are non-negative and non-positive, respectively.

Consider the case $(n,j) \in \mathcal{I}^{(l)}$: We relax Eq. (4) using a convex envelope:

$$H^{(l)}_{nj} \ge \hat{H}^{(l)}_{nj}, \qquad H^{(l)}_{nj} \ge 0, \qquad H^{(l)}_{nj} \left(S^{(l)}_{nj} - R^{(l)}_{nj}\right) \le S^{(l)}_{nj} \left(\hat{H}^{(l)}_{nj} - R^{(l)}_{nj}\right)$$

The idea is illustrated in the figure on the right. Note that $H^{(l)}_{nj}$ is no longer the deterministic output of the ReLU given its input but it is a variable. For a given input, the variable is constrained to lie on a vertical line above the input (and above zero) and below the upper line of the envelope.

Accordingly, but more simply, for the cases $\mathcal{I}^{(l)}_+$ and $\mathcal{I}^{(l)}_-$ we get:

$$H^{(l)}_{nj} = \hat{H}^{(l)}_{nj} \;\; \text{if } (n,j) \in \mathcal{I}^{(l)}_+, \qquad H^{(l)}_{nj} = 0 \;\; \text{if } (n,j) \in \mathcal{I}^{(l)}_-$$

which are actually not relaxations but exact conditions. Overall, Eq. (4) has now been replaced by a set of linear (i.e. convex) constraints. Together with the linear constraints of Eq. (3) they determine the set of admissible $H^{(\cdot)}$ and $\hat{H}^{(\cdot)}$ we can optimize over. We denote the collection of these matrices that fulfill these constraints by $\mathcal{Z}_{q,Q}(\tilde{X})$. Note that this set depends on $\tilde{X}$ since $H^{(1)} = \tilde{X}$.
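The relaxed ReLU constraints can be written as a small predicate on a single pre-activation/activation pair. The sketch below is illustrative only; the function name `in_relu_relaxation` and the tolerance handling are assumptions. It shows that every exact ReLU point lies inside the relaxation, while the relaxation also admits points that are not on the ReLU graph.

```python
def in_relu_relaxation(h_hat, h, lower, upper, tol=1e-9):
    """Check whether the pair (h_hat, h) satisfies the relaxed ReLU
    constraints, given bounds lower <= h_hat <= upper on the input."""
    if upper <= 0:                       # unit is always inactive: h = 0
        return abs(h) <= tol
    if lower >= 0:                       # unit is always active: h = h_hat
        return abs(h - h_hat) <= tol
    # bounds differ in sign: convex envelope of the ReLU graph
    return (h >= -tol
            and h >= h_hat - tol
            and h * (upper - lower) <= upper * (h_hat - lower) + tol)

# Exact ReLU outputs for a unit with bounds [-1, 2] (hypothetical numbers).
exact_points = [(x, max(x, 0.0)) for x in (-1.0, -0.5, 0.0, 0.5, 2.0)]
```

At input 0 the upper line of the envelope sits at $2 \cdot (0 + 1) / 3 \approx 0.67$, so the pair `(0.0, 0.5)` is feasible for the relaxation even though the exact ReLU would output 0; this slack is the price of convexity.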

Overall, our problem becomes:


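$$\tilde{m}^t(y^*, y) := \min_{\tilde{X},\, H^{(\cdot)},\, \hat{H}^{(\cdot)}} \hat{H}^{(L)}_{y^*} - \hat{H}^{(L)}_{y} = c^\top \hat{H}^{(L)} \quad \text{subject to } \tilde{X} \in \mathcal{X}_{q,Q}(\dot{X}), \;\; [H^{(\cdot)}, \hat{H}^{(\cdot)}] \in \mathcal{Z}_{q,Q}(\tilde{X}) \tag{7}$$

Here we introduced the constant vector $c = e_{y^*} - e_y$, which is $1$ at position $y^*$, $-1$ at $y$, and $0$ else. This notation clearly shows that the objective function is a simple linear function.

Corollary 4.1.

The minimum in Eq. (7) is a lower bound on the minimum of the problem in Eq. (5), i.e. $\tilde{m}^t(y^*, y) \le m^t(y^*, y)$.

Proof. Let $\tilde{X}$ be the perturbation obtained by Problem 1, and $H^{(\cdot)}, \hat{H}^{(\cdot)}$ the resulting exact representations based on Eq. (3)+(4). By construction, $[H^{(\cdot)}, \hat{H}^{(\cdot)}] \in \mathcal{Z}_{q,Q}(\tilde{X})$. Since Eq. (7) optimizes over the full set $\mathcal{Z}_{q,Q}(\tilde{X})$, its minimum cannot be larger. ∎

From Corollary 4.1 it follows that if $\tilde{m}^t(y^*, y) > 0$ for all $y \neq y^*$, the GNN is robust at node $t$. Directly solving Eq. (7), however, is still intractable due to the discrete data domain.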

As one core contribution, we will show that we can find the optimal solution in a tractable way. We proceed in two steps: (i) We first find a suitable continuous, convex relaxation of the discrete domain of possible adversarial examples. (ii) We show that the relaxed problem has an optimal solution which is integral; thus, by our specific construction the solution is binary.

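More precisely, we relax the set $\mathcal{X}_{q,Q}(\dot{X})$ to:

$$\hat{\mathcal{X}}_{q,Q}(\dot{X}) = \left\{ \tilde{X} \;\middle|\; \tilde{X}_{nj} \in [0,1] \,\wedge\, \|\tilde{X} - \dot{X}\|_1 \le Q \,\wedge\, \|\tilde{X}_{n:} - \dot{X}_{n:}\|_1 \le q \;\; \forall n \in \mathcal{N}_{L-1}(t) \right\} \tag{8}$$

Note that the entries of $\tilde{X}$ are now continuous between 0 and 1, and we have replaced the $L_0$ norm with the $L_1$ norm. This leads to:

$$\hat{m}^t(y^*, y) := \min_{\tilde{X},\, H^{(\cdot)},\, \hat{H}^{(\cdot)}} c^\top \hat{H}^{(L)} \quad \text{subject to } \tilde{X} \in \hat{\mathcal{X}}_{q,Q}(\dot{X}), \;\; [H^{(\cdot)}, \hat{H}^{(\cdot)}] \in \mathcal{Z}_{q,Q}(\tilde{X}) \tag{9}$$

It is worth mentioning that Eq. (9) is a linear problem since besides the linear objective function also all constraints are linear. We provide the explicit form of this linear program in the appendix. Accordingly, Eq. (9) can be solved optimally in a tractable way. Since $\mathcal{X}_{q,Q}(\dot{X}) \subseteq \hat{\mathcal{X}}_{q,Q}(\dot{X})$, we trivially have $\hat{m}^t(y^*, y) \le \tilde{m}^t(y^*, y)$. But even more, we obtain:

Theorem 4.2.

The minimum in Eq. (7) is equal to the minimum in Eq. (9), i.e. $\tilde{m}^t(y^*, y) = \hat{m}^t(y^*, y)$.

We will prove this theorem later (see Sec. 4.4) since it requires some further results. In summary, using Theorem 4.2, we can indeed handle the discrete data domain/discrete perturbations exactly and tractably by simply solving Eq. (9) instead of Eq. (7).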

4.3. Efficient Lower Bounds via the Dual

In order to provide a robustness guarantee w.r.t. the perturbations on $\dot{X}$, we have to find the minimum of the linear program in Eq. (9) to ensure that we have covered the worst case. While it is possible to solve linear programs ‘efficiently’ using highly optimized linear program solvers, the potentially large number of variables in a GNN makes this approach rather slow. As an alternative, we can consider the dual of the linear program (Wong and Kolter, 2018). There, any dual-feasible solution is a lower bound on the minimum of the primal problem. That is, if we find any dual-feasible solution for which the objective function of the dual is positive, we know that the minimum of the primal problem has to be positive as well, guaranteeing robustness of the GNN w.r.t. any perturbation in the set.

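Theorem 4.3.

The dual of Eq. (9) is equivalent to:

$$\max_{\Omega, \eta, \rho} \; g^t_{q,Q}(\dot{X}, c, \Omega, \eta, \rho) \tag{10}$$

where

$$g^t_{q,Q}(\cdot) = \sum_{l=2}^{L-1} \sum_{(n,j) \in \mathcal{I}^{(l)}} \frac{S^{(l)}_{nj} R^{(l)}_{nj}}{S^{(l)}_{nj} - R^{(l)}_{nj}} \left[\hat{\Phi}^{(l)}_{nj}\right]_+ - \sum_{l=1}^{L-1} \mathbf{1}^\top \Phi^{(l+1)} b^{(l)} - \operatorname{Tr}\left[\dot{X}^\top \hat{\Phi}^{(1)}\right] - \|\Psi\|_1 - q \sum_n \eta_n - Q \rho$$

subject to

$$\Phi^{(L)} = -c^\top \in \mathbb{R}^{1 \times K}, \qquad \hat{\Phi}^{(l)} = \dot{A}^{(l)\top} \Phi^{(l+1)} W^{(l)\top} \quad \text{for } l = L-1, \dots, 1,$$

$$\Phi^{(l)}_{nj} = \begin{cases} 0 & \text{if } (n,j) \in \mathcal{I}^{(l)}_- \\ \hat{\Phi}^{(l)}_{nj} & \text{if } (n,j) \in \mathcal{I}^{(l)}_+ \\ \dfrac{S^{(l)}_{nj}}{S^{(l)}_{nj} - R^{(l)}_{nj}} \left[\hat{\Phi}^{(l)}_{nj}\right]_+ - \Omega^{(l)}_{nj} \left[\hat{\Phi}^{(l)}_{nj}\right]_- & \text{if } (n,j) \in \mathcal{I}^{(l)} \end{cases} \quad \text{for } l = L-1, \dots, 2,$$

$$\Omega^{(l)}_{nj} \in [0,1], \qquad \eta_n \ge 0, \qquad \rho \ge 0,$$

$$\Psi_{nd} = \max\left\{\Delta_{nd} - (\eta_n + \rho),\, 0\right\}, \qquad \Delta_{nd} = \left[\hat{\Phi}^{(1)}_{nd}\right]_+ \left(1 - \dot{X}_{nd}\right) + \left[\hat{\Phi}^{(1)}_{nd}\right]_- \dot{X}_{nd}.$$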

The proof is given in the appendix. Note that parts of the dual problem in Theorem 4.3 have a similar form to the problem in (Wong and Kolter, 2018). For instance, we can interpret this dual problem as a backward pass on a GNN, where the $\Phi^{(l)}$ and $\hat{\Phi}^{(l)}$ are the hidden representations of the respective nodes in the graph. Crucially different, however, is the propagation in the dual problem with the message passing matrices $\dot{A}^{(l)}$ coming from the GNN formulation where neighboring nodes influence each other. Furthermore, our novel perturbation constraints from Eq. (8) lead to the dual variables $\eta$ and $\rho$, which have their origin in the local ($q$) and global ($Q$) constraints, respectively. Note that, in principle, our framework allows for different budgets per node. The term $\|\Psi\|_1$ has its origin in the constraint $\tilde{X}_{nj} \in [0,1]$. While at first glance the above dual problem seems rather complicated, its specific form makes it amenable to easy optimization. The variables $\Omega, \eta, \rho$ have only simple, element-wise constraints (e.g. clipping between $[0,1]$). All other terms are just deterministic assignments. Thus, straightforward optimization using (projected) gradient ascent in combination with any modern automatic differentiation framework (e.g. TensorFlow, PyTorch) is possible.

Furthermore, while in the above dual we need to optimize over $\Omega$ and $\eta, \rho$, it turns out that we can simplify it even further: for any feasible $\Omega$, we get an optimal closed-form solution for $\eta, \rho$.

Theorem 4.4.

Given the dual problem from Theorem 4.3 and any dual-feasible value for $\Omega$. For each node $n \in \mathcal{N}_{L-1}(t)$, let $\mathcal{S}_n$ be the set of dimensions $d$ corresponding to the $q$ largest values from the vector $\Delta_{n:}$ (ties broken arbitrarily). Further, denote with $o_n = \min_{d \in \mathcal{S}_n} \Delta_{nd}$ the smallest of these values. The optimal $\rho$ that maximizes the dual is the $Q$-th largest value from $[\Delta_{nd}]_{n \in \mathcal{N}_{L-1}(t),\, d \in \mathcal{S}_n}$. For later use we denote with $\mathcal{S}_Q$ the set of tuples $(n,d)$ corresponding to these $Q$-largest values. Moreover, the optimal $\eta_n$ is $\eta_n = \max\{0,\, o_n - \rho\}$.

The proof is given in the appendix. Using Theo. 4.4, we obtain an even more compact dual where we only have to optimize over $\Omega$. Importantly, the calculations done in Theo. 4.4 are also available in many modern automatic differentiation frameworks (i.e. we can back-propagate through them). Thus, we still get very efficient (and easy to implement) optimization.
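The closed-form choice of the budget-related dual variables can be sketched as follows. The code assumes a non-negative gain matrix `delta` (one row per node, one column per attribute) and uses illustrative names (`optimal_rho_eta`, `dual_budget_term`, `greedy_worst_case`) that do not come from the paper. The budget-related part of the dual, i.e. the sum of the clipped gains plus the terms weighted by the local and global budgets, should then match the gain of the greedy perturbation that takes the best `q` entries per node and the best `Q` of those overall.

```python
import numpy as np

def optimal_rho_eta(delta, q, Q):
    """Closed-form rho and eta for a non-negative gain matrix delta."""
    top_q = -np.sort(-delta, axis=1)[:, :q]            # q largest per node
    o = top_q[:, -1]                                    # smallest of those
    pooled = np.sort(top_q.ravel())[::-1]               # candidates, descending
    rho = pooled[Q - 1] if Q <= pooled.size else 0.0    # Q-th largest candidate
    eta = np.maximum(0.0, o - rho)
    return rho, eta

def dual_budget_term(delta, q, Q, rho, eta):
    """Sum of clipped gains plus q * sum(eta) + Q * rho."""
    psi = np.maximum(delta - (eta[:, None] + rho), 0.0)
    return psi.sum() + q * eta.sum() + Q * rho

def greedy_worst_case(delta, q, Q):
    """Total gain of flipping the Q best entries among the q best per node."""
    top_q = -np.sort(-delta, axis=1)[:, :q]
    return np.sort(top_q.ravel())[::-1][:Q].sum()

# Toy example (hypothetical numbers): 2 nodes, 3 attributes.
delta = np.array([[5.0, 4.5, 4.0],
                  [1.0, 0.5, 0.0]])
rho, eta = optimal_rho_eta(delta, q=2, Q=3)
```

In this example the candidates are 5, 4.5 (node 0) and 1, 0.5 (node 1), so `rho` is the third largest value 1.0 and node 0 receives `eta = 3.5`; both the dual term and the greedy gain evaluate to 10.5.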

Default value: As mentioned before, it is not required to solve the dual problem optimally. Any dual-feasible solution leads to a lower bound on the original problem. Specifically, we can also just evaluate the function once given a single instantiation for $\Omega$. This makes the computation of robustness certificates extremely fast. For example, adopting the result of (Wong and Kolter, 2018), instead of optimizing over $\Omega$ we can set it to

$$\Omega^{(l)}_{nj} = \frac{S^{(l)}_{nj}}{S^{(l)}_{nj} - R^{(l)}_{nj}} \tag{11}$$

which is dual-feasible, and still obtain strong robustness certificates. In our experimental section, we compare the results obtained using this default value to results for optimizing over $\Omega$. Note that using Theo. 4.4 we always ensure to use the optimal $\eta, \rho$ w.r.t. $\Omega$.

4.4. Primal Solutions and Certificates

Based on the above results, we can now prove the following:

Corollary 4.5.

Eq. (9) is an integral linear program with respect to the variables $\tilde{X}$.

The proof is given in the appendix. Using this result, it is now straightforward to prove Theo. 4.2 from the beginning.

Proof. Since Eq. (9) has an optimal (thus, feasible) solution where $\tilde{X}$ is integral, we have $\tilde{X} \in \hat{\mathcal{X}}_{q,Q}(\dot{X})$ and, thus, $\tilde{X}$ has to be binary to be integral. Since in this case the $L_1$ constraints are equivalent to the $L_0$ constraints, it follows that $\tilde{X} \in \mathcal{X}_{q,Q}(\dot{X})$. Thus, this optimal solution of Eq. (9) is feasible for Eq. (7) as well. Together with $\hat{m}^t(y^*, y) \le \tilde{m}^t(y^*, y)$ it follows that $\hat{m}^t(y^*, y) = \tilde{m}^t(y^*, y)$. ∎

In the proof of Corollary 4.5, we have seen that in the optimal solution, the set $\mathcal{S}_Q$ indicates those elements which are perturbed. That is, we constructed the worst-case perturbation. Clearly, this mechanism can also be used even if $\Omega$ (and, thus, $\Delta$) is not optimal: simply perturbing the elements in $\mathcal{S}_Q$. In this case, of course, the primal solution might not be optimal and we cannot use it for a robustness certificate. However, since the resulting perturbation is primal feasible (regarding the set $\mathcal{X}_{q,Q}(\dot{X})$), we can use it for our non-robustness certificate: After constructing the perturbation $\tilde{X}$ based on $\mathcal{S}_Q$, we pass it through the exact GNN, i.e. we evaluate the margin of Eq. (5) at $\tilde{X}$. If the value is negative, we found a harmful perturbation, certifying non-robustness.

In summary: By considering the dual program, we obtain robustness certificates if the obtained (dual) values are positive for every $y \neq y^*$. In contrast, by constructing the primal feasible perturbation using $\mathcal{S}_Q$, we obtain non-robustness certificates if the obtained (exact, primal) values are negative for one $y$. For some nodes, neither of these certificates can be given. We analyze this aspect in more detail in our experiments.
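The three possible outcomes for a node can be summarized as a small decision rule. The sketch below assumes that the dual values and the exact margins under the constructed perturbations have already been computed elsewhere; the function name `certify` and the dictionary layout (one entry per class other than the node's own class) are illustrative assumptions.

```python
def certify(dual_lower_bounds, exact_margins_under_attack):
    """Combine both certificates for one target node.

    dual_lower_bounds: for each other class, the value of some dual-feasible
        solution (a lower bound on the worst-case margin).
    exact_margins_under_attack: for each other class, the margin of the exact
        GNN evaluated on the perturbation constructed for that class.
    """
    if all(v > 0 for v in dual_lower_bounds.values()):
        return "robust"            # no admissible perturbation flips the label
    if any(v < 0 for v in exact_margins_under_attack.values()):
        return "non-robust"        # an explicit adversarial example exists
    return "undecided"             # neither certificate applies

# Hypothetical values for a node with two competing classes.
result_robust = certify({1: 0.4, 2: 1.3}, {1: 0.9, 2: 2.0})
result_broken = certify({1: -0.7, 2: 1.3}, {1: -0.2, 2: 2.0})
result_open = certify({1: -0.1, 2: 1.3}, {1: 0.3, 2: 2.0})
```

The "undecided" case arises when the lower bound is not positive for some class but the constructed perturbation still fails to change the exact prediction.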

4.5. Activation Bounds

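One crucial component of our method, the computation of the bounds $R^{(l)}$ and $S^{(l)}$ on the activations in the relaxed GNN, remains to be defined. Again, existing bounds for classical neural networks are not applicable since they neither consider $L_0$ constraints nor do they take neighboring instances into account. Obtaining good upper and lower bounds is crucial to obtain robustness certificates, as tighter bounds lead to lower relaxation error of the GNN activations.

While in Sec. 4.3, we relax the discreteness condition of the node attributes in the linear program, it turns out that for the bounds the binary nature of the data can be exploited. More precisely, for every node $n \in \mathcal{N}_{L-2}(t)$, we compute the upper bound $S^{(2)}_{nj}$ in the second layer for latent dimension $j$ as

$$S^{(2)}_{nj} = \operatorname{sum\_top\_Q}\left(\left[\dot{A}^{(1)}_{nm}\, \hat{S}^{(2)}_{mji}\right]_{m \in \mathcal{N}_1(n),\; i \in \{1, \dots, q\}}\right) + \hat{H}^{(2)}_{nj}, \qquad \hat{S}^{(2)}_{mji} = i\text{-th\_largest}\left(\left(1 - \dot{X}_{m:}\right) \odot \left[W^{(1)}_{:j}\right]_+^\top + \dot{X}_{m:} \odot \left[W^{(1)}_{:j}\right]_-^\top\right) \tag{12}$$

Here, $i\text{-th\_largest}(\cdot)$ denotes the selection of the $i$-th largest element from the corresponding vector, and $\operatorname{sum\_top\_Q}(\cdot)$ the sum of the $Q$ largest elements from the corresponding list. The first term of the sum in Eq. (12) is an upper bound on the change/increase in the first hidden layer’s activations of node $n$ and hidden dimension $j$ for any admissible perturbation on the attributes $\dot{X}$. The second term are the hidden activations obtained for the (un-perturbed) input $\dot{X}$, i.e. $\hat{H}^{(2)}_{nj} = \dot{A}^{(1)}_{n:} \dot{X} W^{(1)}_{:j} + b^{(1)}_j$. In sum we have an upper bound on the hidden activations in the first hidden layer for the perturbed input $\tilde{X}$. Note that, reflecting the interdependence of nodes in the graph, the bounds of a node $n$ depend on the attributes of its neighbors $m$.

Likewise for the lower bound we use:

$$R^{(2)}_{nj} = -\operatorname{sum\_top\_Q}\left(\left[\dot{A}^{(1)}_{nm}\, \hat{R}^{(2)}_{mji}\right]_{m \in \mathcal{N}_1(n),\; i \in \{1, \dots, q\}}\right) + \hat{H}^{(2)}_{nj}, \qquad \hat{R}^{(2)}_{mji} = i\text{-th\_largest}\left(\dot{X}_{m:} \odot \left[W^{(1)}_{:j}\right]_+^\top + \left(1 - \dot{X}_{m:}\right) \odot \left[W^{(1)}_{:j}\right]_-^\top\right) \tag{13}$$

We need to compute the bounds for each node in the $L-2$ hop neighborhood of the target, i.e. for a GNN with a single hidden layer ($L = 3$) we have $n \in \mathcal{N}_1(t)$.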
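The first-layer bounds can be sketched and sanity-checked against exhaustive enumeration on a tiny input. The code below is an illustration under assumptions: the names `first_layer_bounds` and `brute_force_bounds` are made up for this example, the message passing slice `A_dot` is assumed to be entry-wise non-negative (as for the GCN normalization), and the global budget is counted over all rows of `X_dot`.

```python
import itertools
import numpy as np

def first_layer_bounds(A_dot, X_dot, W, b, q, Q):
    """Lower/upper bounds on A_dot @ X @ W + b over all binary X that differ
    from X_dot in at most q entries per row and Q entries overall."""
    base = A_dot @ X_dot @ W + b
    W_pos, W_neg = np.maximum(W, 0), np.maximum(-W, 0)
    lower, upper = base.copy(), base.copy()
    for j in range(W.shape[1]):
        # gain of a flip: 0 -> 1 adds W_dj, 1 -> 0 subtracts W_dj
        inc = (1 - X_dot) * W_pos[:, j] + X_dot * W_neg[:, j]
        dec = X_dot * W_pos[:, j] + (1 - X_dot) * W_neg[:, j]
        inc_q = -np.sort(-inc, axis=1)[:, :q]     # q best increases per node
        dec_q = -np.sort(-dec, axis=1)[:, :q]     # q best decreases per node
        for n in range(A_dot.shape[0]):
            cand_up = (A_dot[n][:, None] * inc_q).ravel()
            cand_lo = (A_dot[n][:, None] * dec_q).ravel()
            upper[n, j] += np.sort(cand_up)[::-1][:Q].sum()
            lower[n, j] -= np.sort(cand_lo)[::-1][:Q].sum()
    return lower, upper

def brute_force_bounds(A_dot, X_dot, W, b, q, Q):
    """Exact entry-wise min/max by enumerating every admissible binary input."""
    lo = np.full((A_dot.shape[0], W.shape[1]), np.inf)
    hi = -lo
    for bits in itertools.product((0, 1), repeat=X_dot.size):
        X = np.array(bits).reshape(X_dot.shape)
        diff = X != X_dot
        if diff.sum() <= Q and (diff.sum(axis=1) <= q).all():
            out = A_dot @ X @ W + b
            lo, hi = np.minimum(lo, out), np.maximum(hi, out)
    return lo, hi

# Tiny example (hypothetical data): 3 input nodes, 3 attributes, 2 hidden units.
rng = np.random.default_rng(2)
A_dot = rng.random((2, 3))
X_dot = rng.integers(0, 2, size=(3, 3))
W = rng.normal(size=(3, 2))
b = rng.normal(size=2)
lower, upper = first_layer_bounds(A_dot, X_dot, W, b, q=1, Q=2)
lo_exact, hi_exact = brute_force_bounds(A_dot, X_dot, W, b, q=1, Q=2)
```

On such a small input the greedy bounds coincide with the enumerated extremes entry by entry, which is the behavior one would expect from bounds that are tight for each activation individually.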

Corollary 4.6.

Eqs. (12) and (13) are valid, and the tightest possible, lower/upper bounds w.r.t. the set of admissible perturbations.

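The proof is in the appendix. For the remaining layers, since the input to them is no longer binary, we adapt the bounds proposed in (Raghunathan et al., 2018). Generalized to the GNN we therefore obtain, for $l = 3, \dots, L-1$:

$$S^{(l)} = \dot{A}^{(l-1)} \left( \left[S^{(l-1)}\right]_+ \left[W^{(l-1)}\right]_+ - \left[R^{(l-1)}\right]_+ \left[W^{(l-1)}\right]_- \right) + b^{(l-1)}, \qquad R^{(l)} = \dot{A}^{(l-1)} \left( \left[R^{(l-1)}\right]_+ \left[W^{(l-1)}\right]_+ - \left[S^{(l-1)}\right]_+ \left[W^{(l-1)}\right]_- \right) + b^{(l-1)}$$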

Intuitively, for the upper bounds we assume that the activations in the previous layer take their respective upper bound wherever we have positive weights, and their lower bounds whenever we have negative weights (and the lower bounds are analogous to this). While there exist more computationally involved algorithms to compute more accurate bounds (Wong and Kolter, 2018), we leave adaptation of such bounds to the graph domain for future work.
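The intuition above translates into a short interval-propagation routine. This sketch assumes a non-negative message passing slice `A_dot` and includes the bias term; the function name `propagate_bounds` and the random test data are illustrative assumptions.

```python
import numpy as np

def propagate_bounds(A_dot, R_prev, S_prev, W, b):
    """Bounds on A_dot @ relu(H_hat_prev) @ W + b, given entry-wise bounds
    R_prev <= H_hat_prev <= S_prev on the previous pre-activations."""
    lo_act, hi_act = np.maximum(R_prev, 0), np.maximum(S_prev, 0)
    W_pos, W_neg = np.maximum(W, 0), np.maximum(-W, 0)
    # upper bound: largest activations on positive weights, smallest on negative
    S = A_dot @ (hi_act @ W_pos - lo_act @ W_neg) + b
    # lower bound: the mirrored choice
    R = A_dot @ (lo_act @ W_pos - hi_act @ W_neg) + b
    return R, S

# Hypothetical data: 3 input nodes with 4 hidden units, 2 output nodes and units.
rng = np.random.default_rng(3)
A_dot = rng.random((2, 3))
R_prev = -rng.random((3, 4))
S_prev = R_prev + 2.0 * rng.random((3, 4))
W = rng.normal(size=(4, 2))
b = rng.normal(size=2)
R, S = propagate_bounds(A_dot, R_prev, S_prev, W, b)
```

Because every operation is a matrix product or an entry-wise maximum, such a routine can be differentiated by standard automatic differentiation tools.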

It is important to note that all bounds can be computed highly efficiently and one can even back-propagate through them, which are important aspects for the robust training (Sec. 5). Specifically, one can compute Eqs. (12) and (13) for all (!) $n \in \mathcal{V}$ and all $j$ together in time $O(h^{(2)} \cdot (N \cdot D + E \cdot q))$ where $E$ is the number of edges in the graph. Note that $\hat{S}^{(2)}_{mji}$ can be computed in time $O(D)$ for every $m$ and $j$ by unordered partial sorting; overall leading to the complexity $O(N \cdot h^{(2)} \cdot D)$. Likewise the sum of top $Q$ elements can be computed in time $O(|\mathcal{N}_1(n)| \cdot q)$ for every $n$ and $j$, together leading to $O(E \cdot h^{(2)} \cdot q)$.

5. Robust Training of GNNs

While being able to certify robustness of a given GNN by itself is extremely valuable for being able to trust the model’s output in real-world applications, it is also highly desirable to train classifiers that are (certifiably) robust to adversarial attacks. In this section we show how to use our findings from before to train robust GNNs.

Recall that the value of the dual can be interpreted as a lower bound on the margin between the two considered classes. As a shortcut, we denote with $p^t_\theta(y^*, \Omega)$ the $K$-dimensional vector containing the (negative) dual objective function values for any class $k$ compared to the given class $y^*$, i.e. $\left[p^t_\theta(y^*, \Omega)\right]_k = -g^t_{q,Q}(\dot{X}, c_k, \Omega^k)$ with $c_k = e_{y^*} - e_k$. That is, node $t$ with class $y^*_t$ is certifiably robust if $p^t_\theta < 0$ for all entries (except the entry at $y^*_t$ which is always 0). Here, $\theta$ denotes the parameters of the GNN.

First consider the training objective typically used to train GNNs for node classification:


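$$\min_\theta \sum_{t \in \mathcal{V}_L} \mathcal{L}\left(f^t_\theta(\dot{X}, \dot{A}),\, y^*_t\right) \tag{14}$$

where $\mathcal{L}$ is the cross entropy function (operating on the logits) and $\mathcal{V}_L$ the set of labeled nodes in the graph. $y^*_t$ denotes the (known) class label of node $t$. To improve robustness, in (Wong and Kolter, 2018) (for classical neural networks) it has been proposed to instead optimize

$$\min_{\theta, \Omega} \sum_{t \in \mathcal{V}_L} \mathcal{L}\left(p^t_\theta(y^*_t, \Omega),\, y^*_t\right) \tag{15}$$

which is an upper bound on the worst-case loss achievable. Note that we can omit optimizing over $\Omega$ by setting it to Eq. (11). We refer to the loss function in Eq. (15) as robust cross entropy loss.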

One common issue with deep learning models is overconfidence (Lakshminarayanan et al., 2017), i.e. the models predicting effectively a probability of 1 for one and 0 for the other classes. Applied to Eq. (15), this means that the vector $p^t_\theta$ is pushed to contain very large negative numbers: the predictions will not only be robust but also very certain even under the worst perturbation. To facilitate true robustness and not false certainty in our model’s predictions, we therefore propose an alternative robust loss that we refer to as robust hinge loss:

$$\hat{\mathcal{L}}_M(p, y^*) = \sum_{k \neq y^*} \max\{0,\, p_k + M\} \tag{16}$$

This loss is positive if $-p^t_{\theta k} = g^t_{q,Q}(\dot{X}, c_k, \Omega^k) < M$ for some $k \neq y^*$; and zero otherwise. Put simply: If the loss is zero, the node is certifiably robust, in this case even guaranteeing a margin of at least $M$ to the decision boundary. Importantly, realizing even larger margins (for the worst-case) is not ‘rewarded’.

We combine the robust hinge loss with standard cross entropy to obtain the following robust optimization problem

$$\min_{\theta, \Omega} \sum_{t \in \mathcal{V}_L} \hat{\mathcal{L}}_M\left(p^t_\theta(y^*_t, \Omega),\, y^*_t\right) + \mathcal{L}\left(f^t_\theta(\dot{X}, \dot{A}),\, y^*_t\right) \tag{17}$$

Note that the cross entropy term is operating on the exact, non-relaxed GNN, which is a strong advantage over the robust cross entropy loss that only uses the relaxed GNN. Thus, we are using the exact GNN model for the node predictions, while the relaxed GNN is only used to ensure robustness. Effectively, if all nodes are robust, the term $\hat{\mathcal{L}}_M$ becomes zero, thus, reducing to the standard cross-entropy loss on the exact GNN (with robustness guarantee).
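The per-node contribution to this combined objective can be sketched as follows. The vector `p` stands for the negative dual values of one node and `logits` for the output of the exact GNN; both are assumed to be computed elsewhere, and the function names are illustrative rather than taken from the authors' code.

```python
import numpy as np

def cross_entropy(logits, y):
    """Standard cross entropy on logits for the true class y."""
    z = logits - logits.max()                 # shift for numerical stability
    return float(np.log(np.exp(z).sum()) - z[y])

def robust_hinge(p, y, margin):
    """Sum over classes k != y of max(0, p_k + margin), where p_k is the
    negative lower bound on the worst-case margin against class k."""
    mask = np.arange(p.size) != y
    return float(np.maximum(0.0, p[mask] + margin).sum())

def robust_objective_term(logits, p, y, margin):
    """Contribution of one labeled node: hinge on the relaxed GNN plus
    cross entropy on the exact GNN."""
    return robust_hinge(p, y, margin) + cross_entropy(logits, y)

# Hypothetical values for a node of class 1 in a 3-class problem.
p = np.array([-2.0, 0.0, -0.5])
loss_large_margin = robust_hinge(p, y=1, margin=1.0)
loss_small_margin = robust_hinge(p, y=1, margin=0.25)
```

With a required margin of 1.0 only class 2 contributes (its guaranteed margin of 0.5 is too small), giving a hinge value of 0.5; with a required margin of 0.25 the hinge term vanishes and only the cross entropy remains.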

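Figure 1. Certificates for a GNN trained with standard training on Cora-ML. [Plot not reproduced; axis labels: Certificate w.r.t (label truncated), % Nodes.]

[Second plot not reproduced; axis labels: Nb. purity, Avg. Max Q robust.]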