1 Introduction
Knowledge Bases are collections of factual information in the form of relational triplets. Each relational triplet can be represented as (h, r, t), where h and t are entities in the knowledge base and r is the relation between them. The most popular way of visualising a knowledge base is to represent it as a multi-relational graph, where each triplet (h, r, t) becomes a directed edge from h to t with label r. Knowledge Bases have been used to improve performance across a variety of tasks like Question Answering (Sorokin and Gurevych (2018), Cui et al. (2017)), Dialogue Generation (Liu et al. (2018)) and many others.
However, since Knowledge Bases are populated by automatically mining text, they are often incomplete: it is not possible to manually write down all facts, and extraction often introduces inaccuracies. This incompleteness leads to a decline in performance across a variety of downstream tasks. Hence, there has been a lot of work on efficient tools that complete Knowledge Bases (KBs) by automatically adding new facts without requiring extra knowledge. This task is referred to as Knowledge Base Completion (or Link Prediction), where the goal is to solve queries like (h, r, ?).
The first approaches towards efficient Knowledge Base Completion were additive models like TransE (Bordes et al. (2013)) and TransH (Wang et al. (2014)), where relations were interpreted as simple translations over hidden entity representations. Multiplicative models like DistMult (Yang et al. (2015)) and ComplEx (Trouillon et al. (2016)) were then observed to outperform these simple additive models. Instead of a translation, RotatE (Sun et al. (2019a)) defines each relation as a rotation in the complex embedding space that maps the head entity onto the tail entity; such rotations have been shown to satisfy many useful semantic properties like compositionality of relations. Recently, more expressive Neural Network-based methods (like ConvE (Dettmers et al. (2018)) and ConvKB (Nguyen et al. (2018))) were introduced, where the scoring function is learned along with the model. However, all these models process each triplet independently. As a result, they cannot capture the semantically rich neighborhood of an entity and hence produce low-quality embeddings.
Graphs have been widely used to represent real-world data. There has been tremendous progress in applying ML techniques to images and text, some of which are being successfully adapted to graphs (e.g., Kipf and Welling (2017), Hamilton et al. (2017), Velickovic et al. (2018)). Taking inspiration from this progress, a number of Graph Neural Network-based methods have been proposed to capture the neighborhood in Knowledge Graphs for the KBC task. In this survey, we aim to look into some of these formulations.
2 Dataset
Dataset   | Entities | Relations | Train   | Valid  | Test
FB15k-237 | 14,541   | 237       | 272,115 | 17,535 | 20,466
FB15k     | 14,951   | 1,345     | 483,142 | 50,000 | 59,071
WN18      | 40,943   | 18        | 141,442 | 5,000  | 5,000
WN18RR    | 40,943   | 11        | 86,835  | 3,034  | 3,134
The approaches for Knowledge Graph Completion have been evaluated on a variety of benchmark datasets. We now briefly describe each of the large KGs used to assess performance on KBC tasks. Table 1 shows the sizes of these benchmark datasets.

FB15k  FB15k is a subset of the relational database Freebase and is a popular benchmark for Knowledge Graph Completion. Toutanova and Chen (2015) observed a serious issue with this dataset: the test set contains inverse triplets (t, r', h) whose counterpart (h, r, t) exists in the training set. Thus a simple model that memorizes the triplets in the training set can achieve very high test accuracy on this dataset. As a result, they proposed a new dataset, FB15k-237, from which such inverse triplets are removed.

WN18  WN18 is a subset of the WordNet KB containing lexical relations between words. Similar to FB15k, it contained inverse relations, and hence WN18RR was introduced. WN18RR has a hierarchical structure, and as a result it poses a significant challenge to all KBC approaches that do not handle transitive relations.
Knowledge Graph Completion methods are evaluated using test queries (h, r, ?): all entities are ranked using the method's scoring function, and metrics are computed over the resulting ranked lists. The common metrics are Mean Reciprocal Rank (MRR), the average of the reciprocal rank of the correct entity over all queries, and Hits@N, the fraction of queries for which the correct entity occurs in the top N of the ranked list.
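The two metrics above are simple enough to sketch in code. The function names below are illustrative; higher scores are assumed to mean more plausible triplets.

```python
import numpy as np

def rank_of(scores, correct_idx):
    """Rank (1-indexed) of the correct entity when all entities are
    sorted by descending score."""
    order = np.argsort(-scores)
    return int(np.where(order == correct_idx)[0][0]) + 1

def mrr_and_hits(ranks, n=10):
    """Mean Reciprocal Rank and Hits@N from the per-query ranks."""
    ranks = np.asarray(ranks, dtype=float)
    mrr = float(np.mean(1.0 / ranks))
    hits = float(np.mean(ranks <= n))
    return mrr, hits
```

For example, if the correct entities of three queries were ranked 1st, 2nd and 4th, MRR is (1 + 1/2 + 1/4)/3 and Hits@3 is 2/3.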
3 Basic Concepts
In this section, we will talk about some of the basic concepts related to Graph Neural Networks.
3.1 Message Passing Neural Network
Message Passing Neural Network (Gilmer et al. (2017)) is a framework that aims to generalize the various neural networks proposed for graph-based data. Consider a graph G with node embeddings h_v for all nodes v and edge embeddings e_vw for all edges between v and w. We first define the message function as

    m_{vw}^{t+1} = f(h_v^t, h_w^t, e_{vw})

where f is a non-linear function, usually taken to be an MLP. The net message sent to node v is then computed by aggregating the messages from all its neighbours w:

    m_v^{t+1} = AGG_{w ∈ N(v)} ( m_{vw}^{t+1} )

where N(v) is the neighbour set of v, t is the timestep at which messages are being aggregated, and AGG is an aggregation function such as sum or mean. Finally, the embedding of node v is updated using the update function UPD:

    h_v^{t+1} = UPD(h_v^t, m_v^{t+1})
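One step of this framework can be sketched as follows. This is a minimal illustration, not any specific published model: the message function f is assumed to be a single linear layer with tanh, AGG is a sum, and UPD is another linear-plus-tanh layer.

```python
import numpy as np

def message_passing_step(h, edges, edge_feat, W_msg, W_upd):
    """One MPNN step with a linear message function and sum aggregation.
    h: (n, d) node states; edges: list of (v, w) pairs;
    edge_feat: per-edge features; W_msg: (2d + e_dim, d); W_upd: (2d, d)."""
    n, d = h.shape
    msg = np.zeros((n, d))
    for (v, w), e in zip(edges, edge_feat):
        # message m_{vw} = f(h_v, h_w, e_vw), here a tanh of a linear map
        m = np.tanh(np.concatenate([h[v], h[w], e]) @ W_msg)
        msg[v] += m  # AGG = sum over neighbours w of v
    # UPD combines each node's old state with its aggregated message
    return np.tanh(np.concatenate([h, msg], axis=1) @ W_upd)
```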
3.2 Graph Convolution Network
Graph Convolution Networks build on spectral graph convolutions. Here the dataset consists of a graph with a vertex set, an adjacency matrix, and a feature vector for each node. These features could be categorical attributes like node labels, structural features like node degrees, or simply a one-hot encoding of each node. GCNs aggregate information from the neighbors of a node using the layer-wise rule

    h_v^{l+1} = σ( Σ_{w ∈ N(v) ∪ {v}} (1 / c_{vw}) W^l h_w^l )

where W^l is the layer's weight matrix, σ is a non-linearity, and c_{vw} is a normalisation constant; Kipf and Welling (2017) use the symmetric normalisation c_{vw} = sqrt(|N(v)| |N(w)|).
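This aggregation can be written in a few lines of matrix algebra. The sketch below uses mean normalisation (D⁻¹ rather than Kipf and Welling's symmetric D⁻¹ᐟ²AD⁻¹ᐟ²) for readability; that choice is an assumption of this example.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: H' = ReLU(D^-1 (A + I) H W).
    A: (n, n) adjacency; H: (n, d_in) features; W: (d_in, d_out)."""
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))   # mean normalisation
    return np.maximum(0.0, D_inv @ A_hat @ H @ W)
```

With one-hot features and identity weights, each node's output is simply the average of the one-hot vectors of itself and its neighbours.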
3.3 Graph Attention Network
In GCNs, all neighbors contribute equally to the aggregation for each target node. Graph Attention Networks (GATs) (Velickovic et al. (2018)) overcome this shortcoming by learning to assign varying levels of importance to the nodes in the neighborhood of the node under consideration, rather than treating each node as equally important or using a fixed weight. Moreover, GATs generalize to unseen nodes (inductive learning), thus simulating a real-world setting, and have been shown to advance the state of the art across a variety of tasks.
Model                              | Uniform Aggregation of Neighbours | Relation Embedding | Use Decoder | Incorporate Prior Rules | Provide Explanation | Use Attributes
RGCN (Schlichtkrull et al. (2018)) | Yes                    | No  | Yes | No  | No  | No
TransGCN (Cai et al. (2019))       | Yes                    | Yes | No  | No  | No  | No
KBGAT (Nathani et al. (2019))      | No (attention)         | Yes | Yes | No  | No  | No
SACN (Shang et al. (2019))         | No (relation weighted) | Yes | Yes | No  | No  | Yes
ExpressGNN (Zhang et al. (2020))   | Yes                    | Yes | No  | Yes | No  | No
DPMPN (Xu et al. (2020))           | Yes                    | Yes | No  | No  | Yes | No
The input to a GAT layer is a set of node features {h_1, ..., h_N}, where N is the number of nodes. The attention weight for each edge is calculated as

    e_ij = a(W h_i, W h_j)    (1)

where W is a shared weight matrix, a is any attention function, and e_ij is the (unnormalized) attention weight for the edge between node i and node j. For each node i, the weights are then normalized with a softmax, α_ij = softmax_j(e_ij). The output embedding for each node is then computed as

    h_i' = σ( Σ_{j ∈ N(i)} α_ij W h_j )    (2)
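Equations (1) and (2) can be sketched for a single node as follows. The attention function `a` is passed in as a parameter because GAT leaves it pluggable (the original paper uses a single-layer feed-forward network with LeakyReLU); the tanh non-linearity here is an arbitrary choice for illustration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_node_update(h, i, neighbours, W, a):
    """Output embedding of node i: attention-weighted sum of its
    neighbours' transformed features.
    h: (n, d_in) features; W: (d_out, d_in); a: attention function."""
    Wh = h @ W.T
    scores = np.array([a(Wh[i], Wh[j]) for j in neighbours])  # Eq. (1)
    alpha = softmax(scores)                                   # normalise
    out = sum(w * Wh[j] for w, j in zip(alpha, neighbours))   # Eq. (2)
    return np.tanh(out)
```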
3.4 Decoder
Many approaches that use GNNs for KBC tasks employ them as an encoder that injects information about neighbouring relation triplets. A pre-existing scoring function like DistMult is then enriched with this information, by initialising its entity embeddings with the encoder's output, and outperforms the same scoring function used alone. This scoring function is referred to as the decoder in the literature.
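For concreteness, the DistMult decoder mentioned above scores a triplet with a tri-linear dot product over the (encoder-initialised) embeddings:

```python
import numpy as np

def distmult_score(h, r, t):
    """DistMult scoring function: <h, r, t> = sum_k h_k * r_k * t_k.
    Higher scores indicate more plausible triplets."""
    return float(np.sum(h * r * t))

def rank_all_tails(h, r, E):
    """Score every candidate tail entity in embedding matrix E (n, d)
    and return entity indices sorted from most to least plausible."""
    return np.argsort(-(E * (h * r)).sum(axis=1))
```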
4 GCN Encoder for Knowledge Graph Completion
RGCN (Schlichtkrull et al. (2018)) was one of the earliest approaches to use GNNs for this task. It introduces a relational Graph Convolution Network, which produces locality-sensitive embeddings that are then passed to a decoder that predicts missing links in the KG. It is important to note that a plain GCN cannot be used to embed KGs because it ignores the edge labels in the graph. As a result, RGCN modifies the propagation rule of the plain GCN to capture the relation type of each edge:

    h_v^{l+1} = σ( Σ_{r ∈ R} Σ_{u ∈ N_v^r} (1 / c_{v,r}) W_r^l h_u^l + W_0^l h_v^l )

Here R is the set of relations, N_v^r denotes the set of entities connected to v by relation r, and c_{v,r} is a normalisation constant. The model learns a separate weight matrix W_r for each relation, which makes the approach hard to scale to large graphs. The authors address this with basis and block-diagonal decompositions. In the basis decomposition, each relation's weight matrix is represented as a weighted sum of shared basis matrices; in the block-diagonal decomposition, each relation's weight matrix is a direct sum of block-diagonal matrices. These decompositions not only reduce overfitting but also couple the relation weight matrices, which helps transfer knowledge from frequently occurring relations to rare ones. Evaluated against state-of-the-art multiplicative models like ComplEx, RGCN performs worse on the FB15k and WN18 datasets. Combining the RGCN scores with DistMult scores achieves results comparable to the existing SOTA. The paper's main contribution is showing that GNNs can be successfully applied to KGs. However, it neither weighs a node's neighborhood differentially nor learns relation embeddings through the GNN framework, and therefore requires a decoder.
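The basis decomposition can be sketched in a single einsum; variable names and shapes are illustrative, not RGCN's actual implementation. The point is that the per-relation parameters shrink from a full d×d matrix to |B| scalar coefficients.

```python
import numpy as np

def relation_weights_basis(coeffs, bases):
    """RGCN-style basis decomposition: W_r = sum_b coeffs[r, b] * bases[b].
    coeffs: (num_relations, num_bases); bases: (num_bases, d, d).
    Returns the stacked per-relation weight matrices (num_relations, d, d)."""
    return np.einsum('rb,bij->rij', coeffs, bases)
```

Because all relations share the same basis matrices, gradient updates from frequent relations also shape the weights of rare ones.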
5 Relation Weighted GCN Encoder
SACN (Shang et al. (2019)) extends the previous work by aggregating information from a node's neighborhood in a way that is sensitive to edge relation types, via a Weighted Graph Convolution Network (WGCN). In WGCN, the whole graph is broken into subgraphs such that each subgraph contains edges of only one relation type. A GCN is applied on each subgraph, and the per-relation contributions are combined as

    h_v^{l+1} = σ( Σ_t α_t^l Σ_{u ∈ N_t(v)} g(h_u^l) + g(h_v^l) )

Here h_v^l is the embedding of node v at layer l, N_t(v) is the neighborhood of node v for relation type t, and g is a simple matrix multiplication with a weight matrix W^l that is shared across all relations, in contrast to RGCN, which learns separate weights for each relation. Information from the single-relation subgraphs is aggregated by weighing each with a learnable parameter α_t^l that differs per relation type t. Similar to RGCN, SACN uses WGCN as an encoder to learn high-quality entity embeddings, which are then passed to the decoder ConvTransE. ConvTransE is inspired by the success of Dettmers et al. (2018) in applying convolutions over embeddings; it improves on that formulation by not reshaping the relation and entity embedding vectors, thus retaining the translation property of embeddings described in Bordes et al. (2013). Another interesting contribution of this work is the use of node attributes to learn enhanced node embeddings, by adding the attributes as bridge nodes between the two entities connected by a relation. The whole architecture is learned end to end and achieved the state-of-the-art results at the time on the FB15k-237 and WN18RR datasets. Since most previous datasets ignore the large number of available entity attributes, the authors introduce a new dataset, FB15k-237-Attr, which extracts the attribute triplets for entities in FB15k-237. They further show that their model can use this additional information to improve link prediction performance.

6 Graph Attention Encoder
Both of the methods discussed above share the weakness of treating all neighboring nodes of each relation with equal importance. To overcome this limitation, KBGAT (Nathani et al. (2019)) incorporates attention to identify important information in the neighborhood. Similar to previous methodologies, it learns a GNN encoder followed by a Neural Network-based scoring function as decoder, namely ConvKB (Nguyen et al. (2018)). Unlike the earlier approaches, it uses the GNN to learn both entity and relation embeddings. Each triplet (e_i, r_k, e_j) is treated as an edge in the graph, and a representation for it is computed using the equation given below.
    c_ijk = W_1 [h_i ; h_j ; g_k]

Here h_i, h_j are the entity embeddings and g_k is the relation embedding; all are initialized with the additive TransE model. The embeddings are concatenated and multiplied by a learned weight matrix W_1 before being passed through a non-linear activation. The resulting scores for all edges into target node i are passed through a softmax layer to produce attention weights, which are used to weight the edge representations and compute the output entity embeddings. Also, in order not to lose the initial information, a transformation of the initial embeddings is added to the output of the GAT network to produce the final embeddings. Moreover, in each GAT layer the output relation embedding is computed by multiplying the input relation embedding with a learned weight matrix. The paper also introduces auxiliary edges that connect a node directly to its n-hop neighbors, experimenting with 2- and 3-hop neighborhoods. Finally, the parameters of the GAT network are learned by minimizing a margin-based loss over the TransE (Bordes et al. (2013)) scoring function. These embeddings are then used to provide structural information to the decoder ConvKB, and the combination outperforms SOTA approaches on the FB15k-237, NELL-995, and Kinship datasets. The approach, however, struggles to improve SOTA performance on WN18RR, which the authors attribute to the fact that it does not handle hierarchical relations. They also performed an ablation study over the hyperparameters and observed that the auxiliary edges did not give much gain in performance. They further analyzed the distribution of attention weights and how it varies over training iterations: in early epochs, the GAT layer gives more importance to direct neighbors and does not incorporate much information from distant neighbors, while in later epochs the model learns to capture the n-hop neighborhood of a node by assigning weight to the auxiliary relations. They also showed that for low in-degree nodes, effective embeddings cannot be learned by simply aggregating information from direct neighbors, and hence the model assigns higher attention weights to the n-hop neighborhood. However, it is not clear why their approach is not trained end-to-end like the earlier approaches. Moreover, since their GAT layer learns relation embeddings, the motivation for using a decoder is not very clear. Sun et al. (2019b) also infer that their experimental methodology is faulty, which will be discussed in more detail in later sections.

7 Standalone GNN for KBC
TransGCN (Cai et al. (2019)) is a GCN framework that jointly learns both entity and relation embeddings and hence dispenses with the need for a decoder. It draws inspiration from RGCN and aims to avoid both a task-specific decoder and the computational cost of learning relation parameters twice: once as weights in the encoder step and again as relation embeddings in the decoder step. The idea is to use the relation embedding to transform the existing entity embeddings so that the graph becomes homogeneous. Following Cai et al. (2019), two transformation operators ∘ and ∗ are defined such that h ∘ r ≈ t and t ∗ r ≈ h for a valid triplet (h, r, t). The GNN model's scoring function for each triplet is then defined by how close the transformed head entity is to the tail entity and vice versa. These operators are used to define a node's neighborhood by treating incoming and outgoing edges separately: for an incoming edge (h, r, v), entity h is transformed to h ∘ r and then included in the neighbourhood of v; similarly, for an outgoing edge (v, r', t), entity t is transformed to t ∗ r'. Since the edge labels are already accounted for by transforming the entity embeddings, a plain GCN can then be applied to the constructed homogeneous graph. Because there is no guarantee that the output entity embeddings still satisfy the scoring function with the input relation embeddings, the input relation embedding is also transformed by a weight matrix to produce the output relation embedding. The complete GNN framework is optimized to minimize the scoring function, computed from the last layer's entity and relation embeddings, over valid triplets. Two GNN models are implemented: TransE-GCN, which uses the TransE (Bordes et al. (2013)) scoring function, and RotatE-GCN, which uses the RotatE (Sun et al. (2019a)) scoring function.
Both TransE-GCN and RotatE-GCN outperform their base models TransE and RotatE respectively, and RotatE-GCN achieves SOTA performance on FB15k-237 and WN18RR. The authors further examined performance across nodes of varying degree and showed that it suffers both for low-degree nodes, where sufficient information is not available for learning useful embeddings, and for high-degree nodes, where the model cannot focus on the information that matters for a particular node. Both limitations could plausibly be addressed with methods introduced in KBGAT: auxiliary edges could add information for nodes with low in-degree, and attention could pick out the important signals in the neighborhood of nodes with high in-degree.
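The two entity transformations underlying these models can be sketched directly; this is a minimal illustration of the scoring functions, not TransGCN's full encoder.

```python
import numpy as np

def transe_transform(h, r):
    """TransE-style operator: translate the head embedding by the relation,
    so that h + r should land near t for a valid triplet."""
    return h + r

def rotate_transform(h, r_phase):
    """RotatE-style operator: rotate the complex head embedding by the
    unit-modulus relation r = exp(i * phase), element-wise."""
    return h * np.exp(1j * r_phase)

def score(transformed_h, t):
    """Plausibility as negative distance between transformed head and tail."""
    return -np.linalg.norm(transformed_h - t)
```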
8 Incorporating rules with GNN
8.1 Markov Logic Networks
Markov Logic Networks (Richardson and Domingos (2006)) act as an interface between applications (like robotics) and representations (like embeddings) in AI. First-Order Logic cannot serve as this interface because it is brittle and cannot handle uncertainty; Probabilistic Graphical Models handle uncertainty but not objects and relations. Markov Logic Networks sit at the intersection of the two and represent a Knowledge Base as (F, w), where F is a set of first-order formulas and w assigns a weight to each formula. A Markov Network is then constructed from the KG as a bipartite graph with the formulas on one side and their grounded instances on the other, as shown in Figure 2; each formula is connected by an edge to its ground predicates. Inference on this Markov Network is performed using the distribution

    p_w(O, H) = (1 / Z(w)) exp( Σ_{f ∈ F} w_f Σ_{a_f} φ_f(a_f) )

where O are the labelled (observed) facts, H are the unlabelled (hidden) facts, a_f ranges over the groundings of formula f, φ_f(a_f) is the truth value of grounding a_f under the assignment to the ground predicates, and Z(w) is the normalising factor.
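The unnormalised part of this distribution is just the exponentiated weighted count of satisfied groundings; a toy sketch (assuming the satisfied-grounding counts have already been computed for each formula):

```python
import math

def mln_unnormalised_prob(formula_weights, satisfied_counts):
    """Unnormalised MLN probability of a world: exp of the sum over
    formulas of (formula weight) * (number of satisfied groundings)."""
    return math.exp(sum(w * n for w, n in zip(formula_weights, satisfied_counts)))
```

Normalising requires dividing by Z(w), the sum of this quantity over every possible assignment to the ground predicates, which is exactly the exponential sum discussed next.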
8.2 MLN with GNN for inference
As the equation in the previous section shows, computing Z(w) requires summing over an exponential number of assignments to the ground predicates. ExpressGNN (Zhang et al. (2020)) uses a GNN to make this computation tractable. First, the KG is represented as a bipartite graph with entities on one side and fact nodes on the other. The model maximizes the log-likelihood of the observed facts by optimizing a variational evidence lower bound: it computes a posterior distribution over hidden triplets with a GNN that aggregates information from neighboring observed triplets, and optimizes this objective with the Expectation-Maximisation algorithm. In the Expectation step, the model fits the posterior distribution by bringing it as close as possible to the likelihood distribution, under the reasonable assumption that hidden facts follow the same distribution as observed ones. The authors also show that labeled data can be used in the expectation step by adding the maximization of the posterior probability of labeled facts to the objective. In the Maximisation step, the new posterior distribution is used to update the likelihood estimates. The posterior probability of a hidden triplet r(e_1, e_2) is computed with the help of the GNN, roughly as

    Q(r(e_1, e_2)) = σ( MLP(μ_{e_1}, μ_{e_2}, r) )

where μ_{e_1} and μ_{e_2} are the embeddings of the entities e_1 and e_2. Notably, each entity embedding is the concatenation of a GNN-computed embedding, formed by aggregating messages from neighbours, and a tunable embedding learned exclusively from the EM objective, which gives the model additional flexibility. Finally, the approach is tested on the KBC task by studying link prediction performance on FB15k-237 for two model variants: ExpressGNN-E, which uses only the Expectation step, and ExpressGNN-EM, which uses the complete EM algorithm. It achieves SOTA performance, outperforming existing models by a huge margin. The model also performs very well even with small amounts of training data, since it can incorporate logical rules into the formulation, and it is more capable on relations with very little observed data, since it can use logical knowledge to make informed predictions.
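A mean-field posterior of this shape can be sketched as a logistic score over the embeddings. This is a hypothetical simplification: the GNN-computed entity embeddings are replaced by plain vectors, and the MLP by a single linear map.

```python
import numpy as np

def posterior_prob(mu_e1, mu_e2, r_emb, W):
    """Hypothetical mean-field posterior Q(triplet holds): a sigmoid of a
    linear score over the concatenated entity and relation embeddings."""
    z = np.concatenate([mu_e1, mu_e2, r_emb]) @ W
    return 1.0 / (1.0 + np.exp(-z))
```

In the E-step such probabilities are fitted toward the MLN likelihood; in the M-step the formula weights are re-estimated under these posteriors.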
9 Explainable predictions with GNN
DPMPN (Xu et al. (2020)) constructs a query-dependent subgraph that provides some explanation of the prediction made by the GNN model, helping us reason about predictions rather than treating the model as a black box. Moreover, the attention weights on the nodes of these subgraphs can be used to color each node differentially, showing which nodes were important for the prediction and giving intuition about how information travels from the head entity to the tail entity. Taking inspiration from Bengio (2017), the model consists of two GNNs: the first operates on the whole graph and learns query-independent properties, while the second operates on the subgraph surrounding the head entity, provides query-specific information, and constructs the subgraph that serves as reasoning for the prediction. The first, global GNN is referred to as the "Inattentive GNN" (IGNN), whereas the second, which looks at query-local features, is the "Attentive GNN" (AGNN) in Xu et al. (2020).
9.1 IGNN
To keep IGNN scalable, there is only one forward pass through it per batch, since it is query-invariant. The message passing algorithm (Gilmer et al. (2017)) is run for T timesteps, updating the hidden state at each step, and the hidden state at the last timestep is used by the downstream modules, i.e., AGNN and the attention module.
9.2 AGNN
In AGNN, a GNN is run on a subgraph constructed from the input query. For the query (e1, r, ?), the subgraph initially contains only the entity e1 and is then iteratively expanded with the neighbors of its nodes. Three sampling steps are used. First, to construct the subgraph at timestep t+1, a subset of nodes is sampled from the subgraph at time t, known as the attending-from horizon: the attention scores from timestep t (computed by the attention module described in the next subsection) are used to pick the top k nodes with the highest attention weight. Next, the neighbors of these sampled nodes that are not already in the subgraph at timestep t are considered, and a subset of them is sampled, referred to as the sampling horizon. Finally, the attention scores are used again to pick the top K nodes from the sampling horizon, producing the attending-to horizon. Edges to these sampled nodes are then constructed to aggregate messages for the GNN at timestep t. The attention scores are further used to update the hidden states of the nodes, in a Message Passing network inspired by Gilmer et al. (2017).
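The repeated "pick the top k nodes by attention" step above is simple to sketch; the function below is an illustration of that selection, not DPMPN's actual sampler.

```python
import numpy as np

def top_k_by_attention(node_ids, attention, k):
    """Return the k node ids with the highest attention weight,
    as used to build the attending-from and attending-to horizons."""
    idx = np.argsort(-np.asarray(attention))[:k]
    return [node_ids[i] for i in idx]
```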
9.3 Attention module
The attention module first computes a transition matrix T, which captures the interaction between every pair of nodes. For a pair (v, v'), the net interaction is the sum of two terms: the first is between the query-dependent AGNN representations of v and v', measuring the similarity of two nodes already close to the query; the other is between the query-specific AGNN representation of v and the IGNN representation of v', capturing similarity to nodes that were never included in the subgraph but could still play a role in predicting the tail entity. The attention weights are then propagated through this transition matrix, roughly as

    a^{t+1} ∝ Tᵀ a^t

followed by renormalisation. At the last timestep, these attention scores are used as the inference probabilities for the query.
Method      | FB15k-237 MRR / H@10 | WN18RR MRR / H@10 | FB15k MRR / H@10 | WN18 MRR / H@10
ComplEx     | 0.25 / 0.43          | 0.44 / 0.51       | 0.692 / 0.840    | 0.941 / 0.947
RGCN        | - / -                | - / -             | 0.651 / 0.825    | 0.814 / 0.955
RGCN+       | - / -                | - / -             | 0.696 / 0.842    | 0.819 / 0.964
TransE-GCN  | 0.315 / 0.477        | 0.233 / 0.508     | - / -            | - / -
RotatE-GCN  | 0.356 / 0.555        | 0.485 / 0.578     | - / -            | - / -
KBGAT*      | 0.518 / 0.626        | 0.440 / 0.581     | - / -            | - / -
SACN        | 0.35 / 0.54          | 0.47 / 0.54       | - / -            | - / -
SACN-Attr   | 0.36 / 0.55          | - / -             | - / -            | - / -
DPMPN       | 0.369 / 0.53         | 0.482 / 0.558     | - / -            | - / -
ExpressGNN  | 0.49 / 0.608         | - / -             | - / -            | - / -
9.4 Analysis
The model attains slightly worse MRR than SOTA on WN18RR and FB15k-237. However, it achieves very good Hits@1 and Hits@3, showing that it excels at exact predictions. An ablation study shows that sampling more nodes for the attending-from horizon always gives some gain in performance, but the time cost also increases tremendously, so the authors restrict themselves to sampling only a few nodes to keep the approach scalable.
10 Result
The ComplEx results for FB15k-237 and WN18RR are taken from Shang et al. (2019) and Xu et al. (2020), and those for FB15k and WN18 are from Schlichtkrull et al. (2018). RGCN (Schlichtkrull et al. (2018)) results are comparable to ComplEx only when combined with DistMult scores (RGCN+); RGCN is thus inferior to the other GNN-based models, which outperform ComplEx. ExpressGNN performs best on FB15k-237, which shows that a huge gain can be made by incorporating logic rules into the embedding-based framework. Moreover, incorporating the abundant entity attribute information also gives some increase in performance. For WN18RR, RotatE-GCN is the best-performing model: WN18RR has a lot of symmetric relations, which the other models cannot capture well, whereas, as shown in Sun et al. (2019a), rotation transformations can infer symmetry/antisymmetry patterns.
10.1 Incorrect Evaluation Methodology
Sun et al. (2019b) observed that for some KBC methods like CapsE, ConvKB, and KBGAT, the resulting score distribution is very different from that of earlier scoring functions like ConvE: a lot of triplets receive exactly the same score. They traced this unusual behavior to many neurons becoming inactive due to the ReLU activation used in these methods. To reason about the performance of these methods, it therefore became imperative to understand how they break ties between equal scores. They found that KBGAT's reported scores are obtained by placing the correct triplet at the beginning of the tied block; if the correct triplet is instead placed randomly in the ranked list, performance declines sharply. As a result, we have excluded KBGAT from our comparisons and are interested in scrutinizing the evaluation methodology followed for the other ranking functions.
11 Concluding Remarks
We have seen that Graph Neural Networks have been widely adopted to improve SOTA performance on KBC tasks. This success is not limited to link prediction: there is a lot of active research using GNNs on Knowledge Graphs for related tasks. Recently, Zhang and Chen (2018) used GNNs to learn heuristics that better estimate the likelihood of two nodes being connected. Moreover, all Knowledge Graph Completion methods assume that every entity in the test triplets has been seen at training time, an assumption that may not hold in real-world scenarios. Hamaguchi et al. (2017) address this problem by using a GNN to compute representations of entities unseen at training time from their neighborhood at test time, and Wang et al. (2019) extend this idea by combining rules with the GNN. Zhang et al. (2019) utilize GNNs to perform entity linking over large KGs. As can be seen in Table 2, no single methodology exploits all the benefits that GNNs can offer. ExpressGNN and DPMPN do not seem to fully handle the heterogeneity of the graph: although relation embeddings are an input when computing messages from each neighbor, the messages are not weighted by relation as in KBGAT or SACN. It would also be interesting to see whether some of DPMPN's explainability ideas can be applied to ExpressGNN easily. Finally, it has been argued that many models struggle on hierarchical graphs like WN18RR, because none of the GNN-based approaches so far effectively handles hierarchical relations. Balazevic et al. (2019) project KGs into hyperbolic space for this purpose, but still treat each triplet independently. Future work using GNNs in hyperbolic space should improve the state of the art on hierarchical graphs like WN18RR.
References
 Balazevic et al. (2019) Ivana Balazevic, Carl Allen, and Timothy M. Hospedales. 2019. Multi-relational Poincaré graph embeddings. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, 8-14 December 2019, Vancouver, BC, Canada, pages 4465–4475.
 Bengio (2017) Yoshua Bengio. 2017. The consciousness prior. CoRR, abs/1709.08568.
 Bordes et al. (2013) Antoine Bordes, Nicolas Usunier, Alberto García-Durán, Jason Weston, and Oksana Yakhnenko. 2013. Translating embeddings for modeling multi-relational data. In Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013, Lake Tahoe, Nevada, United States, pages 2787–2795.
 Cai et al. (2019) Ling Cai, Bo Yan, Gengchen Mai, Krzysztof Janowicz, and Rui Zhu. 2019. TransGCN: Coupling transformation assumptions with graph convolutional networks for link prediction. In Proceedings of the 10th International Conference on Knowledge Capture, K-CAP 2019, Marina Del Rey, CA, USA, November 19-21, 2019, pages 131–138. ACM.
 Cui et al. (2017) Wanyun Cui, Yanghua Xiao, Haixun Wang, Yangqiu Song, Seung-won Hwang, and Wei Wang. 2017. KBQA: Learning question answering over QA corpora and knowledge bases. PVLDB, 10(5):565–576.

 Dettmers et al. (2018) Tim Dettmers, Pasquale Minervini, Pontus Stenetorp, and Sebastian Riedel. 2018. Convolutional 2D knowledge graph embeddings. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018, pages 1811–1818. AAAI Press.
 Gilmer et al. (2017) Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley, Oriol Vinyals, and George E. Dahl. 2017. Neural message passing for quantum chemistry. In Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 August 2017, volume 70 of Proceedings of Machine Learning Research, pages 1263–1272. PMLR.
 Hamaguchi et al. (2017) Takuo Hamaguchi, Hidekazu Oiwa, Masashi Shimbo, and Yuji Matsumoto. 2017. Knowledge transfer for out-of-knowledge-base entities: A graph neural network approach. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI 2017, Melbourne, Australia, August 19-25, 2017, pages 1802–1808. ijcai.org.
 Hamilton et al. (2017) William L. Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4–9 December 2017, Long Beach, CA, USA, pages 1024–1034.
 Kipf and Welling (2017) Thomas N. Kipf and Max Welling. 2017. Semi-supervised classification with graph convolutional networks. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference Track Proceedings. OpenReview.net.
 Liu et al. (2018) Shuman Liu, Hongshen Chen, Zhaochun Ren, Yang Feng, Qun Liu, and Dawei Yin. 2018. Knowledge diffusion for neural dialogue generation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15–20, 2018, Volume 1: Long Papers, pages 1489–1498. Association for Computational Linguistics.
 Nathani et al. (2019) Deepak Nathani, Jatin Chauhan, Charu Sharma, and Manohar Kaul. 2019. Learning attention-based embeddings for relation prediction in knowledge graphs. In Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28 – August 2, 2019, Volume 1: Long Papers, pages 4710–4723. Association for Computational Linguistics.
 Nguyen et al. (2018) Dai Quoc Nguyen, Tu Dinh Nguyen, Dat Quoc Nguyen, and Dinh Q. Phung. 2018. A novel embedding model for knowledge base completion based on convolutional neural network. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, New Orleans, Louisiana, USA, June 1–6, 2018, Volume 2 (Short Papers), pages 327–333. Association for Computational Linguistics.
 Richardson and Domingos (2006) Matthew Richardson and Pedro M. Domingos. 2006. Markov logic networks. Mach. Learn., 62(1-2):107–136.
 Schlichtkrull et al. (2018) Michael Sejr Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, and Max Welling. 2018. Modeling relational data with graph convolutional networks. In The Semantic Web – 15th International Conference, ESWC 2018, Heraklion, Crete, Greece, June 3–7, 2018, Proceedings, volume 10843 of Lecture Notes in Computer Science, pages 593–607. Springer.
 Shang et al. (2019) Chao Shang, Yun Tang, Jing Huang, Jinbo Bi, Xiaodong He, and Bowen Zhou. 2019. End-to-end structure-aware convolutional networks for knowledge base completion. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 – February 1, 2019, pages 3060–3067. AAAI Press.
 Sorokin and Gurevych (2018) Daniil Sorokin and Iryna Gurevych. 2018. Modeling semantics with gated graph neural networks for knowledge base question answering. In Proceedings of the 27th International Conference on Computational Linguistics, COLING 2018, Santa Fe, New Mexico, USA, August 20–26, 2018, pages 3306–3317. Association for Computational Linguistics.
 Sun et al. (2019a) Zhiqing Sun, Zhi-Hong Deng, Jian-Yun Nie, and Jian Tang. 2019a. RotatE: Knowledge graph embedding by relational rotation in complex space. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6–9, 2019. OpenReview.net.
 Sun et al. (2019b) Zhiqing Sun, Shikhar Vashishth, Soumya Sanyal, Partha P. Talukdar, and Yiming Yang. 2019b. A re-evaluation of knowledge graph completion methods. CoRR, abs/1911.03903.
 Toutanova and Chen (2015) Kristina Toutanova and Danqi Chen. 2015. Observed versus latent features for knowledge base and text inference. In 3rd Workshop on Continuous Vector Space Models and their Compositionality.
 Trouillon et al. (2016) Théo Trouillon, Johannes Welbl, Sebastian Riedel, Éric Gaussier, and Guillaume Bouchard. 2016. Complex embeddings for simple link prediction. In Proceedings of the 33rd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, June 19–24, 2016, volume 48 of JMLR Workshop and Conference Proceedings, pages 2071–2080. JMLR.org.
 Velickovic et al. (2018) Petar Velickovic, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. 2018. Graph attention networks. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 – May 3, 2018, Conference Track Proceedings. OpenReview.net.
 Wang et al. (2019) Pei-Feng Wang, Jialong Han, Chenliang Li, and Rong Pan. 2019. Logic attention based neighborhood aggregation for inductive knowledge graph embedding. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 – February 1, 2019, pages 7152–7159. AAAI Press.
 Wang et al. (2014) Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. 2014. Knowledge graph embedding by translating on hyperplanes. In Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, July 27–31, 2014, Québec City, Québec, Canada, pages 1112–1119. AAAI Press.
 Xu et al. (2020) Xiaoran Xu, Wei Feng, Yunsheng Jiang, Xiaohui Xie, Zhiqing Sun, and Zhi-Hong Deng. 2020. Dynamically pruned message passing networks for large-scale knowledge graph reasoning. In 8th International Conference on Learning Representations, ICLR 2020. OpenReview.net.
 Yang et al. (2015) Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, and Li Deng. 2015. Embedding entities and relations for learning and inference in knowledge bases. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7–9, 2015, Conference Track Proceedings.
 Zhang et al. (2019) Fanjin Zhang, Xiao Liu, Jie Tang, Yuxiao Dong, Peiran Yao, Jie Zhang, Xiaotao Gu, Yan Wang, Bin Shao, Rui Li, and Kuansan Wang. 2019. OAG: toward linking large-scale heterogeneous entity graphs. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019, Anchorage, AK, USA, August 4–8, 2019, pages 2585–2595. ACM.
 Zhang and Chen (2018) Muhan Zhang and Yixin Chen. 2018. Link prediction based on graph neural networks. In Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3–8 December 2018, Montréal, Canada, pages 5171–5181.
 Zhang et al. (2020) Yuyu Zhang, Xinshi Chen, Yuan Yang, Arun Ramamurthy, Bo Li, Yuan Qi, and Le Song. 2020. Efficient probabilistic logic reasoning with graph neural networks. In 8th International Conference on Learning Representations, ICLR 2020. CoRR, abs/2001.11850.