1. Introduction
Knowledge Graphs (KGs) have had a major impact on AI-based applications (on and off the Web) such as question answering, recommendation systems, and prediction services (Wang et al., 2017). A KG represents factual knowledge as triples of the form (entity, relation, entity), e.g., (Plato, influences, Kant), in a large-scale multi-relational graph where nodes correspond to entities and typed links represent relationships between nodes. Although KGs are often large, with millions of triples, this is nowhere near enough to capture the knowledge of the real world. To address this problem, various link prediction approaches have been proposed, among which link prediction using KG embeddings (KGEs) has attracted growing attention. KGEs map the entities and relations of a KG from a symbolic domain to a geometric space (e.g., a vector space). To perform link prediction, a KGE employs a score function that runs over the learned embedding vectors of a triple and computes its plausibility, which determines whether the triple is treated as positive or negative.
In this way, the major encoding capabilities of KGEs remain focused on the triple level. Therefore, capturing collective knowledge, both structural and semantic, for groups of triples (subgraphs) depends on the characteristics inherited from the underlying geometry. Within each geometry, the mathematical operations used in the score function account for the individual differences in the encoding capability of KGEs. In most KGEs, mapping each triple into a geometric space naturally maps the subgraphs as well. However, the extent to which the structure of subgraphs is preserved remains limited by the characteristics of the chosen geometric space, especially when a subgraph forms complex and heterogeneous structures and motifs (i.e. statistically significant shapes distributed in the graph). Most state-of-the-art KGEs (Zhang et al., 2019a; Sun et al., 2019; Trouillon et al., 2016; Bordes et al., 2013a) are designed in flat geometries, which do not intrinsically support structural preservation. Consequently, the scores returned for certain triples involved in such heterogeneous structures are inaccurate. Such heterogeneous structures can occur not only between nodes connected by multiple relations but can also be caused by a single relation. While many KGE models have been proposed, very few works have investigated KGEs for the discovery and encoding of heterogeneous subgraphs and motifs. Here, we focus on the case of complex motifs in multi-relational graphs that are caused by one single relation, e.g., “follows” or “influences”. The nature of these relations leads to heterogeneous and completely different shapes in subgraphs, which is a prevailing case in many KGs. In Figure 2, we illustrate two possible subgraphs with different motifs generated by the “influences” relation on the YAGO knowledge graph. The first subgraph is a loop structure constructed by 10 nodes. The second subgraph forms a path structure created by the same relation among another set of 10 nodes.
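Let us explore the following link prediction task with the RotatE model on a toy KG containing such motifs. The claim is that RotatE returns wrong inferences when two subgraphs with the same relation are structured in different shapes. To show this, let us represent the above motifs as $M_1$ and $M_2$, each with a set of 10 connected nodes and one relation ($r$). The nodes ($e_1, \dots, e_{10}$) of motif $M_1$ form a loop, and the nodes ($e'_1, \dots, e'_{10}$) of $M_2$ belong to another subgraph with the same relation but form a path. For each triple in this graph, e.g. ($e_1$, $r$, $e_2$), the vector representation using RotatE is $\mathbf{e}_1 \circ \mathbf{r} = \mathbf{e}_2$. A complete representation of the motifs is then as follows:
$$M_1:\ \mathbf{e}_1 \circ \mathbf{r} = \mathbf{e}_2,\ \dots,\ \mathbf{e}_9 \circ \mathbf{r} = \mathbf{e}_{10},\ \mathbf{e}_{10} \circ \mathbf{r} = \mathbf{e}_1 \qquad M_2:\ \mathbf{e}'_1 \circ \mathbf{r} = \mathbf{e}'_2,\ \dots,\ \mathbf{e}'_9 \circ \mathbf{r} = \mathbf{e}'_{10}$$
In order to compute the value of the relation $\mathbf{r}$, we start with the loop structure and substitute each triple equation into the next one (e.g., by replacing $\mathbf{e}_2$ in the second triple equation of $M_1$ with the left side of the first, we get $\mathbf{e}_1 \circ \mathbf{r} \circ \mathbf{r} = \mathbf{e}_3$). Continuing to the end, we conclude that $\mathbf{e}_1 \circ \mathbf{r}^{10} = \mathbf{e}_1$, which means $\mathbf{r}^{10} = \mathbf{1}$ where each component of $\mathbf{r}$ is a complex number, therefore each component is a 10th root of unity, $e^{2\pi i k/10}$ for an integer $k$. This value is consistent for $\mathbf{r}$ in the whole graph, so it can be used to check whether the second motif is preserved. Here we substitute the vectors as above and additionally include the value of $\mathbf{r}$ derived from the first calculation. After some derivations, we have $\mathbf{e}'_{10} \circ \mathbf{r} = \mathbf{e}'_1 \circ \mathbf{r}^{10}$. After simplification, this results in $\mathbf{e}'_{10} \circ \mathbf{r} = \mathbf{e}'_1$, from which the model infers that the triple ($e'_{10}$, $r$, $e'_1$) is positive. However, the actual shape of this motif is a path, and this wrong inference turns it into a loop.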
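The argument above can be checked numerically. The following is a minimal sketch, assuming one-dimensional complex embeddings and taking the primitive 10th root of unity as the rotation forced by the loop; the starting embedding and the helper name `rotate_chain` are illustrative choices, not values from the original experiments.

```python
import cmath

N = 10  # nodes per motif, as in the toy example

# The loop motif forces r**N == 1; take the primitive N-th root of unity.
r = cmath.exp(2j * cmath.pi / N)

def rotate_chain(start, relation, steps):
    """Apply the RotatE map e -> e * relation repeatedly and return all visited embeddings."""
    nodes = [start]
    for _ in range(steps):
        nodes.append(nodes[-1] * relation)
    return nodes

# Path motif: 10 nodes joined by 9 true triples under the same relation.
path = rotate_chain(0.5 + 0.2j, r, N - 1)

# RotatE distance for the false triple (last path node, r, first path node).
closing_distance = abs(path[-1] * r - path[0])
print(f"distance for the false closing triple: {closing_distance:.2e}")
```

The distance is zero up to floating-point error, so under these assumptions a distance-based RotatE score cannot separate the false closing triple from a true one.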
The root cause of this problem lies in the entity-dependent nature of this relation. However, most KGE models such as RotatE (as well as TransE, ComplEx, and QuatE) treat relations as independent of entities. This is visible in the heat map in Figure 2. As shown, the triple that links the last node of the path back to its first node is a wrong triple and should ideally receive as large a rank value as possible (i.e. be ranked far from the top). All the models except TransE fail to preserve the path structure and assign a small rank value to this triple (inferring it to be a correct triple). Although TransE preserves the path structure (assigning a large rank value), it fails to preserve the loop structure, as it infers five triples to be wrong. This is due to the limitations of the underlying geometry.
In order to tackle this problem, our novel KGE model, dubbed FieldE, employs Differential Equations (DEs) for embedding KGs into a vector space. DEs are powerful tools that provide a general framework to accurately define the connection between neighboring points laid on trajectories, implying the continuity of changes and consequently describing the underlying geometry. This is especially important because the success of a KGE model depends on how correctly it specifies the underlying geometry that describes the natural proximity of data samples in a geometric space (Mathieu and Nickel, 2020). Designing FieldE with a well-specified geometry a) improves generalization and b) increases interpretability. First-order Ordinary Differential Equations (FOODEs) are a special class of DEs that represent a vector field on a smooth Riemannian manifold. Therefore, we particularly focus on FOODEs in designing our model. FieldE brings the power of DEs, embeddings, and neural networks together and provides a comprehensive model capable of learning different motifs in subgraphs of a KG.
2. Related Work
Here, we review prior work on the capability of embedding models to preserve subgraph structures and semantic relational patterns. We also discuss related work on motifs and on the use of DEs for different encoding purposes.
Learning Relational Motifs. The first generation of embedding models consists of translation-based approaches, where the encoding of motifs has only been discussed in terms of simple relational patterns such as 1-1 in TransE (Bordes et al., 2013b) and 1-many, many-1, and many-many in its follow-up models (Ji et al., 2015; Lin et al., 2015; Wang et al., 2014). RotatE (Sun et al., 2019) is the first KGE model in which rotational transformations are used to encode more complex patterns such as symmetry, transitivity, composition, reflexivity, and inversion, which also create complex subgraphs. Another group of KGEs, which use element-wise multiplication of the transformed head and tail, namely DistMult (Yang et al., 2015), ComplEx (Trouillon et al., 2016), QuatE (Zhang et al., 2019b), and RESCAL (Nickel et al., 2011), also belong to the rotation-based models, and some use the angle between the transformed head and tail to measure the correctness of the predicted links. Apart from some partial discussions of the capability of the models to encode relational patterns, capturing motifs has not been directly targeted in any of these models. A recent KGE model named MuRP (Balazevic et al., 2019) sheds light on the capability of embedding models to learn hierarchical relations. It proposes a geometrical approach for multi-relational graphs using the Poincaré ball (Ji et al., 2016). However, our work not only covers hierarchical relations but also focuses on different complex motifs (e.g. simultaneous path and loop) constructed by one relation in multi-relational graphs. In (Suzuki et al., 2018), a very specific derivation of the TransE model for encoding hierarchical relations has been proposed. The existing embedding models have been compared to the four different versions of the proposed Riemannian TransE, including hyperbolic, spherical, and Euclidean geometries. These versions are designed for very specific conditions in different geometries, while our model uses a neural network that generalizes to cases beyond these.
Learning with Differential Equations. In (Chen et al., 2018), a family of deep neural network models has been proposed that parameterizes the derivative of a hidden state instead of specifying a discrete sequence of hidden layers. In this approach, ODEs are used to design continuous-depth networks whose output can be computed efficiently, which brings memory efficiency, adaptive computation, parameter efficiency, scalable and invertible normalizing flows, and continuous time-series models. It is applied to supervised learning on an image dataset and to time-series prediction. This work uses ODEs without considering knowledge graphs or embeddings for link prediction tasks. A few works use the Lorentz model in their approaches (Bose et al., 2020; Chamberlain et al., 2017), which are not about knowledge graphs in our context. In another recent work, continuous normalizing flows have been extended to learning on Riemannian manifolds (Mathieu and Nickel, 2020) for natural phenomena such as volcano eruptions. In this work, the manifold flows are introduced as solutions of ODEs, which makes the defined neural network independent of the mapping requirements forced by the underlying Euclidean space.
3. Preliminaries and Background
This section provides the preliminaries of Riemannian geometry and its key elements as the required background for our model. The aim of this paper is to embed nodes (entities) of a KG on trajectories of vector fields (relations) laid on the surface of a smooth Riemannian manifold. Therefore, we first provide the mathematical definitions (Franciscus, 2011) of a manifold and a tangent space, followed by an introduction to vector fields and differential equations.
Manifold
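A $d$-dimensional topological manifold, denoted by $\mathcal{M}$, is a Hausdorff space with a countable base that is locally similar to a linear space. More precisely, $\mathcal{M}$ is locally homeomorphic to $\mathbb{R}^d$, where:

For every point $p \in \mathcal{M}$, there is an open neighbourhood $U$ around $p$ and a homeomorphism $\phi: U \rightarrow V$ that maps $U$ to an open set $V \subseteq \mathbb{R}^d$; the pair $(U, \phi)$ is called a chart or coordinate system. $\phi(p)$ is the image of $p$, which is called the coordinates of $p$ in the chart.

Let $\mathcal{M} = \bigcup_i U_i$, meaning that $\mathcal{M}$ is covered by parts denoted by $U_i$. The set $\{(U_i, \phi_i)\}$, with domain $U_i$ for each $\phi_i$, is called an atlas of $\mathcal{M}$.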
Tangent Space
If we think of a particle as a moving object on a manifold $\mathcal{M}$, then at each point $p \in \mathcal{M}$ the particle is free to move in various directions with velocity $v$. The set of all possible directions in which a particle can pass through point $p$ forms a space called the tangent space. Formally, given a point $p$ on a manifold $\mathcal{M}$, the tangent space $T_p\mathcal{M}$ is the set of all vectors that are tangent to the curves passing through point $p$. Let $\gamma(t)$ be a parametric curve on the manifold: $\gamma$ maps $t \in \mathbb{R}$ to $\mathcal{M}$ and passes through point $p$. A curve in the local coordinates is $\phi \circ \gamma$, which maps from $t$ to the manifold and then to the local coordinates. $\phi \circ \gamma(t)$ represents the position of a particle in the local coordinates, and the rate of change of position (velocity) at point $p$ is computed by
$$v = \frac{d}{dt}(\phi \circ \gamma)(t)\Big|_{p} = \Big(\frac{dx^1(t)}{dt}, \dots, \frac{dx^d(t)}{dt}\Big)\Big|_{p} \tag{1}$$
where $x^i(t)$ is the $i$-th component of the curve in the local coordinates. The tangent vector $v$ is the velocity at point $p$ in the local coordinates. If we consider every possible curve passing through $p$, then all of the velocity vectors form a set called the tangent space. In other words, the tangent space represents all possible directions in which one can tangentially pass through $p$. In order to move in a given direction along the shortest path, the exponential map is used. The exponential map at point $p$ is denoted by $\exp_p: T_p\mathcal{M} \rightarrow \mathcal{M}$. For a given small $\delta$ and $v \in T_p\mathcal{M}$, the map $\exp_p(\delta v)$ shows how a particle moves on $\mathcal{M}$ along the shortest path from $p$ with initial direction $v$ in time $\delta$. A first-order approximation of the exponential map is given by $\exp_p(\delta v) \approx p + \delta v$.
Riemannian Manifold
A tuple $(\mathcal{M}, g)$ represents a Riemannian manifold, where $\mathcal{M}$ is a real smooth manifold with a Riemannian metric $g$. The function
$g_p: T_p\mathcal{M} \times T_p\mathcal{M} \rightarrow \mathbb{R}$ defines an inner product on the associated tangent space. The metric tensor is used to measure angles, lengths of curves, surface areas, and volumes locally, from which global quantities can be derived by integrating the local contributions.
Vector Field
Let $\gamma(t)$ be the temporal evolution of a trajectory on a $d$-dimensional smooth Riemannian manifold $\mathcal{M}$ and $T\mathcal{M}$ be the tangent bundle (the set of all tangent spaces on a manifold). For a given Ordinary Differential Equation (ODE)
$$\frac{d\gamma(t)}{dt} = f(\gamma(t)) \tag{2}$$
$f: \mathcal{M} \rightarrow T\mathcal{M}$ is a vector field on the manifold, which shows the direction and the speed of the movements along which the trajectory evolves on the manifold’s surface.
The vector field defines the underlying dynamics of a trajectory on a manifold and can take various shapes with different sparsity/density as well as various flows with different degrees of rotation. In field theory, this is formalized by the two concepts of divergence and curl. Divergence describes the density of the outgoing flow of a vector field from an infinitesimal volume around a given point on the manifold, while curl represents the infinitesimal rotation of a vector field around the point. Here we present the formal definitions of divergence and curl. Without loss of generality, let $F = F_x\mathbf{i} + F_y\mathbf{j} + F_z\mathbf{k}$ be a continuously differentiable vector field in Cartesian coordinates. The divergence of $F$ is defined as follows
$$\mathrm{div}\,F = \nabla \cdot F = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} \tag{3}$$
At a given point $p$, if $\mathrm{div}\,F(p) > 0$, then the point is a source, i.e. the outflow at the point is greater than the inflow. Conversely, if $\mathrm{div}\,F(p) < 0$, then the point is a sink, i.e. the inflow at the point is greater than the outflow. Given a continuously differentiable vector field $F$, the curl of the vector field is computed as follows
$$\mathrm{curl}\,F = \nabla \times F = \Big(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\Big)\mathbf{i} + \Big(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\Big)\mathbf{j} + \Big(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\Big)\mathbf{k} \tag{4}$$
The curl is a vector with a finite length and a direction. The length of the vector shows the extent to which the vector field rotates, and its direction specifies, via the right-hand rule, whether the rotation around the vector is clockwise or counterclockwise.
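As a small numerical illustration of these two quantities, the sketch below estimates divergence and curl with central finite differences. The two example fields and all function names are illustrative assumptions, chosen so that one field is a pure source and the other a pure rotation.

```python
def partial(field, point, comp, axis, eps=1e-5):
    """Central-difference estimate of d field[comp] / d x[axis] at the given point."""
    plus, minus = list(point), list(point)
    plus[axis] += eps
    minus[axis] -= eps
    return (field(plus)[comp] - field(minus)[comp]) / (2 * eps)

def divergence(field, point):
    """Sum of the diagonal partial derivatives of a 3D vector field."""
    return sum(partial(field, point, i, i) for i in range(3))

def curl(field, point):
    """Curl of a 3D vector field as an (x, y, z) tuple."""
    return (
        partial(field, point, 2, 1) - partial(field, point, 1, 2),
        partial(field, point, 0, 2) - partial(field, point, 2, 0),
        partial(field, point, 1, 0) - partial(field, point, 0, 1),
    )

def radial_source(p):
    """Field pointing away from the origin: positive divergence, no rotation."""
    return (p[0], p[1], p[2])

def planar_rotation(p):
    """Counterclockwise rotation about the z axis: zero divergence, curl along +z."""
    return (-p[1], p[0], 0.0)

point = (1.0, 2.0, 3.0)
print(divergence(radial_source, point), curl(radial_source, point))
print(divergence(planar_rotation, point), curl(planar_rotation, point))
```

The first field behaves as a source everywhere, while the second has no net outflow and a curl vector along the positive z axis, matching a counterclockwise rotation under the right-hand rule.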
4. Method
In this section, we propose FieldE, a new KGE model based on Ordinary Differential Equations (ODEs). The formulation of FieldE is presented in five parts: relation formulation, entity representation, triple learning, plausibility measurement, and vector field parameterization, which are discussed in the remainder of this section.
Relation Formulation
FieldE represents each relation $r$ in a KG as a vector field ($f_{\theta_r}$) on a Riemannian manifold. $\theta_r$ denotes the parameters of the function and is constant over time. If we assume that $\gamma_r(t)$ is a parametric trajectory that evolves with the changes of the parameter $t$, the following ODE can be defined for each relation of the KG:
$$\frac{d\gamma_r(t)}{dt} = f_{\theta_r}(\gamma_r(t)) \tag{5}$$
Given the above formulation, each relation of a KG forms a relation-specific geometric shape. This is consistent with the nature of KGs, where different relations form different motifs and patterns in the graph.
Entity Representation
We represent each entity $e$ of a KG as a $d$-dimensional vector, i.e. $\mathbf{e} \in \mathbb{R}^d$. The corresponding vectors are embedded on a trajectory of an ODE with an underlying relation-specific vector field. In other words, a sequence of entities (nodes) that are connected by a specific relation is embedded sequentially along a trajectory laid on a relation-specific Riemannian surface.
Triple Learning
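In order to formulate the steps for learning triples with FieldE, let $e_i$ and $e_{i+1}$ be two subsequent nodes (entities shown in the upper part of Figure 2) of a graph connected by a relation $r$. This forms a triple $(e_i, r, e_{i+1})$ in a directed graph for each pair of sequentially connected entities with relation $r$. We consider $\mathbf{e}_i, \mathbf{e}_{i+1}$ as the embedding vectors of the subsequent entities $e_i$ and $e_{i+1}$ (shown in the lower part of Figure 2). Therefore, a triple in the vector space can be modeled by time discretization (i.e. $\frac{d\gamma_r(t)}{dt} \approx \frac{\gamma_r(t + \Delta t) - \gamma_r(t)}{\Delta t}$) of Equation 5, where the trajectory $\gamma_r(t)$ is replaced by the entity embeddings ($\gamma_r(t) = \mathbf{e}_i$, $\gamma_r(t + \Delta t) = \mathbf{e}_{i+1}$). This gives the following equation
$$\mathbf{e}_{i+1} = \mathbf{e}_i + f_{\theta_r}(\mathbf{e}_i) \tag{6}$$
where the time step is set to 1, i.e. $\Delta t = 1$. This is the first-order approximation of the exponential map, where $\exp_{\mathbf{e}_i}(f_{\theta_r}(\mathbf{e}_i)) \approx \mathbf{e}_i + f_{\theta_r}(\mathbf{e}_i)$. Therefore, Equation 6 shows how to move on the trajectory of a relation-specific manifold to take the shortest path from the node $\mathbf{e}_i$ with initial direction $f_{\theta_r}(\mathbf{e}_i)$. Note that the initial direction depends on the current node $\mathbf{e}_i$ and the relation $r$.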
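A minimal sketch of this unit-step update, assuming it takes the forward-Euler form "next point = current point + field(current point)". The two-dimensional field below is written by hand for illustration (it is not a learned FieldE field): it rotates points near the origin and drifts points far from it, so the direction of each step depends on the current node.

```python
import math

def field(p):
    """Hypothetical relation-specific vector field on R^2 (illustrative, not learned)."""
    x, y = p
    if math.hypot(x, y) < 2.0:
        # Inside the disc: displacement that rotates the point by 90 degrees about the origin.
        return (-y - x, x - y)
    # Outside the disc: constant drift to the right.
    return (1.0, 0.0)

def step(p):
    """Unit-time forward-Euler step: next = current + field(current)."""
    fx, fy = field(p)
    return (p[0] + fx, p[1] + fy)

def trajectory(start, n_steps):
    """Follow the field for n_steps unit steps and return every visited point."""
    points = [start]
    for _ in range(n_steps):
        points.append(step(points[-1]))
    return points

loop = trajectory((1.0, 0.0), 4)   # four steps around the origin
path = trajectory((5.0, 5.0), 4)   # four steps of steady drift

print("loop closes:", loop[-1] == loop[0])
print("path closes:", path[-1] == path[0])
```

Under these assumptions a single field produces a closed loop in one region and an open path in another, which is the behaviour a relation-only (entity-independent) transformation cannot express.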
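For a given positive triple $(h, r, t)$ in the graph ($f_{\theta_r}$ is written as $f_r$ for simplicity from now onward), FieldE learns the triple by optimizing the embedding vectors, i.e. $\mathbf{h}, \mathbf{t}$, as well as the parameters of the relation-specific vector field function, i.e. $\theta_r$, to fulfill
$$\mathbf{t} = \mathbf{h} + f_r(\mathbf{h}) \tag{7}$$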
In order to regularize the current point and the length of the step (velocity) , we propose the following formulation instead of the Equation 7. Therefore, for a positive triple FieldE computes
(8) 
Consequently, for a given negative triple the following inequality should be fulfilled by FieldE
(9) 
Plausibility Measurement
Given a triple , the plausibility of the triple is measured by
(10) 
for the distance-based version of our model, DFieldE, and by
(11) 
for the semantic-matching version of our embedding model, SFieldE where .
Vector Field Parameterization
The choice of the vector field function is important, as it determines the shape of the manifold as well as the shape of the underlying vector field. In this paper, we propose two approaches for determining the vector field: a) we parameterize the vector field function by a neural network and propose a neuro-differential KGE model, and b) we additionally propose a linear version of our model in which the vector field is modeled as a linear function. Here we explain the two approaches in detail.
NeuroFieldE: Here we parameterize the vector field by a multilayer feedforward neural network to approximate the underlying vector field
(12) 
where L is the number of hidden layer, denotes the output weight of the network and is the weight connecting the th node of the layer to the th node of the th layer.
Parameterizing the vector field with a neural network gives the model enough flexibility to learn vector fields of various shapes (representing complex geometry) from complex data. This is because neural networks are universal approximators (Hornik et al., 1989; Hornik, 1991; Nayyeri et al., 2017), i.e. they are capable of approximating any continuous function on a compact set.
LinearFieldE: Linear ODEs are a class of differential equations that have been widely used in several applications. Here we model the vector field as a linear function
$$f_{\theta_r}(\mathbf{e}) = \mathbf{A}_r\mathbf{e} \tag{13}$$
where $\mathbf{A}_r$ is a $d \times d$ matrix. Depending on the eigenvalues of $\mathbf{A}_r$, the vector field takes various shapes. Below we present the theoretical analysis of FieldE and its advantages over other state-of-the-art KGE models.
4.1. Theoretical Analysis
In this part, we theoretically analyse the advantages of the core formulation of FieldE over other KGE models. We first show that, while existing models such as RotatE and ComplEx face issues when learning complex single-relational motifs (such as a path and a loop formed by a single relation), FieldE can easily model such complex structures. Moreover, we show that our model subsumes several popular existing KGEs and consequently inherits their capabilities in learning various well-studied patterns such as symmetry and inversion.
Flexible Relation Embedding
Most of the existing state-of-the-art KGEs such as TransE, RotatE, QuatE, and ComplEx consider each relation of the KG as a constant vector with which to perform an algebraic operation such as translation or rotation. Therefore, the relation is entity-independent with regard to the applied algebraic operation. For example, TransE considers a relation as a constant vector $\mathbf{r}$ and performs translations as
$$\mathbf{h} + \mathbf{r} = \mathbf{t} \tag{14}$$
Therefore, a relation-specific transformation (here a translation) is performed in the same direction and with the same length, regardless of the entities involved. This causes an issue when learning complex motifs and patterns. To show this, let us consider a loop in a graph with a relation $r$ which connects three entities $e_1, e_2, e_3$
$$\mathbf{e}_1 + \mathbf{r} = \mathbf{e}_2, \quad \mathbf{e}_2 + \mathbf{r} = \mathbf{e}_3, \quad \mathbf{e}_3 + \mathbf{r} = \mathbf{e}_1 \tag{15}$$
After substituting the first equation into the second one and comparing the result with the third equation, we conclude that $\mathbf{r} = \mathbf{0}$. This is indeed problematic because the embeddings of all the entities will be the same, i.e. different entities are not distinguishable in the geometric space. Now we show that our model can encode a loop without such issues.
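$$\mathbf{e}_2 = \mathbf{e}_1 + f_r(\mathbf{e}_1), \quad \mathbf{e}_3 = \mathbf{e}_2 + f_r(\mathbf{e}_2), \quad \mathbf{e}_1 = \mathbf{e}_3 + f_r(\mathbf{e}_3) \tag{16}$$
In FieldE, with $f_r$ denoting the relation-specific vector field, after substituting the first equation into the second, and again substituting the result into the third equation, we obtain
$$f_r(\mathbf{e}_1) + f_r(\mathbf{e}_2) + f_r(\mathbf{e}_3) = \mathbf{0} \tag{17}$$
The above equation can be satisfied by FieldE because neural networks with bounded continuous activation functions are universal approximators and universal classifiers (Hornik et al., 1989; Hornik, 1991; Nayyeri et al., 2017). Therefore, a well-specified neural network can assign values at the three points such that the equality holds. We additionally show that our model can also embed a path structure with three other entities $e'_1, e'_2, e'_3$ while preserving a loop structure with $e_1, e_2, e_3$
$$\mathbf{e}'_2 = \mathbf{e}'_1 + f_r(\mathbf{e}'_1), \quad \mathbf{e}'_3 = \mathbf{e}'_2 + f_r(\mathbf{e}'_2), \quad \mathbf{e}'_1 \neq \mathbf{e}'_3 + f_r(\mathbf{e}'_3) \tag{18}$$
After substituting the first equation into the second equation, and again substituting the result into the third equation, we have
$$f_r(\mathbf{e}'_1) + f_r(\mathbf{e}'_2) + f_r(\mathbf{e}'_3) \neq \mathbf{0} \tag{19}$$
Because $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3, \mathbf{e}'_1, \mathbf{e}'_2, \mathbf{e}'_3$ are distinct points in the domain of the function fulfilling Equations 17 and 19, there is a neural network that approximates such a function, due to the universal approximation ability of the underlying network. Therefore, FieldE can learn two different subgraph structures with the same relation.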
State-of-the-art KGE models like TransE, RotatE, ComplEx, and QuatE are not capable of learning the above structure because they model the initial direction of the relation-specific movement as dependent only on the relation, ignoring the role of the entities in moving to the next node of the graph. This limitation leads to wrong inferences when the graph contains complex motifs and patterns.
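The TransE side of this argument can be illustrated with a few lines of code. The sketch below uses made-up two-dimensional embeddings and hypothetical helper names: chaining n translations by the same vector r displaces the start by n times r, so a three-node loop can close only when r is the zero vector, in which case all three nodes coincide.

```python
def walk_translations(start, r, n_steps):
    """Apply the TransE translation e -> e + r n_steps times and return all visited points."""
    points = [tuple(start)]
    for _ in range(n_steps):
        points.append(tuple(p + ri for p, ri in zip(points[-1], r)))
    return points

def closure_gap(points):
    """Last point minus first point; a zero gap means the walk closed into a loop."""
    return tuple(last - first for last, first in zip(points[-1], points[0]))

# Three translations with a non-zero relation vector: the walk cannot return to its start.
open_walk = walk_translations((1.0, 2.0), (0.5, -0.25), 3)

# Three translations with the zero vector: the loop closes, but every node is the same point.
collapsed_walk = walk_translations((1.0, 2.0), (0.0, 0.0), 3)

print("gap with non-zero r:", closure_gap(open_walk))
print("distinct points with zero r:", len(set(collapsed_walk)))
```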
Subsumption Of Existing Models
Here, we prove that FieldE subsumes popular KGE models and consequently inherits their learning power.
Definition 4.1 (from (Kazemi and Poole, 2018)).
A model $M_1$ subsumes a model $M_2$ when any scoring over triples of a KG measured by model $M_2$ can also be obtained by model $M_1$.
Proposition 4.2.
DFieldE subsumes TransE and RotatE. SFieldE subsumes ComplEx and QuatE.
Because FieldE subsumes existing models, it inherits their advantages in learning various patterns, including symmetry, antisymmetry, transitivity, inversion, and composition. Moreover, because ComplEx is fully expressive and is subsumed by NeuroSFieldE, we conclude that NeuroSFieldE is also fully expressive. Besides modeling common patterns, FieldE is capable of learning more complex patterns and motifs, such as different motifs on a single relation, e.g. a loop and a path with one relation.
Proof.
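Here we prove that DFieldE subsumes TransE. The FieldE assumption is
$\mathbf{t} = \mathbf{h} + f_r(\mathbf{h})$.
If we set $f_r(\mathbf{h}) = \mathbf{r}$ (a constant vector field), then we have $\mathbf{t} = \mathbf{h} + \mathbf{r}$, which is the assumption of the TransE model for triple learning. ∎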
Proof.
We now prove that DFieldE subsumes RotatE. The RotatE assumption is
$$\mathbf{h} \circ \mathbf{r} = \mathbf{t} \tag{20}$$
where entities and relations are complex vectors and the modulus of each dimension of the relation vector is 1, i.e. $|r_i| = 1$. In vector form, Equation 20 can be written as a real (rotation) matrix-vector multiplication as follows
$\mathbf{R}_r\mathbf{h}_v = \mathbf{t}_v$
where $\mathbf{R}_r$ is a rotation matrix and $\mathbf{h}_v, \mathbf{t}_v$ are the vector representations of the complex numbers (with two components, real and imaginary). Given the assumption of DFieldE, i.e. $\mathbf{t}_v = \mathbf{h}_v + f_r(\mathbf{h}_v)$, and setting $f_r(\mathbf{h}_v) = \mathbf{A}_r\mathbf{h}_v$ and $\mathbf{A}_r = \mathbf{R}_r - \mathbf{I}$, the assumption of RotatE is obtained. We conclude that the RotatE model is a special case of DFieldE. ∎
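The RotatE case can be sanity-checked numerically. The sketch below assumes the update takes the form "tail = head + field(head)" and uses a single complex dimension written as a real 2-vector; the angle, the head embedding, and the function names are arbitrary illustrative choices. It compares RotatE's complex rotation with the linear field (R - I)h, where R is the 2x2 rotation matrix of the same angle.

```python
import cmath
import math

def rotate_complex(h, theta):
    """RotatE in one dimension: multiply by a unit-modulus complex number."""
    return h * cmath.exp(1j * theta)

def linear_field(h_vec, theta):
    """Displacement (R - I) h for the 2x2 rotation matrix R with angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = h_vec
    return (c * x - s * y - x, s * x + c * y - y)

h = 0.3 - 0.8j
theta = 1.1

t_rotate = rotate_complex(h, theta)

# Field-style update: tail = head + field(head).
fx, fy = linear_field((h.real, h.imag), theta)
t_field = (h.real + fx, h.imag + fy)

print(t_rotate, t_field)
```

Both routes give the same tail embedding up to floating-point error, which is the content of the special-case argument for this one-dimensional instance.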
Proof.
Here we present the proof of subsumption of ComplEx model. The SFieldE uses the following score function
After setting we have
(21) 
Now let us focus on the score function of ComplEx, which is
$$f(h, r, t) = \mathrm{Re}(\langle \mathbf{h}, \mathbf{r}, \bar{\mathbf{t}} \rangle) \tag{22}$$
We represent the above equation in vectored version of complex numbers as following
(23) 
We can see if in Equation 21, we obtain the score of the ComplEx model in the vectorized in Equation 23. Therefore, ComplEx is also a special case of SFieldE.
∎
Proof.
Here we show that SFieldE subsumes QuatE. QuatE uses the following formulae for the score function
(24) 
where show the Hamilton product and elementwise product between two quaternion vectors. Similarly to RotatE, the Equation 24 can be written in matrix vector multiplication as follows
(25) 
where is a matrix and is a vectorized version of quaternion numbers.
Here, we show that the above equation can be constructed by the score function of SFieldE which is
After setting we have
(26) 
which will be same as the score function of QuatE in vectorized form if the vector Field is set to Therefore, SFieldE subsumes QuatE model, as well. ∎
5. Experiments and Results
In this section, we evaluate the performance of FieldE in comparison with existing state-of-the-art embedding models. Following a systematic analysis, we selected the following KGEs for comparison: TransE, RotatE, TuckER, ComplEx, QuatE, DistMult, ConvE, and MuRP.
5.1. Experimental Setup
Evaluation Metrics
We consider the standard metrics for the comparison of KGEs, namely Mean Reciprocal Rank (MRR) and Hits@n ($n = 1, 3, 10$). MRR is measured by $\frac{1}{n_t}\sum_{i=1}^{n_t}\frac{1}{r_i}$, where $r_i$ is the rank of the $i$-th test triple and $n_t$ is the number of triples in the test set. Hits@n is the proportion of test triples that are ranked within the top n, where n can be 1, 3, and 10.
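For concreteness, both metrics can be computed from a list of ranks as in the following sketch. The function names and the example ranks are hypothetical and are not results from the experiments.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1 / rank over all test triples (ranks start at 1)."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at(ranks, n):
    """Fraction of test triples whose rank is at most n."""
    return sum(1 for r in ranks if r <= n) / len(ranks)

# Hypothetical ranks of five test triples.
ranks = [1, 2, 4, 10, 50]

print("MRR:", mean_reciprocal_rank(ranks))
for n in (1, 3, 10):
    print(f"Hits@{n}:", hits_at(ranks, n))
```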
Datasets
We run the experiments on two standard datasets, namely FB15k-237 (Toutanova and Chen, 2015) and YAGO3-10 (Mahdisoltani et al., 2013). Statistics of these datasets, including the number of entities and relations as well as the split into train, validation, and test sets, are shown in Table 1.

Dataset    #Ent.  #Rel.  #Train  #Valid.  #Test
YAGO3-10   123k   37     1m      5k       5k
FB15k-237  15k    237    272k    20k      18k
Hyperparameter Search
We implemented our models in Python using the PyTorch library. We used Adam as the optimizer and tuned the hyperparameters on the validation set. The learning rate and batch size are each tuned over a set of candidate values. The embedding dimension is fixed to 100 for YAGO3-10 and 1000 for FB15k-237. We set the number of negative samples to 100 for FB15k-237 and 500 for YAGO3-10, and used adversarial negative sampling for our model as well as for the other models we re-implemented. We present two versions of FieldE, namely DFieldE and SFieldE. DFieldE uses a distance function to compute the score of a triple (see Equation 10), whereas SFieldE uses an inner product for score computation (see Equation 11). Each version of FieldE can either use a neural network to approximate the vector field or use an explicit linear function as the vector field. For the neural-network-based FieldE, we add the prefix "N" to the name of the model (either NDFieldE or NSFieldE). For the linear version of FieldE, we use "L" as the prefix (either LDFieldE or LSFieldE). For the neural version of FieldE, we used a neural network with two hidden layers with (500, 100) hidden nodes for YAGO3-10 and (100, 100) for FB15k-237. We fixed the parameter in Equations 10 and 11 to a constant value. The details of the optimal hyperparameters are reported in Table 2.

Dataset    dimension  learning rate  batch size  hidden nodes  activation function  neg. samples
YAGO3-10   100        0.002          512         (500,100)     tanh                 500
FB15k-237  1000       0.1            1024        (100,100)     tanh                 100
5.2. Results
The results are shown in Table 3, and the illustrations are depicted in Figure 3 and Figure 4. We first report the performance comparison of FieldE and the other models. On both datasets, FieldE matches or outperforms all the other models on all the metrics. On the FB15k-237 dataset, TuckER achieves the same MRR and Hits@3 as FieldE, while all the other models fall short of FieldE on every metric. FieldE also outperforms all the models in MRR on YAGO3-10.
          FB15k-237                       YAGO3-10
Model     MRR   Hits@1  Hits@3  Hits@10   MRR   Hits@1  Hits@3  Hits@10
TransE    0.33  0.23    0.37    0.53      0.49  0.39    0.56    0.67
RotatE    0.34  0.24    0.37    0.53      0.49  0.40    0.55    0.67
TuckER    0.36  0.26    0.39    0.54      -     -       -       -
ComplEx   0.32  -       -       0.51      0.36  0.26    0.40    0.55
QuatE     0.31  0.23    0.34    0.49      -     -       -       -
DistMult  0.24  0.15    0.26    0.42      0.34  0.24    0.28    0.54
ConvE     0.34  0.24    0.37    0.50      0.44  0.35    0.49    0.62
MuRP      0.34  0.25    0.37    0.52      0.35  0.25    0.40    0.57
FieldE    0.36  0.27    0.39    0.55      0.51  0.41    0.58    0.68
Following our motivating example, which focused on motifs created by the influences relation in the YAGO3-10 knowledge graph, we provide sample visualizations of the vector fields of this relation in Figure 3. These visualizations have been selected from among hundreds, only to give an impression of the capability of FieldE in motif learning. In order to provide a presentable illustration, we plot each vector field in a pair of dimensions. From the resulting pairs of embedding dimensions, we selected six plots.
In subfigure 2(a), the captured vector fields form both loop and path motifs simultaneously. This corresponds exactly to the case illustrated in Figure 2, where some people influence others in a loop structure and some people influence others in a path structure (without a return link). This shows full structure preservation from the graph representation to the vector representation. As discussed before, this capability also avoids wrong inferences. Subfigure 2(b) shows trajectories of some people being a source influencer for many others. Subfigure 2(c) is another occurrence of loop and path, with more density.
A set of vector fields with many loops is illustrated in subfigure 2(d). The interpretation of this vector field is that there are several groups of people influencing each other in loop structures with different numbers of entities. Subfigure 2(e) shows a set of sink nodes, which have been influenced by many others. Finally, subfigure 2(f) shows some denser source entities. Overall, these illustrations further support the capability of FieldE, which is inherited from ODEs and facilitated by the concepts of vector fields and trajectories.
The visualizations in Figure 4 represent subgraphs with different motifs, including path and loop, in different relations. Each row corresponds to one relation, for which three different learned structures are shown. For example, subfigures 3(a), 3(b), and 3(c) correspond to the “isconnectedto” relation, which shows which airports are connected to each other in different structures (loop and path). Our visualizations capture different learned motifs, including path and loop, which show that some airports are connected in a loop form and some are not. In subfigures 3(d), 3(e), and 3(f), different motifs of the “hasGender” relation are captured. Similarly, for the “livesIn” relation, we show different illustrations of the vector fields in 3(j), 3(k), and 3(l). With all of these illustrations, we aim to clarify the effect of ODEs in learning vector fields, which avoids wrong inferences. All of these are trajectories lying on a relation-specific Riemannian manifold learned by the neural network of our model. The arrows show the direction of motif evolution in the vector space for each shape.
6. Conclusion
This work presented a novel embedding model, FieldE, which is designed based on Ordinary Differential Equations. Since it inherits the characteristics of ODEs, it is capable of encoding different semantic and structural complexities in knowledge graphs. We modeled relations as vector fields on a Riemannian manifold, and the entities of a knowledge graph that are connected through the considered relation are taken as points on trajectories laid on the manifold. We specifically parameterize the vector field by a neural network to learn the underlying geometry from the training graph. We examined FieldE on two benchmark datasets and compared it with a selection of state-of-the-art embedding models. FieldE matches or outperforms all of these models on all the metrics. We focused on showing that FieldE learns loop and path motifs simultaneously and preserves their structures from the graph representation to the vector representation. We showed that the neural network of FieldE learns various shapes of vector fields and consequently the underlying geometry. In future work, we plan to apply FieldE to real-world knowledge graphs besides YAGO and Freebase and to further explore the effect of ODEs on the learning process of knowledge graph embedding models.
References
 Balazevic et al. (2019) Ivana Balazevic, Carl Allen, and Timothy Hospedales. 2019. Multi-relational Poincaré graph embeddings. In Advances in Neural Information Processing Systems. 4465–4475.
 Bordes et al. (2013a) Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013a. Translating embeddings for modeling multi-relational data. (2013), 2787–2795.
 Bordes et al. (2013b) Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013b. Translating embeddings for modeling multi-relational data. In Advances in neural information processing systems. 2787–2795.
 Bose et al. (2020) Avishek Joey Bose, Ariella Smofsky, Renjie Liao, Prakash Panangaden, and William L Hamilton. 2020. Latent Variable Modelling with Hyperbolic Normalizing Flows. arXiv preprint arXiv:2002.06336 (2020).
 Chamberlain et al. (2017) Benjamin Paul Chamberlain, James Clough, and Marc Peter Deisenroth. 2017. Neural embeddings of graphs in hyperbolic space. arXiv preprint arXiv:1705.10359 (2017).
 Chen et al. (2018) Ricky TQ Chen, Yulia Rubanova, Jesse Bettencourt, and David K Duvenaud. 2018. Neural ordinary differential equations. In Advances in neural information processing systems. 6571–6583.
 Franciscus (2011) M.J. Franciscus. 2011. Riemannian Geometry. International Book Market Service Limited. https://books.google.de/books?id=n6U_lwEACAAJ
 Hornik (1991) Kurt Hornik. 1991. Approximation capabilities of multilayer feedforward networks. Neural networks 4, 2 (1991), 251–257.
 Hornik et al. (1989) Kurt Hornik, Maxwell Stinchcombe, Halbert White, et al. 1989. Multilayer feedforward networks are universal approximators. Neural networks 2, 5 (1989), 359–366.

 Ji et al. (2015) Guoliang Ji, Shizhu He, Liheng Xu, Kang Liu, and Jun Zhao. 2015. Knowledge graph embedding via dynamic mapping matrix. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 687–696.
 Ji et al. (2016) Guoliang Ji, Kang Liu, Shizhu He, and Jun Zhao. 2016. Knowledge graph completion with adaptive sparse transfer matrix. In Thirtieth AAAI conference on artificial intelligence.
 Kazemi and Poole (2018) Seyed Mehran Kazemi and David Poole. 2018. Simple embedding for link prediction in knowledge graphs. In Advances in neural information processing systems. 4284–4295.
 Lin et al. (2015) Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, and Xuan Zhu. 2015. Learning entity and relation embeddings for knowledge graph completion. In Twenty-Ninth AAAI conference on artificial intelligence.
 Mahdisoltani et al. (2013) Farzaneh Mahdisoltani, Joanna Biega, and Fabian M Suchanek. 2013. Yago3: A knowledge base from multilingual wikipedias.
 Mathieu and Nickel (2020) Emile Mathieu and Maximilian Nickel. 2020. Riemannian Continuous Normalizing Flows. arXiv preprint arXiv:2006.10605 (2020).
 Nayyeri et al. (2017) Mojtaba Nayyeri, Hadi Sadoghi Yazdi, Alaleh Maskooki, and Modjtaba Rouhani. 2017. Universal approximation by using the correntropy objective function. IEEE transactions on neural networks and learning systems 29, 9 (2017), 4515–4521.
 Nickel et al. (2011) Maximilian Nickel, Volker Tresp, and Hans-Peter Kriegel. 2011. A Three-Way Model for Collective Learning on Multi-Relational Data. 11 (2011), 809–816.
 Sun et al. (2019) Zhiqing Sun, Zhi-Hong Deng, Jian-Yun Nie, and Jian Tang. 2019. RotatE: Knowledge graph embedding by relational rotation in complex space. arXiv preprint arXiv:1902.10197 (2019).
 Suzuki et al. (2018) Atsushi Suzuki, Yosuke Enokida, and Kenji Yamanishi. 2018. Riemannian TransE: Multi-relational Graph Embedding in Non-Euclidean Space. (2018).
 Toutanova and Chen (2015) Kristina Toutanova and Danqi Chen. 2015. Observed versus latent features for knowledge base and text inference. In Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality. 57–66.

 Trouillon et al. (2016) Théo Trouillon, Johannes Welbl, Sebastian Riedel, Éric Gaussier, and Guillaume Bouchard. 2016. Complex embeddings for simple link prediction. In International Conference on Machine Learning. 2071–2080.
 Wang et al. (2017) Quan Wang, Zhendong Mao, Bin Wang, and Li Guo. 2017. Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering 29, 12 (2017), 2724–2743.

 Wang et al. (2014) Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. 2014. Knowledge graph embedding by translating on hyperplanes. In Twenty-Eighth AAAI conference on artificial intelligence.
 Yang et al. (2015) Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, and Li Deng. 2015. Embedding entities and relations for learning and inference in knowledge bases. In Conference on Learning Representations (ICLR).
 Zhang et al. (2019a) Shuai Zhang, Yi Tay, Lina Yao, and Qi Liu. 2019a. Quaternion Knowledge Graph Embedding. arXiv preprint arXiv:1904.10281 (2019).
 Zhang et al. (2019b) Shuai Zhang, Yi Tay, Lina Yao, and Qi Liu. 2019b. Quaternion knowledge graph embeddings. In Advances in Neural Information Processing Systems. 2731–2741.