1 Introduction
Knowledge Graphs (KGs), i.e., graph-structured knowledge bases, store information as entities and the relationships between them, often following some schema or ontology [3]. With the advent of Linked Open Data [4], DBpedia [2], and the Google Knowledge Graph (https://www.blog.google/products/search/introducing-knowledge-graph-things-not/), large-scale KGs have drawn a lot of attention and have become important data sources for many Artificial Intelligence (AI) and Machine Learning (ML) tasks
[13, 17]. As AI and ML algorithms work with propositional representations of data (i.e., feature vectors) [18], several adaptations of language modeling approaches such as word2vec [11, 12] and GloVe [15] have been proposed for generating graph embeddings for entities in a KG. As a first step, such approaches must acquire a "representative" neighborhood for each target entity in the KG. To accomplish this task, approaches based on biased random walks [5, 7] have been proposed. These approaches use weighting schemes to make certain edges or nodes more likely than others to be included in the extracted subgraphs. Weighting schemes based on metrics such as frequency or PageRank [5, 21] tend to favor inclusion of "popular" (or densely connected) nodes in the "representative" subgraphs. This can lead to inclusion of semantically less relevant nodes and edges in the "representative" subgraphs of target entities [20]. We assert that the "representative" neighborhoods for different types of entities (e.g., book, movie, athlete) in cross-domain KGs, such as DBpedia, may comprise distinct sets of characteristic relationships. Our objective is to automatically identify these relationships and use them to extract entity-specific representations. This is in contrast to the scenario where extracted representations are KG-specific because of the inclusion of popular nodes and edges, irrespective of their semantic relevance to the target entities. Additionally, we want to identify the most relevant neighborhood of a target entity without venturing into "unrelated" neighborhoods of nearby entities.
For example, when identifying the most relevant representation for a film, the director's name should be more likely to be included in the identified representation than their year or place of birth. To address this challenge, we propose specificity as an accurate measure for assigning weights to those semantic relationships which constitute the most intuitive and interpretable representations for a given set or type of entities. We provide a scalable method of computing specificity for semantic relationships of any depth in large-scale KGs. We show that specificity-based biased random walks extract more compact entity representations than the state-of-the-art. To further demonstrate the efficacy of our specificity-based approach, we train neural language models (Skip-gram [11]) for generating graph embeddings from the extracted entity-specific representations and use the generated embeddings for the information retrieval task of entity recommendation.
The rest of this paper is structured as follows. In Section 2, we provide a brief overview of related work. In Section 3, we provide the necessary background and then motivate and introduce the concept of specificity. In Section 4, we present a scalable method for computing specificity. In Section 5, we present results highlighting beneficial characteristics of specificity using the DBpedia dataset. In Section 6, we present the conclusion and possible directions for future work.
2 Related Work
Graph Embedding Techniques: Numerous techniques have been proposed for generating appropriate representations of KGs for AI and ML tasks. Graph kernel-based approaches simultaneously traverse the neighborhoods of a pair of entities in the graph to compute kernel functions based on metrics such as the number of common substructures (e.g., paths or trees) [10, 23] or graphlets [19]. Neural language models such as word2vec [11, 12] and GloVe [15], originally proposed for generating word embeddings, have been adapted for KGs. Deep Graph Kernel [25] identifies graph substructures (graphlets) and uses neural language models to compute a similarity matrix between the identified substructures. For large-scale KGs, embedding techniques based on random walks have been proposed. DeepWalk [16] learns graph embeddings for nodes in the graph using neural language models while generating truncated uniform random walks. node2vec [7] is a more generic approach than DeepWalk and uses second-order biased random walks for generating graph embeddings, preserving roles and community memberships of nodes. RDF2Vec [18], an extension of DeepWalk and Deep Graph Kernel, uses BFS-based random walks for extracting subgraphs from RDF graphs, which are converted into feature vectors using word2vec [11, 12]. RDF2Vec has been shown to outperform graph kernel-based approaches in terms of scalability and suitability for ML tasks on large-scale KGs, such as DBpedia. The problem with uniform (or unbiased) random walks is the lack of control over the explored neighborhood. To address this, approaches based on biased random walks [2, 5, 7] have been proposed, which use different weighting schemes for nodes and edges. The weights create the bias by making certain nodes or edges more likely to be visited during random walks. Biased RDF2Vec [5] uses frequency-, degree-, and PageRank-based metrics for weighting schemes.
Our proposed approach is closest to RDF2Vec [5, 18] in terms of extracting entity representations and using a neural language model for generating embeddings. The main difference is that we use our proposed metric of specificity as an edge- and path-weighting scheme for biased random walks that identify the most relevant substructures for extracting entity representations from KGs.
Semantic Similarity and Relatedness: Semantic similarity and relatedness between two entities have been relatively well explored [1, 6, 8, 14]. Searching for similar or related entities given a search entity is a common task in information retrieval. However, before developing such functionality, it is important to define the notion of entity similarity and the set of attributes that will be used for its computation. Semantic similarity and relatedness are often used interchangeably in the literature [1, 14], where similarity between two entities is computed based on common paths between them. This definition allows computation of similarity between any two entities, including entities of different types. For this paper, we assert that entities of different types carry different semantic meanings, whereas our objective is to automatically identify the semantic relationships that constitute the representative neighborhoods of entities of the same given type. Therefore, we limit the computation of similarity to be between two entities of the same type. PathSim [20] is one of the approaches proposed for searching for similar entities in heterogeneous information networks. This approach is based on user-defined meta-paths (i.e., sequences of relationships between entities) connecting entities of the same type. In contrast, our objective is to automatically identify the most relevant paths using specificity.
3 Specificity: An Intuitive Relevance Metric
In this section, we introduce and motivate the use of specificity as a novel metric for quantifying relevance.
3.1 Preliminaries
An RDF graph is represented by a knowledge base of triples [22]. A triple consists of three parts: a subject (s), a predicate (p), and an object (o).
Definition 1
RDF Graphs: Assuming that U is a set of Uniform Resource Identifiers (URIs), B a set of blank nodes, L a set of literals, P_o a set of object properties, and P_d a set of datatype properties, an RDF graph can be represented as a set of triples such that:

(s, p, o) ∈ (U ∪ B) × (P_o ∪ P_d) × (U ∪ B ∪ L)

In this paper, we simply represent an RDF graph as G = (V, E) such that V = U ∪ B ∪ L and E = P_o ∪ P_d, where E is a set of directed labeled edges.
Definition 2
Semantic Relationship: A semantic relationship in an RDF graph can be defined as e_s →r e_o, where r can be a single predicate or a path represented by successive predicates and intermediate nodes between e_s and e_o.
For this paper, we define a semantic relationship r of depth (or length) d as a template for a path (or a walk) in G that comprises d successive predicates p_1 → p_2 → … → p_d. Thus, r represents all paths (or walks) between any two entities e_s and e_o that traverse through d − 1 intermediate nodes using the same d successive predicates that constitute r.
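As a concrete illustration, a semantic relationship template can be matched against walks in an edge-labeled graph by filtering, at each step, on the next predicate of the template. The sketch below uses a toy graph with illustrative names; the helper `walks_matching` is our own, not part of the paper's implementation:

```python
from collections import defaultdict

# Toy edge-labeled directed graph: node -> list of (predicate, node).
graph = defaultdict(list)
for s, p, o in [
    ("Film_A", "director", "Person_X"),
    ("Person_X", "knownFor", "Genre_G"),
    ("Person_X", "birthPlace", "City_C"),
]:
    graph[s].append((p, o))

def walks_matching(graph, start, relationship):
    """Enumerate all walks from `start` whose successive predicates
    equal the template `relationship` (a tuple of predicate labels)."""
    frontier = [(start,)]
    for pred in relationship:
        frontier = [walk + (node,)
                    for walk in frontier
                    for p, node in graph[walk[-1]] if p == pred]
    return frontier

print(walks_matching(graph, "Film_A", ("director", "knownFor")))
# → [('Film_A', 'Person_X', 'Genre_G')]
```

Here the template (director, knownFor) of depth 2 matches exactly one walk, passing through one intermediate node.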
Definition 3
RDF Graph Walks: Given a graph G, a single graph walk of depth d starting from a node v_0 comprises a sequence of d edges (predicates) and d nodes (excluding v_0): v_0 →p_1 v_1 →p_2 v_2 →p_3 … →p_d v_d.
Random graph walks provide a scalable method of extracting entity representations from large-scale KGs [7]. Starting from a node v_0, in the first iteration, a set of randomly selected outgoing edges is explored to get a set of nodes at depth 1. In the second iteration, from every node at depth 1, outgoing edges are randomly selected to explore the next set of nodes at depth 2. This is repeated until a set of nodes at depth d is explored. The generated random walks are the union of the triples explored during each of the d iterations. This simple scheme of random walks resembles a randomized breadth-first search. In the literature, both breadth-first and depth-first search strategies, as well as an interpolation between the two, have been proposed for extracting entity representations from large-scale KGs [7, 16, 18].
3.2 Specificity
Consider the following example:
Example 1
Starting from the entity Batman (1989) in DBpedia, a random walk explores the following semantic relationships (descriptive names are used for brevity instead of actual URIs):
Batman (1989) → director → Tim Burton → knownFor → Gothic Films
Batman (1989) → director → Tim Burton → subject → 1958 births
Batman (1989) → director → Tim Burton → birthPlace → Burbank, CA
Our intuition suggests that the style of a director (represented by Gothic Films) is more relevant to a film than their year and place of birth. Frequency-, degree-, or PageRank-based metrics of assigning relevance may assign higher scores to nodes representing broader categories or locations. For example, the (non-normalized) PageRank scores computed for the DBpedia entities Gothic Films, 1958 births, and Burbank, CA are 0.586402, 161.258, and 57.1176, respectively (http://people.aifb.kit.edu/ath/#DBpedia_PageRank). PageRank-based biased random walks may include these popular nodes and exclude intuitively more relevant information related to the target entity. Our objective is to develop a metric that assigns higher scores to relevant nodes and edges in such a way that the node Gothic Films becomes more likely to be captured, for Batman (1989), than 1958 births and Burbank, CA. This way, the proposed metric will capture our intuition behind identifying more relevant information in terms of its specificity to the target entity. To quantify this relevance based on specificity, we determine whether Gothic Films represents information that is "specific" to Batman (1989). We trace all paths of depth d (here, d = 2) reaching Gothic Films and compute the ratio of the number of those paths that originate from Batman (1989) to the number of all traced paths. This gives the specificity of Gothic Films to Batman (1989) as a score between 0.0 and 1.0. A specificity score of 1.0 means that all paths of depth d reaching Gothic Films have originated from Batman (1989). For a given depth d, this node-to-node specificity of a node n to a node v, such that v →π n, where π is any arbitrary path of depth d, can be defined as:
specificity_d(n, v) = |{π : v →π n, depth(π) = d}| / |{π : x →π n, x ∈ V, depth(π) = d}|    (1)
For the objective of using specificity for extracting relevant subgraphs, instead of defining specificity as a metric of relevance between each pair of entities (or nodes), we make two simplifying assumptions. First, we assert that each class or type of entities (e.g., movies, books, athletes, politicians) has a distinct set of characteristic semantic relationships. This enables us to compute specificity as a metric of relevance of a node (Gothic Films) to a class or type of entities (Film), instead of each individual instance of that class (e.g., Batman (1989)). Second, we measure the specificity of a semantic relationship (director, knownFor), instead of an entity (Gothic Films), to the class of target entities. Here, we make the assumption that if the majority of the entities (nodes) reachable via a given semantic relationship represent entity-specific information, then that semantic relationship is highly specific to the given class of target entities. From our example, this means that instead of measuring the specificity of Gothic Films to Batman (1989), we measure the specificity of the semantic relationship director → knownFor to the class or entity type Film. Based on these assumptions, we can define specificity as:
Definition 4
Specificity: Given an RDF graph G, a semantic relationship r of depth d, and a set V_t of all entities of type t, let N be the set of all nodes reachable from the entities in V_t via r. We define the specificity of r to t as
specificity_d(r, t) = |{π : x →π n, n ∈ N, x ∈ V_t, depth(π) = d}| / |{π : x →π n, n ∈ N, x ∈ V, depth(π) = d}|    (2)
Here, π represents any arbitrary semantic relationship of length d, and the numerator counts only those paths that originate from entities in V_t. All entities in V_t have a common associated type t, which means that t is implied by V_t. Therefore, henceforth, we use the term specificity_d(r) instead of specificity_d(r, t) for denoting specificity.
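On a small graph, Definition 4 can be evaluated exhaustively by enumerating every path of a given depth. The sketch below uses toy triples and a toy type map (all names are ours); it computes, among all depth-d paths ending in N, the fraction that originate from entities of the target type:

```python
from collections import defaultdict

# Toy RDF-style triples; entity types kept in a separate map.
triples = [
    ("Film_A", "director", "Person_X"),
    ("Film_B", "director", "Person_X"),
    ("Person_X", "knownFor", "Genre_G"),
    ("Person_X", "birthPlace", "City_C"),
    ("Org_Z", "founder", "Person_Y"),
    ("Person_Y", "birthPlace", "City_C"),
]
types = {"Film_A": "Film", "Film_B": "Film", "Person_X": "Person",
         "Person_Y": "Person", "Org_Z": "Organization",
         "Genre_G": "Genre", "City_C": "City"}

adj = defaultdict(list)
for s, p, o in triples:
    adj[s].append((p, o))

def paths_of_depth(adj, d):
    """All walks of depth d as (node_sequence, predicate_sequence)."""
    paths = [((n,), ()) for n in list(adj)]
    for _ in range(d):
        paths = [(nodes + (o,), preds + (p,))
                 for nodes, preds in paths
                 for p, o in adj[nodes[-1]]]
    return paths

def specificity(adj, types, rel, t):
    """Exhaustive evaluation of Equation 2 on a small graph."""
    d = len(rel)
    v_t = {n for n, ty in types.items() if ty == t}
    walks = paths_of_depth(adj, d)
    # N: nodes reachable from entities of type t via the template rel
    reachable = {nodes[-1] for nodes, preds in walks
                 if nodes[0] in v_t and preds == rel}
    ending = [nodes for nodes, _ in walks if nodes[-1] in reachable]
    from_t = [nodes for nodes in ending if nodes[0] in v_t]
    return len(from_t) / len(ending)

print(specificity(adj, types, ("director", "knownFor"), "Film"))    # → 1.0
print(specificity(adj, types, ("director", "birthPlace"), "Film"))  # ≈ 0.667
```

In this toy graph, (director, knownFor) is fully specific to Film, while (director, birthPlace) is diluted because one of the three depth-2 paths into City_C starts at a non-film entity.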
4 Bidirectional Random Walks for Computing Specificity
Computing Equation 2 directly requires accessing large parts of the knowledge graph. In this section, we present an approach that uses bidirectional random walks to compute specificity. To understand the approach, consider an entity type t and a semantic relationship r, for which we want to compute specificity_d(r). We start with a seed set containing a small number of randomly selected nodes of type t. From the nodes in the seed set, forward random walks via r are performed to collect a set N of nodes (ignoring intermediate nodes, for d > 1). From the nodes in N, reverse random walks in G (equivalently, forward random walks in the graph with all edges inverted) are performed using arbitrary paths of length d to determine the probability of reaching any node of type t. Specificity is computed as the number of times a reverse walk lands on a node of type t divided by the total number of walks. This idea is the basis for the algorithm presented next which, for a given type t, builds a list of the most relevant semantic relationships up to depth d, sorted by their specificity to t.
4.1 Algorithm
The function rankBySpecificity in Algorithm 1 performs initialization of variables and builds the set of candidate semantic relationships for which specificity is to be computed. Two variables hold the set of semantic relationships, unsorted and sorted by specificity, respectively; the sorted results are kept in an array with one entry per depth up to the maximum depth d. Two parameters specify the size of the seed set and the number of bidirectional walks performed for computing specificity for each candidate semantic relationship. A seed set of randomly selected nodes of type t is generated in line 3. For each iteration i (1 ≤ i ≤ d), a set of candidate semantic relationships of depth i is selected in line 5. The function computeSpecificity, in line 6, computes specificity for each semantic relationship in the candidate set and returns the results. Each element of the result is an array of dictionaries; each dictionary contains (r, s) pairs sorted by s, where r is a semantic relationship and s is its specificity. For each iteration of the for loop in Algorithm 1, the candidate set can be populated from scratch with semantic relationships of depth i by random sampling of outgoing paths from the seed set. Alternatively, for iterations i > 1, it can be populated by expanding the most specific semantic relationships of depth i − 1. Our implementation uses the latter option for populating the candidate set in each iteration.
Algorithm 2 shows the function computeSpecificity, which computes specificity for the given set of candidate semantic relationships (built in Algorithm 1). In lines 6 and 7, for each semantic relationship r, a seed node v of type t is randomly selected and a node n reachable from v via r is obtained (forward walk). In line 8, starting from n and selecting an arbitrary path of depth d, a node x is reached (a reverse walk in G, or equivalently a forward walk in the graph with all edges inverted). If t is one of the types associated with x, the variable count is incremented in line 10. This process is repeated once per bidirectional walk for each r. At line 13, specificity is computed as count divided by the number of walks. Lines 4–13 are repeated until specificity has been computed for every candidate r. The return variable contains the semantic relationships and their specificity scores as (r, s) pairs.
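A minimal Monte-Carlo sketch of the bidirectional procedure follows; the function and parameter names (`estimate_specificity`, `n_walks`) and the toy data are ours, not the paper's implementation. Forward walks follow the relationship template from random seeds of type t; reverse walks follow arbitrary incoming edges back the same number of steps:

```python
import random
from collections import defaultdict

# Toy triples and type map, illustrative names only.
triples = [
    ("Film_A", "director", "Person_X"),
    ("Film_B", "director", "Person_X"),
    ("Person_X", "knownFor", "Genre_G"),
    ("Person_X", "birthPlace", "City_C"),
    ("Org_Z", "founder", "Person_Y"),
    ("Person_Y", "birthPlace", "City_C"),
]
types = {"Film_A": "Film", "Film_B": "Film", "Person_X": "Person",
         "Person_Y": "Person", "Org_Z": "Organization",
         "Genre_G": "Genre", "City_C": "City"}

def estimate_specificity(triples, types, rel, t, n_walks=2000, seed=0):
    """Estimate specificity of relationship template `rel` to type `t`
    by n_walks bidirectional random walks."""
    rng = random.Random(seed)
    fwd, rev = defaultdict(list), defaultdict(list)
    for s, p, o in triples:
        fwd[s].append((p, o))
        rev[o].append(s)
    seeds = [n for n, ty in types.items() if ty == t]
    completed = hits = 0
    for _ in range(n_walks):
        node = rng.choice(seeds)
        for pred in rel:                 # forward walk via the template
            nxt = [o for p, o in fwd[node] if p == pred]
            if not nxt:
                node = None
                break
            node = rng.choice(nxt)
        if node is None:
            continue
        for _ in range(len(rel)):        # reverse walk via an arbitrary path
            back = rev[node]
            if not back:
                node = None
                break
            node = rng.choice(back)
        if node is None:
            continue
        completed += 1
        hits += types.get(node) == t     # did the reverse walk land on type t?
    return hits / completed
```

With this toy data, the fully specific relationship (director, knownFor) estimates to exactly 1.0, while (director, birthPlace) lands on a non-film origin roughly half the time; unlike the exhaustive count, the walk-based estimate weights paths by branching probabilities, which is the trade-off that makes it scalable.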
4.2 Biased RDF Graph Walks with Pruning
Once the list of the most relevant semantic relationships based on specificity is generated, we use it to create specificity-based biased random walks for extracting representative subgraphs of target entities. To further improve subgraph extraction, we outline some pruning schemes for the biased random walks.
4.2.1 Non-repeating Starting Entity (NRSE)
We start with the scheme that has the least restrictive criteria for inclusion of entities in a graph walk of depth d. Starting from v_0 (Definition 3), we generate random graph walks of depth d. If v_0 is observed again at any depth d' such that 0 < d' ≤ d, the graph walk is discarded.
4.2.2 Unique Entities (UE)
This scheme is more restrictive than NRSE and does not allow repetition of any node in a single graph walk, i.e., all nodes traversed in a single graph walk must be unique, resulting in a path.
4.2.3 Non-repeating Starting Entity Type (NRST)
Assume that e_1 and e_2 are two entities of the same type in an RDF graph, connected by a path of depth d' < d. A graph walk of depth d starting from e_1 may then also traverse attributes specific to e_2. To avoid this, this pruning strategy discards graph walks in which an entity of the same type as the starting entity is encountered at any depth d' ≤ d.
4.2.4 Unique Entity Types (UET)
In this scheme, no two entities with the same type may appear in a single graph walk, making it the most restrictive of the four pruning schemes.
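The four schemes reduce to simple predicates over a completed walk. A hedged sketch follows; the `keep_walk` helper and the toy type map are ours, and each entity is assumed to have a single type for simplicity:

```python
def keep_walk(walk, types, scheme):
    """Return True if `walk` (a tuple of nodes, starting entity first)
    survives the given pruning scheme. `types` maps entity -> type."""
    start, rest = walk[0], walk[1:]
    if scheme == "NRSE":   # starting entity must not reappear
        return start not in rest
    if scheme == "UE":     # all nodes unique, i.e., the walk is a path
        return len(set(walk)) == len(walk)
    if scheme == "NRST":   # no later entity shares the starting entity's type
        return all(types.get(n) != types.get(start) for n in rest)
    if scheme == "UET":    # no two entities in the walk share a type
        labels = [types.get(n) for n in walk]
        return len(set(labels)) == len(labels)
    raise ValueError(scheme)

# Example: a depth-2 walk that loops back into another film.
types = {"Film_A": "Film", "Film_B": "Film", "Person_X": "Person"}
walk = ("Film_A", "Person_X", "Film_B")
print([s for s in ("NRSE", "UE", "NRST", "UET") if keep_walk(walk, types, s)])
# → ['NRSE', 'UE']
```

The example shows the ordering of restrictiveness: the walk survives NRSE and UE but is discarded by the type-based schemes, since Film_B shares the starting entity's type.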
5 Evaluation
We evaluate our approach on four different criteria. First, we analyze the behavior of specificity computed for the most relevant semantic relationships up to depth 4. Second, we study the sensitivity of specificity computations to the size of the seed set and the number of bidirectional walks (from Algorithm 1). Third, we evaluate the computation time and the size of the subgraphs extracted by our proposed specificity-based biased random walk schemes against the baselines. Fourth, we evaluate our approach on the task of entity recommendation.
5.1 Datasets
We evaluate our approach on DBpedia, one of the largest publicly available RDF repositories [9]. We use the English version of the DBpedia dataset from 2016-04 (http://wiki.dbpedia.org/dbpedia-version-2016-04). We create graph embeddings for 3,000 entities of type http://dbpedia.org/ontology/Film (dbo:Film for short). Results based on dbo:Film and more entity types can be found at https://github.com/mrizwansaeed/Specificity.
5.2 Experimental Setup and Results
The first step is to compute specificity to find the set of the most relevant semantic relationships with respect to entities of type dbo:Film. The two main parameters used in the algorithm for computing specificity are the size of the seed set and the number of bidirectional walks. We set these two parameters to 300 and 2000, respectively. These values come from experiments in which we chose different values of the parameters to study their effect on specificity computation (Section 5.2.2). We randomly sample semantic relationships originating from entities in the seed set and, for each depth, select the top-occurring semantic relationships based on frequency. For depths greater than 1, we use the most specific semantic relationships from the previous depth to build the list of candidate semantic relationships of the next depth (this is how the candidate set is populated in Algorithm 1). We only consider as relevant those semantic relationships with specificity greater than 50%, which are then used for creating biased random walks for subgraph extraction. Data and models used to generate the results are available in the repository linked above.
5.2.1 Specificity as a Metric for Measuring Relevance
Figure 1 shows the top 25 semantic relationships sorted by their frequency and specificity for depths up to 4. Figure 1(a), resembling power-law curves, shows long-tailed behavior for the frequency of the top-occurring semantic relationships. As depth increases, frequency exhibits a flattened behavior due to the rapid increase in the number of possible semantic relationships at each depth. The specificity of the top 25 semantic relationships is shown in Figure 1(b), with the specificity threshold drawn at 50%. The specificity of a semantic relationship of length d, here, is the probability of reaching any node of type dbo:Film from a set of nodes (N in Definition 4) by reverse walks in G (or forward walks in the graph with inverted edges) using any arbitrary path of length d. As d increases, the number of such arbitrary paths grows, so the specificity of semantic relationships of length d decreases. This can be seen in Figure 1(b) where, for every depth between 1 and 4, specificity shows a more distinctive diminishing behavior than frequency. This behavior is analogous to using a decaying factor, usually a function of depth [14], to assign low relevance to nodes farther away from target nodes. It can also be seen in Figure 1(b) that there are multiple instances of semantic relationships of length d that have higher specificity than semantic relationships of lengths less than d. This indicates that specificity-based biased random walks allow both shallow (breadth-first) and deep (depth-first) exploration of the relevant neighborhoods around target entities. The ability to incorporate decaying behavior and to interpolate between shallow and deep exploration allows specificity to quantify the relevance of semantic relationships of different lengths at a more fine-grained level.
Table 1: Relevance of the semantic relationships from Example 1 under different metrics.

Semantic Relationship           Specificity   PageRank [21]   Frequency
dbo:director, dbo:knownFor            59.14            6.20         345
dbo:director, dct:subject              1.05          823.53       73752
dbo:director, dbo:birthPlace           0.03          200.33        7087
Table 1 shows the computed relevance of the three semantic relationships from Example 1 (Section 3.2) based on their specificity, PageRank, and frequency. The PageRank values in the third column are the averages of the non-normalized PageRank scores [21] of the top 25 nodes linked to entities of type dbo:Film by the corresponding semantic relationship. The frequency values, in the last column, represent the number of occurrences of the corresponding semantic relationship in the DBpedia dataset. We argued that the semantic relationship dbo:director → dbo:knownFor is more relevant to a film than the other two. It can be seen from Table 1 that the proposed specificity-based relevance metric is closer to our intuition than the other metrics.
5.2.2 Sensitivity of Specificity to the Number of Walks and the Seed-Set Size
The algorithm for computing specificity uses bidirectional random walks, governed by two parameters: the number of bidirectional walks and the size of the seed set. To understand the effect of these parameters on specificity, we make the assumption that larger values of both parameters lead to better approximations of specificity because more nodes are included in the computations. We first compute specificity for a range of values of both parameters and then take the computations performed using the largest values as ground truth, instead of a manually generated list of semantic relationships. To compare the different lists of semantic relationships sorted by specificity, we use NDCG (Normalized Discounted Cumulative Gain) [24].
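NDCG can be computed directly from the graded relevances of a ranked list; a minimal sketch follows, where the example relevance grades are hypothetical:

```python
from math import log2

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of graded relevances."""
    return sum(rel / log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(ranked_relevances):
    """NDCG: DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = sorted(ranked_relevances, reverse=True)
    return dcg(ranked_relevances) / dcg(ideal)

# A near-ideal ordering of graded relevances scores close to 1.0.
print(round(ndcg([3, 2, 3, 0, 1]), 3))   # → 0.972
```

In our setting, a ranking produced with smaller parameter values is scored against the ordering obtained with the largest values, so NDCG close to 1.0 means the cheaper computation recovers nearly the same ordering of semantic relationships.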
Figure 2(a) shows the trend of NDCG values as the number of walks varies between 100 and 5000. For depth 1, steady NDCG values indicate that a few hundred bidirectional walks are sufficient for finding the most relevant semantic relationships. For depths ≥ 2, there is an increasing but fluctuating trend of NDCG for larger numbers of walks, which becomes more pronounced for depth 3. We believe that the fluctuations come from the fact that, as we go further away from the seed set during forward walks (Algorithm 2), the number of possible paths for reverse walks increases substantially. Addressing this requires a larger number of walks, which allows the algorithm to traverse a sufficiently large number of these reverse paths and provide a stable estimate of specificity. However, before selecting such a high value, it should be noted that the execution time of Algorithm 2 is directly affected by this parameter. Figure 2(b) shows the computation times for specificity for different numbers of walks and depths. For the number of walks used in our experiments, it takes approximately 200 seconds to find semantic relationships of high specificity for depths 1 and 2. For depth 4 and the largest number of walks, the computation of specificity takes approximately 1200 seconds. However, this computation only needs to be performed once for each type of target entities in a KG.
Figure 3 shows the effect of changing the size of the seed set on NDCG. It can be observed that for seed sets of 300 or more nodes, further increasing the size does not have a significant effect on the order of sorted semantic relationships. This was expected, since the running time of Algorithm 2 depends only on the number of candidate semantic relationships (which controls the outer loop, lines 3–14) and the number of walks (which controls the inner loop, lines 5–12).
5.2.3 Comparison of Size and Computation Time for Extracted Subgraphs
We have chosen RDF2Vec [18] as one of our baselines. The other baseline is PageRank-based biased random walks (we use the DBpedia PageRank [21] dataset from http://people.aifb.kit.edu/ath/#DBpedia_PageRank). We also evaluate specificity-based biased random walks without pruning and with each of the four pruning schemes from Section 4.2 (NRSE, UE, NRST, UET). For the RDF dataset, we generate 500 walks (based on [18]) with depths 1, 2, and 3.
Figure 4(a) shows that biased random walks are able to extract the representative entity subgraphs with fewer walks. This is mainly because biased random walks extract subgraphs using a specific extraction template based on semantic relationships with higher specificity. This enables the collection of fewer but more relevant nodes and edges than unbiased random walks. For depth 1, the PageRank-based scheme generates the fewest walks. The DBpedia PageRank dataset [21] only provides PageRank values for entities with URIs. Therefore, this scheme explores the KG only through object properties, excluding datatype properties and literals, resulting in fewer walks. All biased walks also take less time for subgraph extraction, except for the type-based pruning schemes (NRST and UET), because of the extensive checks needed to perform these walks (Section 4.2). The most restrictive scheme, UET, did not generate any walks for depth 3.
5.2.4 Suitability for the Entity Recommendation Task
We have shown that specificity-based biased random walks extract more compact entity representations than unbiased random walks. However, to show that the compactness of the extracted subgraphs is not a disadvantage, we use the graph embeddings generated from the extracted substructures for the task of entity recommendation. Using the extracted subgraphs (represented as sequences of labels), we train Skip-gram models using the following parameters: dimensions of generated vectors = 500, window size = 10, negative samples = 25, iterations = 5, for each scheme and depth. All models for a given depth d are trained using sequences generated for both depth 1 and depth d (the trained models are available at https://github.com/mrizwansaeed/Specificity). The parameters for this experiment are based on RDF2Vec [18].
Figure 5 shows results for six different movies selected for the entity recommendation task. The search key for each recommendation task is provided as the caption of the corresponding plot. We have selected movies that are part of movie franchises or series. The ground truth then simply consists of the other movies in their respective franchises or series. For evaluation, we retrieve the top-k similar movies using the trained word2vec models.
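Top-k retrieval over the trained embeddings amounts to ranking entities by cosine similarity to the search key's vector. A sketch follows, with tiny hypothetical vectors standing in for the trained 500-dimensional ones:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def top_k(embeddings, key, k):
    """Rank all other entities by cosine similarity to `key`'s vector."""
    q = embeddings[key]
    ranked = sorted((e for e in embeddings if e != key),
                    key=lambda e: cosine(embeddings[e], q), reverse=True)
    return ranked[:k]

# Hypothetical 3-d vectors; entity names are illustrative.
emb = {
    "Batman (1989)":  [0.9, 0.1, 0.0],
    "Batman Returns": [0.8, 0.2, 0.1],
    "Some Biography": [0.0, 0.9, 0.4],
}
print(top_k(emb, "Batman (1989)", 1))   # → ['Batman Returns']
```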
k is chosen to be the total number of movies in the ground truth excluding the search key, which makes precision and recall the same. Note that The Lord of the Rings: The Fellowship of the Ring (2001) has two sequels but the value of k given in the caption is 3. This is because of an entity of type dbo:Film in the DBpedia dataset that corresponds to the Wikipedia page about the entire film series (https://en.wikipedia.org/wiki/The_Lord_of_the_Rings_(film_series)). The results show that the specificity-based schemes perform better than the baselines in the majority of cases. Even where our proposed strategies have comparable results (Figure 5(a)), it must be noted that this accuracy was achieved using more compact entity representations than the baselines (Figure 4(a)).
6 Conclusion
Graph embedding is an effective method of preparing KGs for AI and ML techniques. However, to generate appropriate representations, it is imperative to identify the most relevant representative subgraphs for target entities. In this paper, we presented specificity as a useful metric for finding the most relevant semantic relationships for target entities of a given type. Our bidirectional random-walk-based approach for computing specificity is suitable for large-scale KGs of any structure and size. We have shown, through experimental evaluation, that the metric of specificity incorporates a fine-grained decaying behavior for semantic relationships. It has the inherent ability to interpolate between the extreme exploration strategies, BFS and DFS. We used specificity-based biased random walks to extract compact representations of target entities for generating graph embeddings. These representations perform better than baseline approaches when used for our selected task of entity recommendation. For future work, we will study other tasks in which specificity-based graph embeddings can be used, such as link prediction and classification in KGs.
Acknowledgment
This work is supported by Chevron Corp. under the joint project, Center for Interactive Smart Oilfield Technologies (CiSoft), at the University of Southern California.
References
 [1] Aggarwal, N., Asooja, K., Ziad, H., Buitelaar, P.: Who are the american vegans related to brad pitt?: Exploring related entities. In: Proceedings of the 24th International Conference on World Wide Web Companion, WWW 2015
 [2] Atzori, M., Dessi, A.: Ranking dbpedia properties. In: 2014 IEEE 23rd International WETICE Conference, WETICE 2014
 [3] Berners-Lee, T., Hendler, J., Lassila, O.: The semantic web. Scientific American 284(5), 34–43 (2001)
 [4] Bizer, C., Heath, T., Berners-Lee, T.: Linked data - the story so far. International Journal on Semantic Web and Information Systems 5(3), 1–22 (2009)
 [5] Cochez, M., Ristoski, P., Ponzetto, S.P., Paulheim, H.: Biased graph walks for RDF graph embeddings. In: Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics, WIMS 2017
 [6] Gabrilovich, E., Markovitch, S.: Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In: IJCAI 2007, Proceedings of the 20th International Joint Conference on Artificial Intelligence, 2007
 [7] Grover, A., Leskovec, J.: node2vec: Scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2016
 [8] Leal, J.P., Rodrigues, V., Queirós, R.: Computing semantic relatedness using dbpedia. In: 1st Symposium on Languages, Applications and Technologies, SLATE 2012
 [9] Lehmann, J., Isele, R., Jakob, M., Jentzsch, A., Kontokostas, D., Mendes, P.N., Hellmann, S., Morsey, M., van Kleef, P., Auer, S., Bizer, C.: DBpedia - a large-scale, multilingual knowledge base extracted from Wikipedia. Semantic Web 6(2), 167–195 (2015)
 [10] Lösch, U., Bloehdorn, S., Rettinger, A.: Graph kernels for RDF data. In: The Semantic Web: Research and Applications  9th Extended Semantic Web Conference, ESWC 2012
 [11] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. CoRR abs/1301.3781 (2013)
 [12] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: 27th Annual Conference on Neural Information Processing Systems, NIPS 2013
 [13] Nickel, M., Murphy, K., Tresp, V., Gabrilovich, E.: A review of relational machine learning for knowledge graphs. Proceedings of the IEEE 104(1), 11–33 (2016)
 [14] Paul, C., Rettinger, A., Mogadala, A., Knoblock, C.A., Szekely, P.A.: Efficient graph-based document similarity. In: The Semantic Web. Latest Advances and New Domains - 13th International Conference, ESWC 2016
 [15] Pennington, J., Socher, R., Manning, C.D.: GloVe: Global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014
 [16] Perozzi, B., Al-Rfou, R., Skiena, S.: Deepwalk: Online learning of social representations. In: The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2014
 [17] Rettinger, A., Lösch, U., Tresp, V., d’Amato, C., Fanizzi, N.: Mining the semantic web. Data Mining and Knowledge Discovery 24(3), 613–662 (2012)
 [18] Ristoski, P., Paulheim, H.: Rdf2vec: RDF graph embeddings for data mining. In: The Semantic Web  ISWC 2016
 [19] Shervashidze, N., Vishwanathan, S.V.N., Petri, T., Mehlhorn, K., Borgwardt, K.M.: Efficient graphlet kernels for large graph comparison. In: Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, AISTATS 2009
 [20] Sun, Y., Han, J., Yan, X., Yu, P.S., Wu, T.: PathSim: Meta path-based top-k similarity search in heterogeneous information networks. PVLDB 4(11), 992–1003 (2011)
 [21] Thalhammer, A., Rettinger, A.: PageRank on Wikipedia: Towards General Importance Scores for Entities. In: The Semantic Web: ESWC 2016 Satellite Events, 2016
 [22] Tzitzikas, Y., Lantzaki, C., Zeginis, D.: Blank node matching and RDF/S comparison functions. In: The Semantic Web  ISWC 2012
 [23] de Vries, G.K.D., de Rooij, S.: Substructure counting graph kernels for machine learning from RDF data. J. Web Sem. 35, 71–84 (2015)
 [24] Wang, Y., Wang, L., Li, Y., He, D., Liu, T.: A theoretical analysis of NDCG type ranking measures. In: COLT 2013  The 26th Annual Conference on Learning Theory, 2013
 [25] Yanardag, P., Vishwanathan, S.V.N.: Deep graph kernels. In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2015