New Datasets and a Benchmark of Document Network Embedding Methods for Scientific Expert Finding

04/07/2020 · by Robin Brochier, et al.

The scientific literature is growing faster than ever. Finding an expert in a particular scientific domain has never been harder, because of the increasing volume of publications and the ever-growing diversity of expertise fields. To tackle this challenge, automatic expert finding algorithms rely on the vast heterogeneous network of scientific documents to match textual queries with potential expert candidates. In this direction, document network embedding methods seem to be an ideal choice for building representations of the scientific literature. Citation and authorship links contain major complementary information to the textual content of the publications. In this paper, we propose a benchmark for expert finding in document networks by leveraging data extracted from a scientific citation network and from three scientific question & answer websites. We compare the performance of several algorithms on these different sources of data and further study the applicability of embedding methods to an expert finding task.


1 Introduction

Many tools offer to search and filter the vast data sources available on the Web. In particular, there is a multitude of platforms aimed at the scientific community. From simple search engines for publications to social networks for researchers, all consume and produce valuable data for searching scientific content of interest. Expert finding is one of the most challenging problems, with applications in both academia and industry. To tackle this challenge, recent advances in document network embedding (DNE) have the potential to inspire new unsupervised models that can deal with the heterogeneous network of documents of the scientific literature. However, the design of such efficient algorithms heavily depends on the development of strong evaluation frameworks.

In this paper, we propose a methodology and provide 4 datasets that extend the limited scope of existing expertise retrieval evaluation frameworks. Furthermore, we report experiment results computed with unsupervised methods and extend document network embedding algorithms to this specific task.

Our contributions are the following:

  • we provide 4 datasets for expert finding extracted from a scientific publication network and three question & answer (Q&A) websites and make them publicly available at https://github.com/brochier/expert_finding;

  • we describe an evaluation methodology based on the ranking of expert candidates given a set of labeled document queries;

  • we report experiment results that give some insights into this expert finding task;

  • we explore and analyze the use of state-of-the-art document network embedding algorithms for expert finding and we show that further research is needed to bridge the gap between DNE methods and expert finding.

The rest of the paper is organized as follows. In Section 2, we survey related works. We detail in Section 3 our evaluation methodology, the datasets we extracted, the evaluation measures and the algorithms we use. In Section 4 we show and analyze the results of our experiments. Finally, in Section 5, we discuss our findings and provide future directions.

2 Related Works

In this section, we first present a formal definition of expert finding. Then we present algorithms from the literature that address expert finding. Finally, we describe recent methods for document network embedding that have the potential to deal with this particular task.

2.1 Formal definition of expert finding

The concept of expert finding can cover a large range of tasks. The main principle behind expertise retrieval is the search for candidates given a query. To match the two, an algorithm is provided with some data to link the output space, a ranking of candidates, with the input space, which is often a textual content. However, many different types of data can be considered to address this challenge. To fairly compare algorithms, we choose a fixed data structure that reflects common use cases. Furthermore, while supervised methods benefit from labeled fields of expertise associated with the candidates, they are beyond the scope of this paper, which focuses on unsupervised methods only. Our goal is to compare methods that do not require such, often costly, annotations.

Early works in expert search [7] usually consider a small set of topical queries. The names of these topics are used directly to retrieve a list of candidates by leveraging a collection of documents they published (e.g., emails, scientific papers). This type of evaluation is used across several public datasets [26, 14, 16].

More recently, the concept of expert finding has been merged into the wider concept of entity retrieval [1]. As more and more complex data are produced on the Web, expert finding becomes a particular application of entity search. At the same time, Q&A websites such as Stack Overflow (https://stackoverflow.com/) generate and make publicly accessible a large amount of questions with expert answers, collaboratively curated by their users. Several works address the search for experts on such websites [20, 28]. Often, the task consists in either finding the exact list of users who answered a specific question or ranking the answers according to the user votes. In the first case, the task involves considering the evolution of the users across time and, in the second case, understanding the intrinsic quality of a written answer. [25] reviews several models for expert finding in Q&A websites; their experiments show that matrix factorization-based methods perform better than tree-based and ranking-based methods.

In this paper, we adopt the document-query methodology recently proposed in [3]. The expert search is performed given a set of queries that are particular textual instances of some expert topics (or fields of expertise). Given a query, an algorithm should rank first the candidates that are associated with the same fields of expertise. We provide 4 datasets for which we annotated experts and document queries. Each dataset consists of candidates and documents linked by authorship relations (candidate-document) and by response relations (document-document, e.g., citation or answer). A query is therefore one of the documents (e.g., a scientific paper or a question) for which we aim to retrieve experts of the topics depicted in it. This configuration reflects many real-world scenarios such as (1) the automatic search for scientific reviewers, (2) the recommendation of expert users on Q&A websites or (3) the retrieval of interesting profiles for job offers.

2.2 Algorithms for expert finding

Numerous works have addressed automatic expertise retrieval. We describe here the main approaches and some notable recent methods. P@noptic Expert [8] creates a meta-document for each candidate by concatenating the contents of all documents she produced. In this manner, ranking the candidates given a query becomes a similarity search between the query representation and the meta-document representations. A voting model [15] computes the similarities between the query and the documents; the algorithm then aggregates these scores at the candidate level using a fusion technique such as the reciprocal rank [27]. A propagation model [21] takes advantage of the links between candidates and documents to propagate the similarities between the query and the documents. Using random walks with restart [17], the iterative propagation of the scores converges in a few steps to a stationary distribution over the candidates. WISER [6] models each candidate as a small, weighted sub-graph of the Wikipedia Knowledge Graph. Information derived from these graphs and traditional document retrieval techniques are combined to identify experts w.r.t. a query. Note that methods leveraging external data are out of the scope of our benchmark. LT Expertfinder [11] is an evaluation framework for expert finding based on an interactive tool. It integrates various existing algorithms (such as [1]) in a user-friendly way. The underlying corpus used by this tool is the ACL Anthology Network. However, it does not include a well-established ground truth to assess who the experts are: the evaluation is done purely online, since the user has to judge the degree of expertise based on several features, such as the author's citations, h-index, keywords, etc. Recent works [23, 9] propose ad hoc embedding techniques, whereas in this work we are interested in measuring the performance of conventional network embedding techniques.

2.3 Document network embedding

Network embedding (NE) [19, 12] provides an efficient approach to represent nodes in a low-dimensional vector space, suitable for solving various machine learning tasks. Recent techniques extend NE to document networks. Text-Associated DeepWalk (TADW) [24] extends DeepWalk to deal with textual attributes. Yang et al. prove, following the work in [13], that Skip-Gram with hierarchical softmax can be equivalently formulated as a matrix factorization problem. TADW then consists in constraining the factorization problem with a pre-computed representation of the documents obtained by Latent Semantic Analysis (LSA) [10]. Graph2Gauss (G2G) [2] embeds each node as a Gaussian distribution instead of a vector. The algorithm is trained by passing node attributes through a non-linear transformation via a deep neural network (encoder). GVNR-t [4] is a matrix factorization approach for document network embedding, inspired by GloVe [18], that simultaneously learns word, node and document representations by optimizing a least-squares objective over a co-occurrence matrix of the nodes constructed by truncated random walks. IDNE [5] introduces a topic-word attention mechanism, trained from the connections of a document network, to represent documents as mixtures of topics.
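For illustration, the textual features that TADW constrains its factorization with can be obtained as a truncated SVD of the tf-idf matrix. The following is a minimal sketch, assuming scikit-learn; the toy corpus and dimensionality are ours, not taken from the datasets.

```python
# Minimal sketch (assumption: scikit-learn is available; the toy corpus is ours).
# TADW constrains its factorization with LSA features of the documents, which can
# be computed as a truncated SVD of the tf-idf matrix.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

raw_documents = [
    "network embedding for expert retrieval",
    "question answering with citation networks",
    "topic models for scientific documents",
]
tfidf = TfidfVectorizer().fit_transform(raw_documents)             # (n_docs, vocab_size), sparse
lsa_features = TruncatedSVD(n_components=2).fit_transform(tfidf)   # (n_docs, 2), dense LSA features
```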

DNE algorithms do not directly apply to expert finding data since they are not designed to handle multiple types of nodes, in particular candidate nodes. In this paper, we show (1) two methods to extend their applicability to the task of expert finding and (2) the impact of their representations when they are used as document representations for traditional expert finding algorithms.

3 Evaluation Methodology

We present in this section the evaluation methodology that we follow to assess the performance of several algorithms for expert finding. We first describe the task we seek to solve, then the datasets that we extracted and how we annotated them in order to assess the quality of the algorithms' outputs. Finally, we detail the models used in our experiments.

3.1 Ranking expert candidates from document queries

Expert finding is a complex task that can be formalized in multiple ways. Early works define it as a ranking problem given several topic-queries, where the names of these topics are directly used as queries to retrieve the expert candidates. However, in many real-world applications, a user is asked to provide a specific and detailed query. On a Q&A website for instance, a user usually exposes the problem she faces in full detail and does not necessarily know the exact name of the fields of expertise needed to solve her problem. Furthermore, querying an algorithm with a small set of topic-queries can lead to poor evaluation measures due to the usually small number of fields of expertise associated with the dataset. For these reasons, we follow the document-query evaluation methodology proposed in [3] by processing 4 datasets for which a set of document-queries is manually annotated.

The expert finding task in this paper is a ranking problem. Given a document labeled with a ground-truth set of fields of expertise, an algorithm is queried to rank a set of candidates, among which a subset of experts is associated with the same set of labels. The data provided to the algorithms consists of a corpus of documents, a set of candidates, an authorship network between candidates and documents with adjacency matrix A, and a network of documents with adjacency matrix B. Figure 1 shows a hypothetical dataset of the kind used in this paper. The ranking is performed in an unsupervised setting, that is, no ground-truth labels of expertise are given to the algorithms. The set of labeled documents (the queries) can be smaller than the full corpus and the set of labeled candidates (the experts) can be smaller than the full set of candidates (i.e., not all documents and candidates are labeled).
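To make the setting concrete, the input handed to an algorithm can be pictured as in the minimal sketch below; the variable names, toy sizes and use of SciPy sparse matrices are ours, not the released dataset format.

```python
# Minimal sketch of the input data (assumption: SciPy is available; names and
# sizes are illustrative only).
import scipy.sparse as sp

n_candidates, n_documents = 5, 6
documents = ["..."] * n_documents            # raw text of each document
A = sp.random(n_candidates, n_documents, density=0.3, format="csr")  # authorship links
A.data[:] = 1.0                              # binary candidate-document adjacency
B = sp.random(n_documents, n_documents, density=0.2, format="csr")   # citations / answers
B.data[:] = 1.0                              # binary document-document adjacency
# Ground-truth expertise labels exist only for a subset of documents (the queries)
# and a subset of candidates (the experts); they are hidden from the algorithms.
```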

Figure 1: Hypothetical example of an expert finding dataset as used in this paper. 5 candidates are authors of 6 documents. The 6 documents are connected to each other by citation in a scientific corpus, or by answering the same post on a Q&A website. Among the candidates, 3 are known to be experts in stars and/or in circles. 4 documents are associated with these 2 fields of expertise as well. In our evaluation methodology, we query an algorithm with these 4 documents and expect a ranking of candidates that matches each document's fields of expertise; a perfect algorithm ranks, for each query, the experts of the matching fields first.

To evaluate the candidate scores produced by the algorithms, we compare the resulting rankings with the ground-truth fields of expertise. If a document is associated with three different labels, we expect the algorithm to rank first all experts associated with at least one of these labels. We report the area under the ROC curve (AUC), the precision at 10 (P@10) and the average precision (AP), and we compute their standard deviations across the queries. That is, we evaluate the robustness of the algorithms against the variety of document-queries.
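The per-query measures can be computed along the lines of the sketch below, assuming scikit-learn; here `y_true` marks the candidates sharing at least one label with the query and `scores` is the ranking produced by an algorithm (both arrays are made up for illustration).

```python
# Minimal sketch of the per-query evaluation (assumption: scikit-learn is available;
# y_true and scores are illustrative values).
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

y_true = np.array([1, 0, 1, 0, 0, 1, 0, 0])            # 1 = expert of the query's labels
scores = np.array([.9, .1, .8, .3, .2, .4, .5, .05])   # candidate scores for one query

auc = roc_auc_score(y_true, scores)
ap = average_precision_score(y_true, scores)
top10 = np.argsort(-scores)[:10]
p_at_10 = y_true[top10].mean()                          # precision among the (up to) 10 best-ranked
# The tables report the mean and standard deviation of these values over all queries.
```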

3.2 Datasets

We consider 4 datasets. The first one is an extract of DBLP [22] in which a list of 199 experts in 7 fields was annotated [26] by human judgment (https://lfs.aminer.cn/lab-datasets/expertfinding/##expert-list). Our dataset only considers the annotated experts and the other candidates that are close in the co-authorship network, which explains the relatively small size of our network compared to the original one. In addition to the expert annotations, our evaluation framework requires document annotations, since we adopt the document-query methodology for expertise retrieval. We asked two PhD students in computer science to independently label 20 randomly drawn documents per field of expertise (140 in total). Then, only the labels on which the two annotators agreed were kept, leaving 114 annotated papers. The mean Cohen's kappa coefficient across the labels is . An advantage of our methodology is that we can evaluate the algorithms on more queries (114 documents) than the traditional method (7 labels). This allows us to assess the robustness of the algorithms by computing the standard deviations of the ranking metrics across all queries. However, one might object that these 7 labels do not reflect a representative set of expertise, as they are too broad. For this reason, we seek a finer granularity of expertise through well-known question & answer websites.
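For reference, the per-label inter-annotator agreement can be computed with scikit-learn as sketched below; the annotation arrays are hypothetical and only illustrate the procedure.

```python
# Minimal sketch of the per-label agreement computation (assumption: scikit-learn
# is available; the annotations below are made up).
import numpy as np
from sklearn.metrics import cohen_kappa_score

# For one field of expertise: did annotator 1 / annotator 2 assign it to each of
# the 20 randomly drawn documents?
annotator_1 = np.array([1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0])
annotator_2 = np.array([1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0])

kappa = cohen_kappa_score(annotator_1, annotator_2)
# The mean of these per-label kappas is the agreement figure reported above.
```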

While scientific publication networks are easy to find on the Web, scientific expertise annotations are rarely available for both authors and publications. We use data downloaded in June 2019 from Stack Exchange (https://archive.org/details/stackexchange) to create expert finding datasets from three communities closely related to research. Academia (https://academia.stackexchange.com/) is dedicated to academics and higher education. Mathoverflow (https://mathoverflow.net/) gathers professional mathematicians and is widely used by researchers. Stats (https://stats.stackexchange.com/, also known as Cross Validated) addresses statistics, machine learning and data mining issues. For each dataset, we first keep questions with at least 10 user votes that have at least one answer with 10 user votes or more. We build the networks by linking questions with their answers and by linking answers with the users who published them. The fields of expertise are the tags associated with the questions; only the tags that occur at least 50 times are kept. We annotate an expert with the tags of a question if her answer to that question received at least 10 votes. Note that the tags are first provided by the users who ask the questions but are thereafter verified by experienced users.
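As an illustration of these rules, the sketch below applies them with pandas to a toy table mimicking a parsed Stack Exchange dump; the column names follow the public dump schema, while the placeholder values and the exact filtering code are ours.

```python
# Minimal sketch of the filtering rules (assumption: pandas is available; the toy
# `posts` table mimics a parsed Stack Exchange dump, where PostTypeId 1 is a
# question and 2 is an answer).
import pandas as pd

posts = pd.DataFrame({
    "Id":          [1, 2, 3, 4],
    "PostTypeId":  [1, 2, 1, 2],
    "ParentId":    [0, 1, 0, 3],   # 0 = no parent (questions), illustrative convention
    "Score":       [15, 12, 20, 4],
    "OwnerUserId": [10, 11, 12, 13],
    "Tags":        ["<regression><bayesian>", "", "<phd>", ""],
})

questions = posts[(posts.PostTypeId == 1) & (posts.Score >= 10)]
answers = posts[(posts.PostTypeId == 2) & (posts.Score >= 10)]
# keep only questions that received at least one answer with 10 votes or more
questions = questions[questions.Id.isin(answers.ParentId)]
answers = answers[answers.ParentId.isin(questions.Id)]
# a user whose answer to such a question received >= 10 votes is annotated as an
# expert of that question's tags (tags occurring fewer than 50 times are dropped)
expert_tags = answers.merge(questions[["Id", "Tags"]],
                            left_on="ParentId", right_on="Id",
                            suffixes=("_answer", "_question"))[["OwnerUserId", "Tags_question"]]
```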

The general properties of our 4 datasets are presented in Table 1. The annotations and the preprocessed datasets are made publicly available.

Dataset | # candidates | # documents | # labels | # experts | # queries | label example
DBLP | 707 | 1641 | 7 | 199 | 114 | 'information_extraction'
Stats | 5765 | 14834 | 59 | 5765 | 3966 | 'maximum-likelihood'
Academia | 6030 | 20799 | 55 | 6030 | 4214 | 'recommendation-letter'
Mathoverflow | 7382 | 38532 | 98 | 7382 | 10614 | 'galois-representations'
Table 1: General properties of the datasets.

3.3 Algorithms

We run the experiments with 4 baseline algorithms and 4 document network embedding algorithms. The latter are adapted with two aggregation schemes in order to deal with the candidates, since they are primarily designed for document networks only. These aggregations are arbitrary and are voluntarily the most straightforward way to run DNE algorithms on bipartite author-document networks. We further discuss these choices in Section 4.

3.3.1 Baselines

We run the experiments with the same models as in [3], using the tf-idf representations and the cosine similarity measure. We also add a random model to have reference metrics (a minimal sketch of these scorers is given at the end of this subsection):

  • Random model: we randomly draw scores between 0 and 1 for each candidate;

  • P@noptic model [8]: we concatenate the textual content of the documents associated with each candidate, use the tf-idf representations of the resulting meta-documents and compute the cosine similarity with the query to produce the scores;

  • Voting model [15]: we use the reciprocal rank to aggregate the scores at the candidate level;

  • Propagation model [21]: we combine the two adjacency matrices A and B to construct a transition matrix between candidates and documents. The initial scores are the cosine similarities between the tf-idf representations of the query and the documents. The scores are propagated iteratively until convergence, with a fixed restart probability.

We also run the voting and propagation models using document representations produced by IDNE in place of the tf-idf vectors. The document network provided to IDNE has adjacency matrix B.
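The sketch below illustrates how these scorers operate; it is a simplified rendering under our assumptions (our own variable and function names, an arbitrary restart probability), not the code used in the experiments.

```python
# Minimal sketch of the baseline scorers (assumptions: NumPy, SciPy and scikit-learn
# are available; A is the candidate-document authorship matrix and B the
# document-document adjacency matrix introduced in Section 3.1).
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity


def panoptic_scores(query_text, documents, A, vectorizer):
    # one meta-document per candidate: the concatenation of everything she authored
    meta_docs = [" ".join(documents[j] for j in row.nonzero()[1]) for row in A]
    return cosine_similarity(vectorizer.transform([query_text]),
                             vectorizer.transform(meta_docs)).ravel()


def voting_scores(query_text, documents, A, vectorizer):
    # rank documents by similarity to the query, then fuse reciprocal ranks per candidate
    sims = cosine_similarity(vectorizer.transform([query_text]),
                             vectorizer.transform(documents)).ravel()
    ranks = np.empty_like(sims)
    ranks[np.argsort(-sims)] = np.arange(1, len(sims) + 1)
    return A.dot(1.0 / ranks)


def propagation_scores(query_text, documents, A, B, vectorizer, restart=0.5, n_iter=50):
    # simplified random walk with restart over the candidate-document graph;
    # the restart probability used here is an arbitrary placeholder
    sims = cosine_similarity(vectorizer.transform([query_text]),
                             vectorizer.transform(documents)).ravel()
    s_docs, s_cands = sims / (sims.sum() + 1e-12), np.zeros(A.shape[0])
    for _ in range(n_iter):
        s_cands = (1.0 - restart) * A.dot(s_docs)
        s_docs = (1.0 - restart) * (A.T.dot(s_cands) + B.T.dot(s_docs)) + restart * sims
        s_docs /= s_docs.sum() + 1e-12
    return s_cands
```

Here `vectorizer` is assumed to be a TfidfVectorizer already fitted on the document corpus, e.g. `vectorizer = TfidfVectorizer().fit(documents)`, and A, B can be SciPy sparse matrices as sketched in Section 3.1.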

3.3.2 Extending DNE algorithms for expert finding

DNE methods usually operate on networks of documents, with no candidate nodes. To apply them in the context of expert finding, we propose two straightforward approaches (sketched after the list below):

  • pre-aggregation: as in the P@noptic model, meta-documents are generated by aggregating the documents produced by each candidate. Furthermore, the adjacency matrix of a meta-network between candidates and documents is constructed by combining a candidate network and a document network derived from A and B. The candidate and document representations are then generated by treating this meta-network, together with the concatenation of the documents and meta-documents, as an ordinary instance of a document network. The scores of the candidates are obtained by cosine similarity between the representation of the document-query and the representations of the candidates;

  • post-aggregation: in this setting, we first train the DNE algorithm on the network of documents defined by B. Once the representations are generated for all documents, the representation of a candidate is computed by averaging the vectors of all documents associated with her. The scores are then computed by cosine similarity.
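A condensed sketch of the two schemes is given below; it is an illustration under our assumptions, and in particular the co-authorship construction shown for the candidate block of the meta-network is only one plausible choice, not necessarily the one used in our released code.

```python
# Minimal sketch of the two aggregation schemes (assumptions: A and B are the
# authorship and document adjacency matrices; `doc_vectors` are document embeddings
# produced by a DNE model; the candidate block A @ A.T is an illustrative choice).
import numpy as np
import scipy.sparse as sp


def pre_aggregation_network(A, B):
    # meta-network over candidates and documents, handed to the DNE algorithm
    # together with the meta-documents and the original documents
    candidate_block = A.dot(A.T)       # candidates linked through shared documents
    return sp.bmat([[candidate_block, A], [A.T, B]], format="csr")


def post_aggregation_vectors(A, doc_vectors):
    # candidate representation = average of the embeddings of her documents
    counts = np.asarray(A.sum(axis=1)).ravel()
    return A.dot(doc_vectors) / np.maximum(counts, 1.0)[:, None]
```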

We run the experiments with 4 document network embedding algorithms, using the authors’ implementations. For all methods, the dimension of the representations is set to 256:

  • TADW [24]: we follow the original paper, using 20 iterations and the penalty term it recommends;

  • GVNR-t [4]: we use random walks of fixed length, a fixed sliding-window size and a fixed threshold, with 4 iterations;

  • Graph2gauss (G2G) [2]: we make sure the loss function converges before the maximum number of iterations;

  • IDNE [5]: we run all experiments with a fixed number of topic vectors and 5000 balanced mini-batches of 16 positive samples and 16 negative samples.

4 Experiment Results

Tables 2 to 5 show the experiment results. In the following, we analyze the performance of the aggregation schemes against the baseline algorithms, we highlight the interesting results obtained when the baselines use document representations pre-computed with a DNE algorithm, and finally we make some observations on the differences between the datasets. Note that the implementation of TADW provided by the authors could not scale to Mathoverflow.

4.1 Baselines versus DNE aggregation schemes

For all datasets, the propagation model generally performs better than the other algorithms, particularly in terms of precision. Both aggregation schemes yield poor results, and neither of the two appears to be better than the other. GVNR-t is the best algorithm among the document network embedding models. We believe that, while DNE algorithms are well suited for document network representation learning, the gap between simple tasks such as node classification and link prediction and the task of expert finding is too large for our naive aggregation schemes to perform well. In particular, the network structure changes significantly between a homogeneous network and a heterogeneous network. Moreover, expert finding algorithms often benefit from information about the centrality of the candidates and documents; DNE algorithms do not particularly preserve this information, and neither do our aggregation schemes.

4.2 Using DNE as document representations for the baselines

Since the baseline algorithms perform well, we study the possibility of applying them with a DNE algorithm providing the representations of the documents. We only report the results obtained with IDNE representations, but we observe the same behavior with other DNE models. First, these representations consistently improve the voting model, which achieves the best results in terms of AUC on Stats and Mathoverflow. Second, the most surprising effect is the significant decrease in performance of the propagation model. While the precision for the first ranked candidates is not affected, the AUC score drops significantly for the three Q&A datasets. We believe that document network embeddings capture long-range dependencies between the documents in the network, which are then exaggerated by the propagation procedure. Figure 2 shows the effect of the representations used with the propagation model on the ROC curve.
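The substitution itself is mechanical: only the vectors fed to the similarity step change, while the fusion (or propagation) step stays the same, as in the hedged sketch below (the function name is ours, and `query_vec` and `doc_vectors` stand for embeddings learned by a model such as IDNE).

```python
# Minimal sketch of swapping tf-idf vectors for DNE representations in the voting
# model (assumption: query_vec and doc_vectors are learned document embeddings).
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity


def voting_scores_from_vectors(query_vec, doc_vectors, A):
    # same reciprocal-rank fusion as before, only the document vectors change
    sims = cosine_similarity(query_vec.reshape(1, -1), doc_vectors).ravel()
    ranks = np.empty_like(sims)
    ranks[np.argsort(-sims)] = np.arange(1, len(sims) + 1)
    return A.dot(1.0 / ranks)
```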

(a) Propagation model with tf-idf representations: the curve has a nice shape, meaning the rankings of candidates are good even for the last-ranked experts.
(b) Propagation model with IDNE representations: the first-ranked candidates are good, but the algorithm tends to wrongly rank many true experts last.
Figure 2: Effect of IDNE representations on the propagation model. Using document network embeddings significantly damages the rankings.

4.3 Differences between the datasets

The results achieved by the algorithms on the three Stack Exchange datasets are consistent. However, the algorithms do not behave the same way on DBLP. First, DNE methods obtain scores closer to the baselines on DBLP. In the Q&A datasets, the interactions are more isolated, i.e., there are more users with fewer interactions. This difference in network properties might disadvantage DNE methods, which are usually trained on scale-free networks whose degree distribution follows a power law. Moreover, on DBLP the propagation method does not suffer from the decrease in performance induced by the IDNE representations. We hypothesize that the low number of expertise fields associated with this dataset largely reduces the effect described in the previous section.

5 Discussion and Future Work

In this paper, we provide experiment materials for expert finding in the form of four annotated datasets and report results for several baseline algorithms. Moreover, we study the ability of document network embedding methods to tackle the expert finding challenge. We show that DNE algorithms cannot be trivially adapted to achieve state-of-the-art scores. However, we reveal that document network embeddings can improve the voting model while degrading the propagation model.

In future work, we would like to find an efficient way to bridge the gap between DNE algorithms and expert finding. To do so, taking the heterogeneity of the network into account should help better capture the real similarity between a document and a candidate. Furthermore, a deeper analysis of the interplay between the candidates and the textual content of the documents appears necessary to better understand the task of expert finding.

Model | AUC | P@10 | AP
random | 49.47 (09.80) | 05.00 (06.66) | 07.09 (03.81)
panoptic (tf-idf) | 74.06 (12.94) | 22.37 (16.35) | 23.24 (12.55)
voting (tf-idf) | 78.60 (11.97) | 26.05 (15.76) | 28.24 (13.92)
propagation (tf-idf) | 79.26 (13.09) | 33.07 (19.61) | 34.66 (18.21)
pre-agg TADW | 65.84 (12.94) | 15.61 (11.63) | 17.26 (08.78)
pre-agg GVNR-t | 76.90 (11.46) | 19.04 (11.70) | 21.39 (09.61)
pre-agg G2G | 72.87 (12.75) | 15.70 (11.62) | 18.53 (09.37)
pre-agg IDNE | 78.08 (11.27) | 20.18 (11.85) | 22.00 (09.87)
post-agg TADW | 68.01 (13.37) | 16.32 (11.57) | 18.01 (08.97)
post-agg GVNR-t | 73.91 (13.93) | 18.86 (12.19) | 20.57 (10.33)
post-agg G2G | 68.94 (15.23) | 16.23 (12.02) | 18.21 (09.76)
post-agg IDNE | 76.87 (13.36) | 19.04 (14.57) | 21.57 (10.96)
voting (IDNE) | 82.23 (11.08) | 34.82 (18.46) | 37.27 (16.16)
propagation (IDNE) | 82.44 (16.14) | 44.47 (22.91) | 47.01 (22.06)
Table 2: Mean scores with their standard deviations on DBLP.
Model | AUC | P@10 | AP
random | 50.01 (02.24) | 04.52 (07.02) | 04.96 (02.81)
panoptic (tf-idf) | 79.47 (06.22) | 13.45 (13.39) | 15.22 (05.62)
voting (tf-idf) | 84.96 (05.22) | 52.53 (16.13) | 31.01 (06.58)
propagation (tf-idf) | 86.33 (05.64) | 91.53 (13.44) | 44.09 (07.70)
pre-agg TADW | 63.07 (07.70) | 11.42 (12.34) | 08.45 (03.87)
pre-agg GVNR-t | 70.67 (09.49) | 21.12 (20.99) | 12.43 (07.30)
pre-agg G2G | 63.63 (07.62) | 12.93 (12.06) | 07.81 (04.15)
pre-agg IDNE | 65.07 (09.05) | 13.37 (13.48) | 09.40 (05.19)
post-agg TADW | 68.74 (07.02) | 13.67 (12.59) | 09.99 (04.37)
post-agg GVNR-t | 66.56 (08.61) | 22.47 (15.92) | 10.75 (05.42)
post-agg G2G | 62.53 (07.44) | 11.95 (11.86) | 07.48 (04.13)
post-agg IDNE | 65.63 (08.57) | 13.34 (13.13) | 09.38 (04.94)
voting (IDNE) | 86.94 (04.91) | 53.91 (18.06) | 32.18 (08.33)
propagation (IDNE) | 67.62 (10.11) | 90.43 (15.20) | 33.07 (08.93)
Table 3: Mean scores with standard deviations on Stats.
Model | AUC | P@10 | AP
random | 50.02 (01.78) | 05.93 (08.07) | 06.09 (02.72)
panoptic (tf-idf) | 81.54 (04.36) | 18.35 (18.76) | 22.93 (07.14)
voting (tf-idf) | 85.88 (03.47) | 57.99 (15.87) | 37.66 (05.83)
propagation (tf-idf) | 88.02 (03.32) | 99.01 (03.57) | 54.04 (05.44)
pre-agg TADW | 61.47 (06.16) | 11.09 (12.04) | 09.29 (03.53)
pre-agg GVNR-t | 64.22 (09.69) | 25.67 (23.27) | 13.07 (07.54)
pre-agg G2G | 61.54 (05.38) | 14.30 (12.91) | 08.74 (03.69)
pre-agg IDNE | 58.74 (07.49) | 10.21 (11.58) | 08.41 (03.99)
post-agg TADW | 71.94 (04.63) | 14.44 (12.87) | 12.68 (04.37)
post-agg GVNR-t | 61.22 (06.24) | 20.70 (14.59) | 10.19 (04.21)
post-agg G2G | 58.87 (05.79) | 12.80 (12.06) | 08.12 (03.67)
post-agg IDNE | 59.97 (07.40) | 10.61 (11.19) | 08.76 (04.17)
voting (IDNE) | 86.79 (03.90) | 55.81 (17.35) | 37.13 (07.58)
propagation (IDNE) | 61.35 (08.56) | 95.02 (10.15) | 31.27 (08.21)
Table 4: Mean scores with standard deviations on Academia.
Model | AUC | P@10 | AP
random | 49.98 (01.62) | 06.44 (08.28) | 06.53 (03.06)
panoptic (tf-idf) | 81.87 (04.46) | 21.95 (19.15) | 22.95 (07.54)
voting (tf-idf) | 86.80 (03.23) | 61.11 (18.68) | 40.10 (08.27)
propagation (tf-idf) | 88.08 (03.38) | 93.68 (12.16) | 49.58 (08.90)
pre-agg TADW | NA | NA | NA
pre-agg GVNR-t | 65.34 (09.22) | 44.02 (28.31) | 16.88 (08.55)
pre-agg G2G | 66.84 (08.99) | 22.95 (17.81) | 12.49 (05.70)
pre-agg IDNE | 67.01 (09.26) | 22.96 (17.84) | 13.40 (06.02)
post-agg TADW | NA | NA | NA
post-agg GVNR-t | 63.84 (07.59) | 41.81 (22.68) | 14.96 (06.25)
post-agg G2G | 65.06 (09.09) | 22.43 (16.94) | 11.78 (05.51)
post-agg IDNE | 66.74 (09.10) | 21.92 (17.21) | 13.11 (05.87)
voting (IDNE) | 88.71 (03.76) | 68.46 (18.53) | 43.53 (09.90)
propagation (IDNE) | 69.38 (09.65) | 92.35 (13.88) | 39.62 (09.89)
Table 5: Mean scores with standard deviations on Mathoverflow.

References

  • [1] K. Balog, P. Serdyukov, and A. P. De Vries (2010) Overview of the TREC 2010 entity track. Technical report, Norwegian University of Science and Technology, Trondheim. Cited by: §2.1, §2.2.
  • [2] A. Bojchevski and S. Günnemann (2018) Deep gaussian embedding of graphs: unsupervised inductive learning via ranking. In International Conference on Learning Representations, pp. 1–13. Cited by: §2.3, 3rd item.
  • [3] R. Brochier, A. Guille, B. Rothan, and J. Velcin (2018) Impact of the query set on the evaluation of expert finding systems. In Proceedings of the 3rd Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2018) co-located with the 41st International ACM SIGIR Conference. Cited by: §2.1, §3.1, §3.3.1.
  • [4] R. Brochier, A. Guille, and J. Velcin (2019) Global vectors for node representations. In The World Wide Web Conference, pp. 2587–2593. Cited by: §2.3, 2nd item.
  • [5] R. Brochier, A. Guille, and J. Velcin (2020) Inductive document network embedding with topic-word attention. In Proceedings of the 42nd European Conference on Information Retrieval Research, Cited by: §2.3, 4th item.
  • [6] P. Cifariello, P. Ferragina, and M. Ponza (2019) WISER: a semantic approach for expert finding in academia based on entity linking. Inf. Syst. 82, pp. 1–16. Cited by: §2.2.
  • [7] N. Craswell, A. P. de Vries, and I. Soboroff (2005) Overview of the TREC 2005 enterprise track. In TREC, Vol. 5, pp. 1–7. Cited by: §2.1.
  • [8] N. Craswell, D. Hawking, A. Vercoustre, and P. Wilkins (2001) P@noptic expert: searching for experts not just for documents. In Ausweb Poster Proceedings, Queensland, Australia, Vol. 15, pp. 17. Cited by: §2.2, 2nd item.
  • [9] A. Dargahi Nobari, S. Sotudeh Gharebagh, and M. Neshati (2017) Skill translation models in expert finding. In Proceedings of the 40th international ACM SIGIR conference on research and development in information retrieval, pp. 1057–1060. Cited by: §2.2.
  • [10] S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman (1990) Indexing by latent semantic analysis. Journal of the American Society for Information Science 41 (6), pp. 391–407. Cited by: §2.3.
  • [11] T. Fischer, S. Remus, and C. Biemann (2019) LT expertfinder: an evaluation framework for expert finding methods. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations), pp. 98–104. Cited by: §2.2.
  • [12] A. Grover and J. Leskovec (2016) Node2vec: scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 855–864. Cited by: §2.3.
  • [13] O. Levy and Y. Goldberg (2014) Neural word embedding as implicit matrix factorization. In Advances in neural information processing systems, pp. 2177–2185. Cited by: §2.3.
  • [14] C. Macdonald, I. Ounis, and I. Soboroff (2007) Overview of the TREC 2007 blog track. In TREC, Vol. 7, pp. 31–43. Cited by: §2.1.
  • [15] C. Macdonald and I. Ounis (2006) Voting for candidates: adapting data fusion techniques for an expert search task. In Proceedings of the 15th ACM international conference on Information and knowledge management, pp. 387–396. Cited by: §2.2, 3rd item.
  • [16] R. J. Mislevy and M. M. Riconscente (2011) Evidence-centered assessment design. In Handbook of test development, pp. 75–104. Cited by: §2.1.
  • [17] L. Page, S. Brin, R. Motwani, and T. Winograd (1999) The pagerank citation ranking: bringing order to the web.. Technical report Stanford InfoLab. Cited by: §2.2.
  • [18] J. Pennington, R. Socher, and C. Manning (2014) Glove: global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp. 1532–1543. Cited by: §2.3.
  • [19] B. Perozzi, R. Al-Rfou, and S. Skiena (2014) Deepwalk: online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 701–710. Cited by: §2.3.
  • [20] F. Riahi, Z. Zolaktaf, M. Shafiei, and E. Milios (2012) Finding expert users in community question answering. In Proceedings of the 21st International Conference on World Wide Web, pp. 791–798. Cited by: §2.1.
  • [21] P. Serdyukov, H. Rode, and D. Hiemstra (2008) Modeling multi-step relevance propagation for expert finding. In Proceedings of the 17th ACM conference on Information and knowledge management, pp. 1133–1142. Cited by: §2.2, 4th item.
  • [22] J. Tang, J. Zhang, L. Yao, J. Li, L. Zhang, and Z. Su (2008) Arnetminer: extraction and mining of academic social networks. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 990–998. Cited by: §3.2.
  • [23] C. Van Gysel, M. de Rijke, and M. Worring (2016) Unsupervised, efficient and semantic expertise retrieval. In WWW, Vol. 2016, pp. 1069–1079. Cited by: §2.2.
  • [24] C. Yang, Z. Liu, D. Zhao, M. Sun, and E. Chang (2015) Network representation learning with rich text information. In Twenty-Fourth International Joint Conference on Artificial Intelligence. Cited by: §2.3, 1st item.
  • [25] S. Yuan, Y. Zhang, J. Tang, W. Hall, and J. B. Cabotà (2020) Expert finding in community question answering: a review. Artificial Intelligence Review 53 (2), pp. 843–874. Cited by: §2.1.
  • [26] J. Zhang, J. Tang, and J. Li (2007) Expert finding in a social network. In International Conference on Database Systems for Advanced Applications, pp. 1066–1069. Cited by: §2.1, §3.2.
  • [27] M. Zhang, R. Song, C. Lin, S. Ma, Z. Jiang, Y. Jin, Y. Liu, L. Zhao, and S. Ma (2003) Expansion-based technologies in finding relevant and new information: THU TREC 2002 novelty track experiments. NIST Special Publication SP 251, pp. 586–590. Cited by: §2.2.
  • [28] Z. Zhao, L. Zhang, X. He, and W. Ng (2014) Expert finding for question answering via graph regularized matrix completion. IEEE Transactions on Knowledge and Data Engineering 27 (4), pp. 993–1004. Cited by: §2.1.