Report on the 8th International Workshop on Bibliometric-enhanced Information Retrieval (BIR 2019)

Guillaume Cabanac, et al. (2019)

The Bibliometric-enhanced Information Retrieval (BIR) workshop series at ECIR tackles issues related to academic search, at the crossroads of Information Retrieval and Bibliometrics. Bibliometric-enhanced IR is a hot topic investigated by both academia (e.g., ArnetMiner, CiteSeerX, Docear) and industry (e.g., Google Scholar, Microsoft Academic Search, Semantic Scholar). This report presents the 8th iteration of the one-day BIR workshop, held at ECIR 2019 in Cologne, Germany.

1 Introduction

Searching for scientific information is a long-standing information need. In the early 1960s, Salton was already striving to enhance information retrieval by including clues inferred from bibliographic citations [31]. The development of citation indexes pioneered by Garfield [11] proved decisive for this research endeavour at the crossroads of the nascent fields of Information Retrieval (IR) and Bibliometrics (bibliometrics refers to the statistical analysis of the academic literature [29] and plays a key role in scientometrics, the quantitative study of science and innovation [17]). The pioneers who established these fields within Information Science, such as Salton and Garfield, were followed by scientists who specialised in one or the other [39], leading to the two loosely connected fields we know today.

The purpose of the BIR workshop series, founded in 2014, is to tighten the link between IR and Bibliometrics. We strive to bring together the ‘retrievalists’ and ‘citationists’ [39] active in academia and industry who develop search engines and recommender systems such as ArnetMiner [36], CiteSeer [41], Docear [3], Google Scholar [38], Microsoft Academic Search [35], and Semantic Scholar [4], to name just a few.

Bibliometric-enhanced IR systems must deal with the multifaceted nature of scientific information by searching for or recommending academic papers, patents [12], venues (i.e., conferences or journals), authors, experts (e.g., peer reviewers), references (to be cited to support an argument), and datasets. The underlying models harness relevance signals from keywords provided by authors, topics extracted from full texts, co-authorship networks, citation networks, and various classification schemes of science.
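
To make the combination of such signals concrete, here is a minimal, hypothetical sketch (not taken from any of the workshop papers): it re-ranks keyword-matched candidates by linearly mixing a textual relevance score with a citation-count prior. The toy corpus, the keyword-overlap stand-in for a proper text ranker (e.g., BM25), and the weighting parameter alpha are illustrative assumptions.

```python
import math

# Toy corpus: each record carries textual metadata and a bibliometric signal.
# Field names and values are illustrative assumptions, not a prescribed schema.
PAPERS = [
    {"id": "p1", "keywords": {"citation", "retrieval", "bibliometrics"}, "citations": 120},
    {"id": "p2", "keywords": {"neural", "ranking", "retrieval"}, "citations": 15},
    {"id": "p3", "keywords": {"citation", "networks", "science"}, "citations": 480},
]

def text_score(query_terms, paper):
    """Fraction of query terms matched by the paper's keywords (stand-in for BM25/TF-IDF)."""
    return len(query_terms & paper["keywords"]) / len(query_terms)

def citation_prior(paper):
    """Log-scaled citation count squashed into [0, 1), used as a query-independent prior."""
    logc = math.log1p(paper["citations"])
    return logc / (1 + logc)

def bibliometric_rank(query_terms, corpus, alpha=0.7):
    """Rank papers by a linear mixture of textual relevance and the citation prior."""
    scored = [(alpha * text_score(query_terms, p) + (1 - alpha) * citation_prior(p), p["id"])
              for p in corpus]
    return sorted(scored, reverse=True)

if __name__ == "__main__":
    # Query for papers about citation-based retrieval; p1 wins on text, p3 on citations.
    print(bibliometric_rank({"citation", "retrieval"}, PAPERS))
```

In a production system the mixture would typically be learned (e.g., via learning to rank) rather than fixed, and the citation prior could be replaced by network-based measures such as co-citation strength or PageRank over the citation graph.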

Bibliometric-enhanced IR is a hot topic whose recent developments have made the news: see, for instance, the Initiative for Open Citations [33] and Google Dataset Search [8], launched on September 4, 2018, both of which give an impression of the challenges facing the two communities. We believe that BIR@ECIR is a much-needed scientific event for the ‘retrievalists’ and ‘citationists’ to meet and join forces in pushing the knowledge boundaries of IR applied to literature search and recommendation.

2 Past Related Activities

The BIR workshop series was launched at ECIR in 2014 [26] and has been held at ECIR every year since [25, 22, 23, 24]. As our workshop lies at the crossroads of IR and NLP, we have also run it as a joint workshop called BIRNDL (Bibliometric-enhanced IR and NLP for Digital Libraries) at the JCDL [5] and SIGIR [19, 20] conferences. All editions attracted a large number of participants, demonstrating the relevance of the workshop’s topics. The BIR and BIRNDL workshop series gave the community the opportunity to discuss the latest developments and shared tasks such as CL-SciSumm [14], which was introduced at the BIRNDL joint workshop.

The authors of the most promising workshop papers were offered the opportunity to submit an extended version to special issues of the Scientometrics journal [27, 7] and of the International Journal on Digital Libraries [21].

The target audience of our workshop comprises researchers and practitioners, junior and senior, from Scientometrics as well as Information Retrieval: IR researchers interested in potential new application areas for their work, as well as researchers and practitioners who work with, for instance, bibliometric data and are interested in how IR methods can make use of such data.

3 Objectives and Topics for BIR@ECIR 2019

We called for original research at the crossroads of IR and Bibliometrics. Thirteen peer-reviewed papers were accepted [6]: 9 long papers, 3 short papers, and 1 demo paper (see the workshop proceedings at http://ceur-ws.org/Vol-2345/). These papers report on new approaches that use bibliometric clues to enhance the search or recommendation of scientific information, or on significant improvements of existing techniques. Thorough quantitative studies of the various corpora to be indexed (papers, patents, networks, and so on) were also contributed. The papers are as follows:

  • Long papers:

    • An interactive visual tool for scientific literature search: Proposal and algorithmic specification [2]

    • A searchable space with routes for querying scientific information [9]

    • Discovering seminal works with marker papers [13]

    • How do computer scientists use Google Scholar?: A survey of user interest in elements on SERPs and author profile pages [15]

    • Feature selection and graph representation for an analysis of science fields evolution: An application to the digital library ISTEX [16]

    • Optimal citation context window sizes for biomedical retrieval [18]

    • Bibliometric-enhanced arXiv: A data set for paper-based and citation-based tasks [30]

    • Mining intellectual influence associations [32]

    • Citation metrics for legal information retrieval systems [40]

  • Short papers:

    • Finding temporal trends of scientific concepts [10]

    • A preliminary study to compare deep learning with rule-based approaches for citation classification [28]

    • Improving scientific article visibility by neural title simplification [34]

  • Demo:

    • Recommending multimedia educational resources on the MOVING platform [37].

The topics of the workshop are in line with those of the past BIR and BIRNDL workshops (Fig. 1): a mixture of IR and Bibliometric concepts and techniques. More specifically, the call for papers featured current research issues regarding three aspects of the search/recommendation process:

  1. User needs and behaviour regarding scientific information, such as:

    • Finding relevant papers/authors for a literature review;

    • Measuring the degree of plagiarism in a paper;

    • Identifying expert reviewers for a given submission;

    • Flagging predatory conferences and journals.

  2. The characteristics of scientific information:

    • Measuring the reliability of bibliographic libraries;

    • Spotting research trends and research fronts.

  3. Academic search/recommendation systems:

    • Modelling the multifaceted nature of scientific information;

    • Building test collections for reproducible BIR.

Figure 1: Main topics of the BIR and BIRNDL workshop series (2014–2018), as extracted from the titles of the papers published in the proceedings (see https://dblp.org/search?q=BIR.ECIR and https://dblp.org/search?q=BIRNDL).
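
As a rough, hypothetical illustration of how such a topic overview can be derived from proceedings titles, the sketch below queries dblp's publication-search API with the same queries as in the caption and tallies the most frequent title terms. The endpoint, its parameters, the JSON layout, and the ad-hoc stopword list are assumptions about dblp's public API as of writing, not the procedure actually used to produce Figure 1.

```python
import collections
import re

import requests  # third-party: pip install requests

# Assumed dblp publication-search endpoint; check the dblp FAQ for the official API description.
DBLP_API = "https://dblp.org/search/publ/api"
STOPWORDS = {"the", "a", "an", "of", "for", "and", "in", "on", "with", "to", "by", "at"}

def title_term_counts(query, max_hits=500):
    """Fetch titles of publications matching `query` from dblp and count non-stopword terms."""
    resp = requests.get(DBLP_API, params={"q": query, "format": "json", "h": max_hits})
    resp.raise_for_status()
    hits = resp.json()["result"]["hits"].get("hit", [])  # "hit" is absent when nothing matches
    counts = collections.Counter()
    for hit in hits:
        title = hit["info"]["title"]
        for term in re.findall(r"[a-z]+", title.lower()):
            if term not in STOPWORDS:
                counts[term] += 1
    return counts

if __name__ == "__main__":
    for query in ("BIR.ECIR", "BIRNDL"):
        print(query, title_term_counts(query).most_common(10))
```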

4 Peer Review Process and Organization

The 8th BIR edition ran as a one-day workshop, as was the case for the previous editions. Dr. Iana Atanassova delivered a keynote entitled “Beyond Metadata: the New Challenges in Mining Scientific Papers” [1] to kick off the day.

Two types of papers were presented: long papers (15-minute talks) and short papers (5-minute talks). As the interactive session introduced last year was warmly received, we decided to organize such a session again to close the workshop. Two weeks before the workshop, we invited all registered attendees to demonstrate their prototypes or pitch a poster during flash presentations (5 minutes each). This was an opportunity for our speakers to further discuss their work and for the audience to showcase their own work too.

We ran the workshop’s peer review with EasyChair (https://easychair.org). Each submission was assigned to two or three reviewers, preferably including at least one expert in IR and one expert in Bibliometrics. The strongest submissions were accepted as long papers, while the others were accepted as short papers or as a demo. All authors were instructed to revise their submission according to the reviewers’ reports. All accepted papers were included in the workshop proceedings [6] hosted at ceur-ws.org, an established open access repository with no article-processing charges.

As a follow-up to the workshop, all authors are encouraged to submit an extended version of their papers to the special issue of the Scientometrics journal launched in spring 2019.

References

  • [1] I. Atanassova (2019) Beyond metadata: the new challenges in mining scientific papers. In Proc. of the 8th Workshop on Bibliometric-enhanced Information Retrieval, pp. 8–13.
  • [2] J. P. Bascur, N. J. van Eck, and L. Waltman (2019) An interactive visual tool for scientific literature search: Proposal and algorithmic specification. In Proc. of the 8th Workshop on Bibliometric-enhanced Information Retrieval, pp. 76–87.
  • [3] J. Beel, S. Langer, B. Gipp, and A. Nürnberger (2014) The architecture and datasets of Docear’s research paper recommender system. D-Lib Magazine 20 (11/12).
  • [4] J. Bohannon (2016) A computer program just ranked the most influential brain scientists of the modern era. Science.
  • [5] G. Cabanac, M. K. Chandrasekaran, I. Frommholz, K. Jaidka, M. Kan, P. Mayr, and D. Wolfram (Eds.) (2016) BIRNDL’16: Proceedings of the Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries co-located with the Joint Conference on Digital Libraries. Vol. 1610, CEUR-WS, Aachen.
  • [6] G. Cabanac, I. Frommholz, and P. Mayr (Eds.) (2019) BIR’19: Proceedings of the 8th Workshop on Bibliometric-enhanced Information Retrieval co-located with the 41st European Conference on Information Retrieval. Vol. 2345, CEUR-WS, Aachen.
  • [7] G. Cabanac, P. Mayr, and I. Frommholz (2018) Bibliometric-enhanced information retrieval: Preface. Scientometrics 116 (2), pp. 1225–1227.
  • [8] D. Castelvecchi (2018) Google unveils search engine for open data [News & Comment]. Nature.
  • [9] R. Fabre (2019) A “searchable” space with routes for querying scientific information. In Proc. of the 8th Workshop on Bibliometric-enhanced Information Retrieval, pp. 112–124.
  • [10] M. Färber and A. Jatowt (2019) Finding temporal trends of scientific concepts. In Proc. of the 8th Workshop on Bibliometric-enhanced Information Retrieval, pp. 132–139.
  • [11] E. Garfield (1955) Citation indexes for science: A new dimension in documentation through association of ideas. Science 122 (3159), pp. 108–111.
  • [12] E. Garfield (1966) Patent citation indexing and the notions of novelty, similarity, and relevance. Journal of Chemical Documentation 6 (2), pp. 63–65.
  • [13] R. Haunschild and W. Marx (2019) Discovering seminal works with marker papers. In Proc. of the 8th Workshop on Bibliometric-enhanced Information Retrieval, pp. 27–38.
  • [14] K. Jaidka, M. K. Chandrasekaran, S. Rustagi, and M. Kan (2018) Insights from CL-SciSumm 2016: The faceted scientific document summarization shared task. International Journal on Digital Libraries 19 (2–3), pp. 163–171.
  • [15] J. Kim, J. R. Trippas, M. Sanderson, Z. Bao, and W. B. Croft (2019) How do computer scientists use Google Scholar?: A survey of user interest in elements on SERPs and author profile pages. In Proc. of the 8th Workshop on Bibliometric-enhanced Information Retrieval, pp. 64–75.
  • [16] J. Lamirel and P. Cuxac (2019) Feature selection and graph representation for an analysis of science fields evolution: An application to the digital library ISTEX. In Proc. of the 8th Workshop on Bibliometric-enhanced Information Retrieval, pp. 88–99.
  • [17] L. Leydesdorff and S. Milojević (2015) Scientometrics. In International Encyclopedia of the Social & Behavioral Sciences, J. D. Wright (Ed.), Vol. 21, pp. 322–327.
  • [18] B. Lykke Nielsen, S. Lavlund Skau, F. Meier, and B. Larsen (2019) Optimal citation context window sizes for biomedical retrieval. In Proc. of the 8th Workshop on Bibliometric-enhanced Information Retrieval, pp. 51–63.
  • [19] P. Mayr, M. K. Chandrasekaran, and K. Jaidka (Eds.) (2017) BIRNDL’17: Proceedings of the 2nd Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries co-located with the ACM SIGIR Conference (SIGIR 2017). Vol. 1888, CEUR-WS, Aachen.
  • [20] P. Mayr, M. K. Chandrasekaran, and K. Jaidka (Eds.) (2018) BIRNDL’18: Proceedings of the 3rd Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries co-located with the ACM SIGIR Conference (SIGIR 2018). Vol. 2132, CEUR-WS, Aachen.
  • [21] P. Mayr, I. Frommholz, G. Cabanac, M. K. Chandrasekaran, K. Jaidka, M. Kan, and D. Wolfram (2018) Special issue on bibliometric-enhanced information retrieval and natural language processing for digital libraries. International Journal on Digital Libraries 19 (2–3), pp. 107–111.
  • [22] P. Mayr, I. Frommholz, and G. Cabanac (Eds.) (2016) BIR’16: Proceedings of the 3rd Workshop on Bibliometric-enhanced Information Retrieval co-located with the 38th European Conference on Information Retrieval. Vol. 1567, CEUR-WS, Aachen.
  • [23] P. Mayr, I. Frommholz, and G. Cabanac (Eds.) (2017) BIR’17: Proceedings of the 5th Workshop on Bibliometric-enhanced Information Retrieval co-located with the 39th European Conference on Information Retrieval. Vol. 1823, CEUR-WS, Aachen.
  • [24] P. Mayr, I. Frommholz, and G. Cabanac (Eds.) (2018) BIR’18: Proceedings of the 7th Workshop on Bibliometric-enhanced Information Retrieval co-located with the 40th European Conference on Information Retrieval. Vol. 2080, CEUR-WS, Aachen.
  • [25] P. Mayr, I. Frommholz, and P. Mutschke (Eds.) (2015) BIR’15: Proceedings of the 2nd Workshop on Bibliometric-enhanced Information Retrieval co-located with the 37th European Conference on Information Retrieval. Vol. 1344, CEUR-WS, Aachen.
  • [26] P. Mayr, P. Schaer, A. Scharnhorst, B. Larsen, and P. Mutschke (Eds.) (2014) BIR’14: Proceedings of the 1st Workshop on Bibliometric-enhanced Information Retrieval co-located with the 36th European Conference on Information Retrieval. Vol. 1143, CEUR-WS, Aachen.
  • [27] P. Mayr and A. Scharnhorst (2015) Scientometrics and information retrieval: weak-links revitalized. Scientometrics 102 (3), pp. 2193–2199.
  • [28] J. Perier-Camby, M. Bertin, I. Atanassova, and F. Armetta (2019) A preliminary study to compare deep learning with rule-based approaches for citation classification. In Proc. of the 8th Workshop on Bibliometric-enhanced Information Retrieval, pp. 125–131.
  • [29] A. Pritchard (1969) Statistical bibliography or bibliometrics? [Documentation notes]. Journal of Documentation 25 (4), pp. 348–349.
  • [30] T. Saier and M. Färber (2019) Bibliometric-enhanced arXiv: A data set for paper-based and citation-based tasks. In Proc. of the 8th Workshop on Bibliometric-enhanced Information Retrieval, pp. 14–26.
  • [31] G. Salton (1963) Associative document retrieval techniques using bibliographic information. Journal of the ACM 10 (4), pp. 440–457.
  • [32] T. Shah and V. Pudi (2019) Mining intellectual influence associations. In Proc. of the 8th Workshop on Bibliometric-enhanced Information Retrieval, pp. 100–111.
  • [33] D. Shotton (2018) Funders should mandate open citations. Nature 553 (7687), pp. 129.
  • [34] A. Shvets (2019) Improving scientific article visibility by neural title simplification. In Proc. of the 8th Workshop on Bibliometric-enhanced Information Retrieval, pp. 140–147.
  • [35] A. Sinha, Z. Shen, Y. Song, H. Ma, D. Eide, B. Hsu, and K. Wang (2015) An overview of Microsoft Academic Service (MAS) and applications. In WWW’15: Proceedings of the 24th International Conference on World Wide Web, A. Gangemi, S. Leonardi, and A. Panconesi (Eds.), New York, NY, USA, pp. 243–246.
  • [36] J. Tang, J. Zhang, L. Yao, J. Li, L. Zhang, and Z. Su (2008) ArnetMiner: Extraction and mining of academic social networks. In KDD’08: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, pp. 990–998.
  • [37] I. Vagliano and S. Nazir (2019) Recommending multimedia educational resources on the MOVING platform. In Proc. of the 8th Workshop on Bibliometric-enhanced Information Retrieval, pp. 148–158.
  • [38] R. Van Noorden (2014) Google Scholar pioneer on search engine’s future. Nature.
  • [39] H. D. White and K. W. McCain (1998) Visualizing a discipline: An author co-citation analysis of Information Science, 1972–1995. Journal of the American Society for Information Science 49 (4), pp. 327–355.
  • [40] G. Wiggers and S. Verberne (2019) Citation metrics for legal information retrieval systems. In Proc. of the 8th Workshop on Bibliometric-enhanced Information Retrieval, pp. 39–50.
  • [41] K. Williams, J. Wu, S. R. Choudhury, M. Khabsa, and C. L. Giles (2014) Scholarly big data information extraction and integration in the CiteSeer digital library. In ICDE’14: Proceedings of the 30th IEEE International Conference on Data Engineering Workshops, pp. 68–73.