A Method for Discovering and Extracting Author Contributions Information from Scientific Biomedical Publications

02/04/2018
by   Dominika Tkaczyk, et al.

Creating a scientific publication is a complex process, typically composed of a number of different activities, such as designing experiments, preparing data, programming software, and writing and editing the manuscript. Information about the contributions of individual authors of a paper is important for assessing authors' scientific achievements. Some publications in biomedical disciplines describe the authors' roles in a short section written in natural language, typically entitled "Authors' contributions". In this paper, we present an analysis of the roles that commonly appear in these sections, and propose an algorithm for automatically extracting authors' roles from natural language text in scientific publications. During the first part of the study, we used clustering techniques, as well as Open Information Extraction (OpenIE), to semi-automatically discover the most popular roles within a corpus of 2,000 contributions sections obtained from PubMed Central resources. The roles discovered by our approach include: experimenting (1,743 instances, 17% of the entire role set within the corpus), analysis (1,343, 16%), [...] ([...], 13%), [...] (823, 10%), [...] (351, 4%), [...] ([...], 0.5%). [...] automatically build a training set for the supervised role extractor, based on the Naive Bayes algorithm. According to the evaluation we performed, the proposed role extraction algorithm is able to extract roles from text with a precision of 0.71, a recall of 0.49, and an F1 score of 0.58.
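To make the supervised step more concrete, the following is a minimal sketch of a multinomial Naive Bayes text classifier that assigns a role label to a contribution phrase. It is not the authors' implementation: the abstract does not describe their features, tokenization, or how the training set was built, so the tokenizer, the class name `NaiveBayesRoleClassifier`, and the tiny training phrases are all hypothetical. Only the role names "experimenting" and "analysis" and the reported precision and recall come from the abstract. The `f1_score` helper applies the standard harmonic-mean definition of F1.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter (naive on purpose)."""
    return "".join(c if c.isalpha() else " " for c in text.lower()).split()


class NaiveBayesRoleClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing. Hypothetical helper."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)      # number of training phrases per role
        self.word_counts = defaultdict(Counter)  # per-role token frequencies
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training (e.g. author initials) are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical training phrases; the real training set was built automatically.
train_texts = [
    "performed the experiments",
    "carried out the experiments in the lab",
    "analyzed the data",
    "performed the statistical analysis of the data",
]
train_labels = ["experimenting", "experimenting", "analysis", "analysis"]

clf = NaiveBayesRoleClassifier().fit(train_texts, train_labels)
print(clf.predict("AB performed the experiments"))
print(clf.predict("CD analyzed the data"))
print(round(f1_score(0.71, 0.49), 2))
```

With the reported precision of 0.71 and recall of 0.49, the harmonic mean is 2 × 0.71 × 0.49 / (0.71 + 0.49) ≈ 0.58, which agrees with the F1 score stated in the abstract.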


research
12/15/2019

NaïveRole: Author-Contribution Extraction and Parsing from Biomedical Manuscripts

Information about the contributions of individual authors to scientific ...
research
11/15/2017

Understanding the Changing Roles of Scientific Publications via Citation Embeddings

Researchers may describe different aspects of past scientific publicatio...
research
12/08/2022

One for all and all for one: on the role of a conference in a scientist's life

The quantitative description of the scientific conference MECO (Middle E...
research
11/04/2022

SMAuC – The Scientific Multi-Authorship Corpus

The rapidly growing volume of scientific publications offers an interest...
research
12/19/2018

Assessing technical and cost efficiency of research activities: A case study of the Italian university system

This paper employs data envelopment analysis (DEA) to assess both techni...
research
08/31/2015

Ethnicity sensitive author disambiguation using semi-supervised learning

Author name disambiguation in bibliographic databases is the problem of ...
