Content-based subject classification at article level in biomedical context

04/30/2021
by Eric Jeangirard, et al.

Subject classification is an important task in the analysis of scholarly publications. Two main kinds of approaches are generally used: classification at the journal level and classification at the article level. We propose a mixed approach that leverages embedding techniques from NLP to train classifiers on article metadata (in particular title, abstract, and keywords) labelled with the journal-level FoR (Fields of Research) classification, and then applies these classifiers at the article level. We use this approach in the context of biomedical publications, using metadata from PubMed. fastText classifiers are trained with FoR codes and used to classify publications based on their available metadata. Results show that using a stratified sampling strategy for training helps reduce the bias due to the unbalanced distribution of fields. An implementation of the method is available in the repository https://github.com/dataesr/scientific_tagger
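The sampling idea in the abstract can be illustrated with a minimal sketch. It is not taken from the scientific_tagger repository: the record layout, the placeholder labels `FOR_A` and `FOR_B`, the helper names, and the choice of drawing an equal number of articles per field are all assumptions made for illustration. The only fastText-specific detail used is the `__label__` prefix that its supervised mode expects at the start of each training line.

```python
import random
from collections import defaultdict


def stratified_sample(records, per_label, seed=0):
    """Draw at most `per_label` records for each label (hypothetical helper)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for rec in records:
        by_label[rec["label"]].append(rec)
    sample = []
    for label in sorted(by_label):
        group = by_label[label]
        # Cap each field at the same size so frequent fields do not dominate.
        sample.extend(rng.sample(group, min(per_label, len(group))))
    return sample


def to_fasttext_line(rec):
    """Join title, abstract and keywords into one '__label__X text' line."""
    parts = [rec["title"], rec["abstract"], " ".join(rec["keywords"])]
    text = " ".join(" ".join(parts).lower().split())
    return f"__label__{rec['label']} {text}"


# Toy, made-up records: field FOR_A is four times more frequent than FOR_B.
records = [
    {"label": "FOR_A", "title": f"Title {i}", "abstract": "Some abstract",
     "keywords": ["kw1", "kw2"]}
    for i in range(8)
] + [
    {"label": "FOR_B", "title": f"Other {i}", "abstract": "Another abstract",
     "keywords": ["kw3"]}
    for i in range(2)
]

balanced = stratified_sample(records, per_label=2)
lines = [to_fasttext_line(rec) for rec in balanced]
```

Written one per line to a text file, `lines` would be in the input format accepted by `fasttext.train_supervised(input=...)`. Capping every field at the same size is only one possible stratification; the abstract does not specify the exact scheme used.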


