Content-based subject classification at article level in biomedical context

04/30/2021
by Eric Jeangirard, et al.

Subject classification is an important task in the analysis of scholarly publications. Two main kinds of approaches are generally used: classification at the journal level and classification at the article level. We propose a mixed approach that leverages NLP embedding techniques to train classifiers on article metadata (in particular title, abstract, and keywords) labelled with the journal-level FoR (Fields of Research) classification, and then applies these classifiers at the article level. We use this approach in the context of biomedical publications, using metadata from PubMed. fastText classifiers are trained with FoR codes and used to classify publications based on their available metadata. Results show that using a stratified sampling strategy for training helps reduce the bias due to the unbalanced distribution of fields. An implementation of the method is available in the repository https://github.com/dataesr/scientific_tagger
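
As a rough illustration of the stratified sampling idea, the following minimal sketch caps the number of training examples drawn per field and writes each example in the `__label__<label> <text>` format that fastText expects for supervised training. It is not the implementation from the repository: the function names `stratified_sample` and `to_fasttext_line`, the `per_label` cap, the label strings, and the toy records are all assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_sample(records, per_label, seed=0):
    """Draw at most `per_label` (label, text) records for each label.

    Hypothetical helper: capping every label at the same size is one simple
    way to limit the weight of over-represented fields in the training set.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for label, text in records:
        by_label[label].append(text)

    sample = []
    for label in sorted(by_label):
        texts = by_label[label]
        k = min(per_label, len(texts))  # small fields keep all their records
        for text in rng.sample(texts, k):
            sample.append((label, text))
    return sample


def to_fasttext_line(label, text):
    """Format one example as a fastText supervised training line."""
    clean = " ".join(text.split())  # collapse newlines and repeated spaces
    return f"__label__{label} {clean}"


# Toy records with made-up labels (not real FoR codes) and made-up titles.
records = [
    ("clinical", "Randomized trial of a statin in older adults"),
    ("clinical", "Outcomes after hip replacement surgery"),
    ("clinical", "Blood pressure control in primary care"),
    ("clinical", "Antibiotic use in paediatric wards"),
    ("genetics", "Genome-wide association study of height"),
    ("genetics", "Gene expression in yeast under stress"),
    ("public_health", "Vaccination coverage in rural districts"),
]

balanced = stratified_sample(records, per_label=2)
lines = [to_fasttext_line(label, text) for label, text in balanced]
```

Under these assumptions, `lines` could be written to a text file, one example per line, and used as the training input of a fastText supervised classifier. The over-represented `clinical` field contributes two examples instead of four, while the smallest field keeps its single record.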

