LitMC-BERT: transformer-based multi-label classification of biomedical literature with an application on COVID-19 literature curation

04/19/2022
by   Qingyu Chen, et al.
The rapid growth of biomedical literature poses a significant challenge for curation and interpretation. This has become more evident during the COVID-19 pandemic. LitCovid, a literature database of COVID-19-related papers in PubMed, has accumulated over 180,000 articles with millions of accesses. Approximately 10,000 new articles are added to LitCovid every month. A main curation task in LitCovid is topic annotation, where an article is assigned up to eight topics, e.g., Treatment and Diagnosis. The annotated topics have been widely used both in LitCovid (e.g., accounting for about 18% [...]) and in downstream studies such as network generation. However, topic annotation has been a primary curation bottleneck due to the nature of the task and the rapid literature growth. This study proposes LitMC-BERT, a transformer-based multi-label classification method for biomedical literature. It uses a shared transformer backbone for all the labels while also capturing label-specific features and the correlations between label pairs. We compare LitMC-BERT with three baseline models on two datasets. Its micro-F1 and instance-based F1 are 5% and [...] higher than the current best results, respectively, and it requires only about 18% of the [...] time of the Binary BERT baseline. The related datasets and models are available via https://github.com/ncbi/ml-transformer.
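
The abstract reports two multi-label metrics, micro-F1 and instance-based F1. The sketch below shows how these are commonly computed over sets of labels; it is not taken from the authors' repository. The function names, the example label sets, and the convention of scoring an article as 1.0 when both its gold and predicted sets are empty are assumptions made for illustration.

```python
from typing import List, Set


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Pool true/false positives and false negatives over all articles and labels."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but not in gold
        fn += len(g - p)   # gold labels that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def instance_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Compute F1 per article, then average over articles."""
    scores = []
    for g, p in zip(gold, pred):
        denom = len(g) + len(p)
        # Assumption: an article with no gold and no predicted labels scores 1.0.
        scores.append(2 * len(g & p) / denom if denom else 1.0)
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Illustrative topic assignments for three articles (not real LitCovid data).
    gold = [{"Treatment", "Diagnosis"}, {"Prevention"}, {"Treatment"}]
    pred = [{"Treatment"}, {"Prevention", "Transmission"}, {"Diagnosis"}]
    print("micro-F1:", micro_f1(gold, pred))
    print("instance-based F1:", instance_f1(gold, pred))
```

Micro-F1 weights every label decision equally, so frequent topics dominate it, whereas instance-based F1 weights every article equally regardless of how many topics it carries.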

research
04/14/2022

Multi-label topic classification for COVID-19 literature with Bioformer

We describe Bioformer team's participation in the multi-label topic clas...
research
11/10/2021

BagBERT: BERT-based bagging-stacking for multi-topic classification

This paper describes our submission on the COVID-19 literature annotatio...
research
06/25/2021

Domain-Specific Pretraining for Vertical Search: Case Study on Biomedical Literature

Information overload is a prevalent challenge in many high-value domains...
research
06/15/2020

Document Classification for COVID-19 Literature

The global pandemic has made it more important than ever to quickly and ...
research
05/11/2020

On the Generation of Medical Dialogues for COVID-19

Under the pandemic of COVID-19, people experiencing COVID19-related symp...
research
05/12/2021

Priberam at MESINESP Multi-label Classification of Medical Texts Task

Medical articles provide current state of the art treatments and diagnos...
