Navigating the landscape of COVID-19 research through literature analysis: A bird's eye view

08/07/2020
by Lana Yeganova, et al.

Timely access to accurate scientific literature is critical in the battle against the ongoing COVID-19 pandemic. This unprecedented public health risk has motivated research towards understanding the disease in general, identifying drugs to treat it, developing potential vaccines, etc. The result is a rapidly growing body of literature that, as of May 2020, doubles in number of publications every 20 days. Providing medical professionals with the means to quickly analyze the literature and discover growing areas of knowledge is necessary for addressing their questions and information needs. In this study we analyze the LitCovid collection, 13,369 COVID-19-related articles found in PubMed as of May 15th, 2020, with the purpose of examining the landscape of the literature and presenting it in a format that facilitates information navigation and understanding. We do that by applying state-of-the-art named entity recognition (NER), classification, clustering and other NLP techniques. By applying NER tools, we capture relevant bioentities (such as diseases, internal body organs, etc.) and assess the strength of their relationship with COVID-19 by the extent to which they are discussed in the corpus. We also collect a variety of symptoms and co-morbidities discussed in reference to COVID-19. Our clustering algorithm identifies topics represented by groups of related terms, and computes clusters corresponding to documents associated with the topic terms. Among the topics we observe several that persist over multiple weeks and have numerous associated documents, as well as several that appear as emerging topics with fewer documents. All the tools and data are publicly available, and this framework can be applied to any literature collection. Taken together, these analyses produce a comprehensive, synthesized view of COVID-19 research to facilitate knowledge discovery from the literature.
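
As a rough illustration of assessing an entity's relationship with COVID-19 "by the extent" it is discussed in the corpus, the sketch below ranks entity terms by the fraction of documents that mention them. This is not the authors' pipeline: it assumes the entity terms are already given as a plain list (standing in for real NER output), uses simple whole-word matching, and the function names, toy documents, and toy terms are all hypothetical.

```python
import re
from collections import Counter


def entity_document_frequency(documents, entity_terms):
    """Count, for each entity term, how many documents mention it at least once."""
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        for term in entity_terms:
            # Whole-word, case-insensitive match; a document counts once per term.
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            if re.search(pattern, lowered):
                counts[term] += 1
    return counts


def rank_entities(documents, entity_terms):
    """Return (term, fraction of documents mentioning it), most frequent first."""
    total = len(documents)
    if total == 0:
        return []
    counts = entity_document_frequency(documents, entity_terms)
    ranked = [(term, counts[term] / total) for term in entity_terms]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Toy corpus invented for illustration only (not taken from LitCovid).
toy_docs = [
    "Pneumonia and fever were common in patients with lung involvement.",
    "Fever and cough were reported; kidney injury was rare.",
    "Lung imaging showed pneumonia in most cases.",
]
# Hypothetical entity list standing in for the output of an NER tool.
toy_terms = ["fever", "pneumonia", "lung", "kidney"]

ranking = rank_entities(toy_docs, toy_terms)
for term, fraction in ranking:
    print(f"{term}: mentioned in {fraction:.0%} of documents")
```

Document frequency is only one possible proxy for strength of association. A real analysis would start from recognized entities rather than a fixed term list and would likely normalize synonyms before counting.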

