SLEDGE: A Simple Yet Effective Baseline for Coronavirus Scientific Knowledge Search

05/05/2020
by   Sean MacAvaney, et al.

With worldwide concerns surrounding the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), there is a rapidly growing body of literature on the virus. Clinicians, researchers, and policy-makers need a way to effectively search these articles. In this work, we present a search system called SLEDGE, which uses SciBERT to re-rank articles. We train the model on a general-domain answer ranking dataset and transfer the relevance signals to SARS-CoV-2 for evaluation. SLEDGE proves to be a strong baseline on the TREC-COVID challenge, topping the leaderboard with an nDCG@10 of 0.6844. A detailed analysis suggests several directions for future work, including the importance of filtering by date and the potential of neural methods that rely more heavily on count signals. We release the code to facilitate future work on this critical task at https://github.com/Georgetown-IR-Lab/covid-neural-ir
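For readers unfamiliar with the reported metric, the following is a minimal sketch of how nDCG@10 can be computed for a single query. It is not taken from the SLEDGE code release: the function names `dcg_at_k` and `ndcg_at_k` are hypothetical, and the sketch assumes the linear-gain form with a log2 rank discount (some tools instead use an exponential gain of 2^rel - 1, which gives different values).

```python
import math


def dcg_at_k(gains, k=10):
    """Discounted cumulative gain over the first k graded relevance labels.

    Assumes linear gain: the label at 0-based position `rank` contributes
    gain / log2(rank + 2), so the top result is not discounted.
    """
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_gains, all_judged_gains, k=10):
    """nDCG@k for one query.

    ranked_gains: relevance labels of the documents in the order the
        system returned them (unjudged documents counted as 0).
    all_judged_gains: relevance labels of every judged document for the
        query, used to build the ideal ranking.
    """
    ideal = dcg_at_k(sorted(all_judged_gains, reverse=True), k)
    if ideal == 0:
        # No relevant documents were judged for this query.
        return 0.0
    return dcg_at_k(ranked_gains, k) / ideal


# Example: a system that places the only relevant document second.
score = ndcg_at_k([0, 2], [2, 0])
```

A score for a whole run, such as the one reported above, would then be the mean of the per-query values across all topics.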

research
10/12/2020

SLEDGE-Z: A Zero-Shot Baseline for COVID-19 Literature Search

With worldwide concerns surrounding the Severe Acute Respiratory Syndrom...
research
07/08/2015

Mining and Analyzing the Future Works in Scientific Articles

Future works in scientific articles are valuable for researchers and the...
research
06/30/2017

Co-PACRR: A Context-Aware Neural IR Model for Ad-hoc Retrieval

Neural IR models, such as DRMM and PACRR, have achieved strong results b...
research
05/21/2023

IR Models and the COVID-19 Pandemic: A Comparative Study of Performance and Challenges

This research study investigates the efficiency of different information...
research
07/08/2022

Lessons from Deep Learning applied to Scholarly Information Extraction: What Works, What Doesn't, and Future Directions

Understanding key insights from full-text scholarly articles is essentia...
research
11/03/2020

CMT in TREC-COVID Round 2: Mitigating the Generalization Gaps from Web to Special Domain Search

Neural rankers based on deep pretrained language models (LMs) have been ...
research
06/07/2023

Good Data, Large Data, or No Data? Comparing Three Approaches in Developing Research Aspect Classifiers for Biomedical Papers

The rapid growth of scientific publications, particularly during the COV...
