CO-Search: COVID-19 Information Retrieval with Semantic Search, Question Answering, and Abstractive Summarization

06/17/2020
by Andre Esteva, et al.

The COVID-19 global pandemic has resulted in international efforts to understand, track, and mitigate the disease, yielding a significant corpus of COVID-19 and SARS-CoV-2-related publications across scientific disciplines. As of May 2020, 128,000 coronavirus-related publications had been collected through the COVID-19 Open Research Dataset Challenge. Here we present CO-Search, a retriever-ranker semantic search engine designed to handle complex queries over the COVID-19 literature, potentially aiding overburdened health workers in finding scientific answers during a time of crisis. The retriever is built from a Siamese-BERT encoder that is linearly composed with a TF-IDF vectorizer, and reciprocal-rank fused with a BM25 vectorizer. The ranker is composed of a multi-hop question-answering module that, together with a multi-paragraph abstractive summarizer, adjusts retriever scores. To account for the domain-specific and relatively limited dataset, we generate a bipartite graph of document paragraphs and citations, creating 1.3 million (citation title, paragraph) tuples for training the encoder. We evaluate our system on the data of the TREC-COVID information retrieval challenge. CO-Search obtains top performance on the datasets of the first and second rounds across several key metrics: normalized discounted cumulative gain, precision, mean average precision, and binary preference.
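
The abstract states that the retriever's rankings are reciprocal-rank fused with BM25 but does not give the fusion details. The following is a minimal sketch of generic reciprocal rank fusion, not the authors' implementation. The function name `reciprocal_rank_fusion`, the smoothing constant `k=60` (a commonly used default), and the document ids are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranked list.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank is 1-based. k=60 is an assumed default, not a value taken
    from the CO-Search abstract.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (highest first); break ties by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings: one from an embedding + TF-IDF retriever, one from BM25.
dense_tfidf_ranking = ["doc_a", "doc_b", "doc_c"]
bm25_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_tfidf_ranking, bm25_ranking])
print(fused)
```

Because the fusion uses only rank positions, it needs no score normalization across retrievers, which is one reason it is a common choice for combining lexical and embedding-based rankings.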

