Towards Robust Neural Retrieval Models with Synthetic Pre-Training

04/15/2021
by   Revanth Gangi Reddy, et al.

Recent work has shown that commonly available machine reading comprehension (MRC) datasets can be used to train high-performance neural information retrieval (IR) systems. However, the evaluation of neural IR has so far been limited to standard supervised learning settings, where neural IR systems have outperformed traditional term-matching baselines. We conduct in-domain and out-of-domain evaluations of neural IR and seek to improve its robustness across different scenarios, including zero-shot settings. We show that synthetic training examples generated using a sequence-to-sequence generator can be effective toward this goal: in our experiments, pre-training with synthetic examples improves retrieval performance in both in-domain and out-of-domain evaluation on five different test sets.

research
12/02/2020

End-to-End QA on COVID-19: Domain Adaptation with Synthetic Training

End-to-end question answering (QA) requires both information retrieval (...
research
04/24/2022

Entity-Conditioned Question Generation for Robust Attention Distribution in Neural Information Retrieval

We show that supervised neural information retrieval (IR) models are pro...
research
08/16/2019

CFO: A Framework for Building Production NLP Systems

This paper introduces a novel orchestration framework, called CFO (COMPU...
research
07/28/2021

Domain-matched Pre-training Tasks for Dense Retrieval

Pre-training on larger datasets with ever increasing model size is now a...
research
10/24/2020

Improved Synthetic Training for Reading Comprehension

Automatically generated synthetic training examples have been shown to i...
research
05/10/2022

From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective

Neural retrievers based on dense representations combined with Approxima...
research
07/02/2023

BioCPT: Contrastive Pre-trained Transformers with Large-scale PubMed Search Logs for Zero-shot Biomedical Information Retrieval

Information retrieval (IR) is essential in biomedical knowledge acquisit...
