Reduce Catastrophic Forgetting of Dense Retrieval Training with Teleportation Negatives

10/31/2022
by Si Sun, et al.

In this paper, we investigate the instability of standard dense retrieval training, which iterates between model training and hard negative selection using the model being trained. We show the catastrophic forgetting phenomenon behind this instability: models learn and forget different negative groups across training iterations. We then propose ANCE-Tele, which accumulates momentum negatives from past iterations and approximates future iterations with lookahead negatives, using them as "teleportations" along the time axis to smooth the learning process. On web search and OpenQA, ANCE-Tele outperforms previous state-of-the-art systems of similar size, eliminates the dependency on sparse retrieval negatives, and is competitive with systems using significantly more (50x) parameters. Our analysis demonstrates that teleportation negatives reduce catastrophic forgetting and improve convergence speed for dense retrieval training. Our code is available at https://github.com/OpenMatch/ANCE-Tele.
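The core idea, pooling momentum negatives carried over from past iterations together with lookahead negatives mined from an approximation of the next iteration's model, can be illustrated with a minimal toy sketch. The drifting toy encoder, the one-step "lookahead", and all function names below are illustrative assumptions, not the ANCE-Tele implementation (see the linked repository for the actual code).

```python
# Toy sketch of "teleportation" negatives: pool hard negatives from past
# (momentum), current, and approximated-future (lookahead) retrievers so
# that earlier negative groups are not forgotten between iterations.
import numpy as np

rng = np.random.default_rng(0)
corpus = rng.normal(size=(1000, 64))     # toy passage embeddings
query_vec = rng.normal(size=64)          # toy query embedding

def encode_query(step):
    """Toy query encoder whose output drifts as training progresses."""
    drift = np.sin(np.arange(64) + step)
    return query_vec + 0.5 * drift

def mine_hard_negatives(step, top_k=10):
    """Top-k passages retrieved by the step-t encoder serve as hard negatives."""
    scores = corpus @ encode_query(step)
    return set(np.argsort(-scores)[:top_k].tolist())

momentum_negatives = set()               # accumulated from past iterations

for iteration in range(3):
    # Negatives mined with the model being trained at this iteration.
    current = mine_hard_negatives(step=iteration)

    # Lookahead negatives: mine with the encoder advanced one step,
    # approximating the next iteration's retriever.
    lookahead = mine_hard_negatives(step=iteration + 1)

    # Teleportation negatives = momentum (past) + current + lookahead (future).
    teleportation = momentum_negatives | current | lookahead
    print(f"iteration {iteration}: {len(teleportation)} teleportation negatives")

    # Carry the pool forward as momentum negatives for the next iteration.
    momentum_negatives = teleportation
```

In the real method, the pooled negatives feed a contrastive loss and negative mining alternates with encoder updates; the sketch only shows how pooling negatives along the time axis keeps previously learned negative groups in the training distribution.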


