
A Study on FGSM Adversarial Training for Neural Retrieval

by Simon Lupart, et al.

Neural retrieval models have achieved significant effectiveness gains over term-based methods in recent years. Nevertheless, these models can be brittle when faced with typos or distribution shifts, and vulnerable to malicious attacks. For instance, several recent papers demonstrated that such variations severely degrade model performance, and then attempted to train more resilient models. Usual approaches include synonym replacement or typo injection as data augmentation, and the use of more robust tokenizers (CharacterBERT, BPE-dropout). To further complement the literature, we investigate in this paper adversarial training as another possible solution to this robustness issue. Our comparison includes the two main families of BERT-based neural retrievers, i.e. dense and sparse, with and without distillation techniques. We then demonstrate that one of the simplest adversarial training techniques, the Fast Gradient Sign Method (FGSM), can improve the robustness and effectiveness of first-stage rankers. In particular, FGSM improves model performance on both in-domain and out-of-domain distributions, as well as on queries with typos, for multiple neural retrievers.
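The abstract's core idea can be illustrated with a minimal sketch of FGSM applied to retrieval embeddings. FGSM takes one gradient step of size ε in the direction of the sign of the loss gradient, producing a worst-case perturbation under an ℓ_∞ budget; during adversarial training, the model is also optimized on these perturbed inputs. The toy dense-retriever scoring below (dot product between query and document vectors) and the function name `fgsm_perturb` are illustrative assumptions, not the paper's implementation.

```python
import torch

def fgsm_perturb(embeddings, loss, epsilon=0.01):
    """Return embeddings shifted by epsilon in the sign of the loss gradient,
    i.e. the direction that (locally) increases the loss the most under an
    l_inf constraint of size epsilon."""
    grad, = torch.autograd.grad(loss, embeddings)
    return embeddings + epsilon * grad.sign()

# Toy setup: a hypothetical dense retriever scores a query against a relevant
# document by dot product; the adversarial perturbation pushes the query
# embedding away from the document, lowering the relevance score.
torch.manual_seed(0)
query = torch.randn(1, 8, requires_grad=True)
doc = torch.randn(1, 8)
score = (query * doc).sum()
loss = -score  # maximizing this loss means minimizing the relevance score
adv_query = fgsm_perturb(query, loss, epsilon=0.05)
```

In adversarial training, the loss on `adv_query` would be added to (or mixed with) the clean training loss, so the retriever learns representations that remain stable under such small embedding-space perturbations.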



