Augmentation-Adapted Retriever Improves Generalization of Language Models as Generic Plug-In

05/27/2023
by Zichun Yu, et al.

Retrieval augmentation can aid language models (LMs) in knowledge-intensive tasks by supplying them with external information. Prior work on retrieval augmentation usually fine-tunes the retriever and the LM jointly, which couples them closely. In this paper, we explore the scheme of a generic retrieval plug-in: the retriever assists target LMs that may not be known beforehand or cannot be fine-tuned together with it. To retrieve useful documents for unseen target LMs, we propose the augmentation-adapted retriever (AAR), which learns LM preferences obtained from a known source LM. Experiments on the MMLU and PopQA datasets demonstrate that AAR trained with a small source LM significantly improves the zero-shot generalization of larger target LMs ranging from 250M Flan-T5 to 175B InstructGPT. Further analysis indicates that the preferences of different LMs overlap, enabling AAR trained with a single source LM to serve as a generic plug-in for various target LMs. Our code is open-sourced at https://github.com/OpenMatch/Augmentation-Adapted-Retriever.
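
To make the plug-in idea concrete, here is a minimal sketch of a retriever that is decoupled from the LM it serves. It is an illustration only, not the AAR implementation: the bag-of-words scorer stands in for a trained retriever, and every name in it (`retrieve`, `augment`, `answer_with_plugin`, the prompt layout) is hypothetical. The only property it is meant to show is that the target LM is an arbitrary callable that the retriever never fine-tunes or inspects.

```python
import math
from collections import Counter
from typing import Callable, List


def bow(text: str) -> Counter:
    """Lower-cased bag-of-words vector (toy stand-in for a learned encoder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Return the k documents most similar to the query."""
    q = bow(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, bow(doc)), reverse=True)
    return ranked[:k]


def augment(query: str, docs: List[str]) -> str:
    """Prepend retrieved documents to the query (assumed prompt layout)."""
    context = "\n".join(docs)
    return f"{context}\n\nQuestion: {query}\nAnswer:"


def answer_with_plugin(
    query: str,
    corpus: List[str],
    target_lm: Callable[[str], str],
    k: int = 2,
) -> str:
    """Run any target LM on a retrieval-augmented prompt.

    The retriever never touches the LM's weights, so target_lm can be
    swapped freely, which is the sense of "plug-in" used here.
    """
    return target_lm(augment(query, retrieve(query, corpus, k)))
```

Because `target_lm` is just a function from prompt to text, the same retriever can sit in front of different models without retraining. Learning which documents a source LM actually prefers, which is what the abstract attributes to AAR, is outside the scope of this sketch.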

research
10/06/2022

Retrieval of Soft Prompt Enhances Zero-Shot Task Generalization

During zero-shot inference with language models (LMs), using hard prompt...
research
02/07/2023

Augmenting Zero-Shot Dense Retrievers with Plug-in Mixture-of-Memories

In this paper we improve the zero-shot generalization ability of languag...
research
10/11/2022

Retrieval Augmentation for T5 Re-ranker using External Sources

Retrieval augmentation has shown promising improvements in different tas...
research
02/07/2022

To Tune or Not To Tune? Zero-shot Models for Legal Case Entailment

There has been mounting evidence that pretrained language models fine-tu...
research
10/06/2022

Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners

Meta-training, which fine-tunes the language model (LM) on various downs...
research
04/19/2023

Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agent

Large Language Models (LLMs) have demonstrated a remarkable ability to g...
research
02/23/2023

On the Generalization Ability of Retrieval-Enhanced Transformers

Recent work on the Retrieval-Enhanced Transformer (RETRO) model has show...
