Bidirectional Language Models Are Also Few-shot Learners

09/29/2022
by Ajay Patel, et al.

Large language models such as GPT-3 (Brown et al., 2020) can perform arbitrary tasks without undergoing fine-tuning after being prompted with only a few labeled examples. An arbitrary task can be reformulated as a natural language prompt, and a language model can be asked to generate the completion, indirectly performing the task in a paradigm known as prompt-based learning. To date, emergent prompt-based learning capabilities have mainly been demonstrated for unidirectional language models. However, bidirectional language models pre-trained on denoising objectives such as masked language modeling produce stronger learned representations for transfer learning. This motivates the possibility of prompting bidirectional models, but their pre-training objectives have made them largely incompatible with the existing prompting paradigm. We present SAP (Sequential Autoregressive Prompting), a technique that enables the prompting of bidirectional models. Utilizing the machine translation task as a case study, we prompt the bidirectional mT5 model (Xue et al., 2021) with SAP and demonstrate its few-shot and zero-shot translations outperform the few-shot translations of unidirectional models like GPT-3 and XGLM (Lin et al., 2021), despite mT5's approximately 50% fewer parameters. We further show SAP is effective on question answering and summarization. For the first time, our results demonstrate prompt-based learning is an emergent property of a broader class of language models, rather than only unidirectional models.
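
To make the core idea concrete, here is a minimal sketch of sequential prompting of a span-infilling model: a sentinel token is placed at the end of the prompt, the model fills the masked span, a few generated tokens are appended, and the model is re-prompted, so the bidirectional model generates left-to-right like a unidirectional LM. This is an illustration under assumptions (Hugging Face transformers API; the model size, prompt format, tokens-per-step, and stopping heuristic are our choices), not the authors' exact SAP recipe.

```python
# Sketch: sequential autoregressive prompting of a bidirectional
# (span-infilling) model. Assumptions: Hugging Face transformers,
# an mT5 checkpoint, and an illustrative few-shot translation prompt.
from transformers import MT5ForConditionalGeneration, MT5Tokenizer

tokenizer = MT5Tokenizer.from_pretrained("google/mt5-xl")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-xl")

prompt = (
    "Translate English to French.\n"
    "English: The cat sleeps. French: Le chat dort.\n"
    "English: I like green tea. French:"
)

completion = ""
for _ in range(16):  # fixed step budget for this sketch
    # mT5 was pre-trained to fill masked spans marked by sentinel tokens,
    # so we place <extra_id_0> where the next few tokens should go.
    inputs = tokenizer(prompt + completion + " <extra_id_0>",
                       return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=5)
    # skip_special_tokens drops the pad and sentinel tokens, leaving
    # only the text the model proposed for the masked span.
    step = tokenizer.decode(output_ids[0], skip_special_tokens=True).strip()
    if not step or step.startswith("English:"):  # crude stopping heuristic
        break
    completion += " " + step

print(completion.strip())
```

The key point the sketch illustrates is that each call only asks the model to do what its denoising pre-training taught it (fill a masked span); chaining such calls recovers autoregressive generation from a model that was never trained for it.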



Related research

10/11/2021
Unsupervised Neural Machine Translation with Generative Language Models Only
We show how to derive state-of-the-art unsupervised neural machine trans...

02/02/2022
Co-training Improves Prompt-based Learning for Large Language Models
We demonstrate that co-training (Blum & Mitchell, 1998) can improve th...

04/14/2021
Learning How to Ask: Querying LMs with Mixtures of Soft Prompts
Natural-language prompts have recently been used to coax pretrained lang...

05/24/2022
On the Role of Bidirectionality in Language Model Pre-Training
Prior work on language model pre-training has explored different archite...

02/15/2021
Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm
Prevailing methods for mapping large generative language models to super...

07/28/2021
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
This paper surveys and organizes research works in a new paradigm in nat...

11/28/2022
Arguments to Key Points Mapping with Prompt-based Learning
Handling and digesting a huge amount of information in an efficient mann...
