Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners

10/06/2022
by Seonghyeon Ye, et al.

Meta-training, which fine-tunes the language model (LM) on various downstream tasks by maximizing the likelihood of the target label given the task instruction and input instance, has improved zero-shot task generalization performance. However, meta-trained LMs still struggle to generalize to challenging tasks containing novel labels unseen during meta-training. In this paper, we propose Flipped Learning, an alternative method of meta-training which trains the LM to generate the task instruction given the input instance and label. During inference, the LM trained with Flipped Learning, referred to as Flipped, selects the label option that is most likely to generate the task instruction. On 14 tasks of the BIG-bench benchmark, the 11B-sized Flipped outperforms zero-shot T0-11B and even a 16-times-larger 3-shot GPT-3 (175B) by 8.4% and 9.7% points on average, respectively. Flipped gives particularly large improvements on tasks with unseen labels, outperforming T0-11B by up to +20% average F1 score. This indicates that the strong task generalization of Flipped comes from improved generalization to novel labels. We release our code at https://github.com/seonghyeonye/Flipped-Learning.
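The inference procedure described in the abstract can be made concrete with a short sketch. The snippet below scores each label option by the log-likelihood an encoder-decoder LM assigns to the task instruction given the input instance concatenated with that label, then returns the highest-scoring label. It uses the Hugging Face Transformers API; the checkpoint name and the plain-string concatenation of input and label are illustrative assumptions, not the released Flipped setup (the official code and checkpoints live in the repository linked above).

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Stand-in checkpoint: any T5-family seq2seq LM works for this sketch;
# the actual Flipped checkpoints are released via the repo linked above.
MODEL_NAME = "bigscience/T0_3B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME).eval()

def instruction_log_likelihood(source: str, instruction: str) -> float:
    """Total log P(instruction | source) under the seq2seq LM."""
    enc = tokenizer(source, return_tensors="pt")
    dec = tokenizer(instruction, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=dec.input_ids)
    # out.loss is the mean per-token negative log-likelihood of the
    # instruction tokens; negate and scale by length to get the total.
    return -out.loss.item() * dec.input_ids.shape[1]

def flipped_predict(input_instance: str, instruction: str, label_options) -> str:
    """Pick the label whose (input, label) pair best explains the instruction.

    The plain-string concatenation below is an assumption for illustration;
    the released code defines its own input/label templates.
    """
    scores = {
        label: instruction_log_likelihood(f"{input_instance} {label}", instruction)
        for label in label_options
    }
    return max(scores, key=scores.get)

print(flipped_predict(
    input_instance="The movie was a delightful surprise from start to finish.",
    instruction="Is the sentiment of this review positive or negative?",
    label_options=["positive", "negative"],
))

Note that the paper additionally trains with an unlikelihood objective that pushes the model away from generating the instruction for incorrect labels; that training-side detail is omitted from this inference-only sketch.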


Related research:

05/23/2023 · The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
Large Language Models (LLMs) have shown enhanced capabilities of solving...

10/06/2022 · Retrieval of Soft Prompt Enhances Zero-Shot Task Generalization
During zero-shot inference with language models (LMs), using hard prompt...

10/15/2021 · Multitask Prompted Training Enables Zero-Shot Task Generalization
Large language models have recently been shown to attain reasonable zero...

02/07/2023 · Exploring the Benefits of Training Expert Language Models over Instruction Tuning
Recently, Language Models (LMs) instruction-tuned on multiple tasks, als...

07/31/2023 · Camoscio: an Italian Instruction-tuned LLaMA
In recent years Large Language Models (LLMs) have increased the state of...

04/10/2021 · Meta-tuning Language Models to Answer Prompts Better
Large pretrained language models like GPT-3 have acquired a surprising a...

05/27/2023 · Augmentation-Adapted Retriever Improves Generalization of Language Models as Generic Plug-In
Retrieval augmentation can aid language models (LMs) in knowledge-intens...
