Noisy Channel Language Model Prompting for Few-Shot Text Classification

08/09/2021
by Sewon Min, et al.

We introduce a noisy channel approach for language model prompting in few-shot text classification. Instead of computing the likelihood of the label given the input (referred to as direct models), channel models compute the conditional probability of the input given the label, and are thereby required to explain every word in the input. We apply channel models to recently proposed few-shot learning methods that make no or very limited updates to the language model parameters, via either in-context demonstrations or prompt tuning. Our experiments show that, for both methods, channel models significantly outperform their direct counterparts, which we attribute to their stability, i.e., lower variance and higher worst-case accuracy. We also present extensive ablations that provide recommendations for when to use channel prompt tuning instead of other competitive models (e.g., direct head tuning): channel prompt tuning is preferred when the number of training examples is small, labels in the training data are imbalanced, or generalization to unseen labels is required.
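The difference between the two scoring directions can be illustrated with a minimal sketch. A direct model would score only the label given the input, P(y | x). A channel model scores every input word given the label, P(x | y), and picks the label that best explains the input. The code below is not the authors' implementation: it replaces the language model with a toy add-one-smoothed unigram model per label, and the names `TRAIN`, `build_unigram`, `log_p_input_given_label`, and `channel_predict`, along with the example sentences, are hypothetical and chosen only for illustration. A uniform prior over labels is also assumed.

```python
import math
from collections import Counter

# Hypothetical toy data: a few example texts per verbalized label.
# A real system would condition a pretrained language model on the label.
TRAIN = {
    "great": ["a moving and funny film", "funny and warm"],
    "terrible": ["a dull and boring film", "boring and slow"],
}

# Shared vocabulary, plus an <unk> entry for unseen words.
VOCAB = sorted(
    {w for texts in TRAIN.values() for t in texts for w in t.split()} | {"<unk>"}
)


def build_unigram(texts, vocab):
    """Add-one-smoothed unigram distribution: a stand-in for P(word | label)."""
    counts = Counter(w for t in texts for w in t.split())
    total = sum(counts.values()) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


MODELS = {label: build_unigram(texts, VOCAB) for label, texts in TRAIN.items()}


def log_p_input_given_label(text, label):
    """Channel score: sum of log P(word | label) over every word of the input."""
    model = MODELS[label]
    return sum(math.log(model.get(w, model["<unk>"])) for w in text.split())


def channel_predict(text, labels):
    """Pick the label that best explains the input.

    Assumes a uniform prior P(y), so argmax P(x | y) P(y) = argmax P(x | y).
    """
    return max(labels, key=lambda y: log_p_input_given_label(text, y))


if __name__ == "__main__":
    for sentence in ["a funny film", "a boring film"]:
        print(sentence, "->", channel_predict(sentence, ["great", "terrible"]))
```

Because the score sums over all input words, each word contributes to the decision, which is the sense in which a channel model must explain every word in the input.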


