In-Context Learning for Text Classification with Many Labels

09/19/2023
by Aristides Milios, et al.

In-context learning (ICL) using large language models for tasks with many labels is challenging due to the limited context window, which makes it difficult to fit a sufficient number of examples in the prompt. In this paper, we use a pre-trained dense retrieval model to bypass this limitation, giving the model only a partial view of the full label space for each inference call. Testing with recent open-source LLMs (OPT, LLaMA), we set new state-of-the-art performance in few-shot settings for three common intent classification datasets, with no fine-tuning. We also surpass fine-tuned performance on fine-grained sentiment classification in certain cases. We analyze performance across the number of in-context examples and different model scales, showing that larger models are necessary to effectively and consistently make use of longer context lengths for ICL. By running several ablations, we analyze the model's use of: a) the similarity of the in-context examples to the current input, b) the semantic content of the class names, and c) the correct correspondence between examples and labels. We demonstrate that all three are needed to varying degrees depending on the domain, contrary to certain recent works.
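The retrieval-based prompting idea above can be sketched in a few lines: retrieve the k labeled examples most similar to the test input, then assemble a few-shot prompt from only those examples (so the LLM sees just a slice of the label space per call). This is a minimal illustration, not the paper's implementation; the paper uses a pre-trained dense retriever, whereas here a toy bag-of-words cosine similarity stands in, and the example pool and label names are invented for illustration.

```python
import math
from collections import Counter


def embed(text):
    # Toy bag-of-words "embedding"; the paper instead uses a
    # pre-trained dense retrieval model to encode texts.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query, pool, k):
    # Rank labeled examples by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(pool, key=lambda ex: cosine(q, embed(ex["text"])),
                    reverse=True)
    return ranked[:k]


def build_prompt(query, pool, k):
    # Only the retrieved examples (and hence only their labels)
    # appear in the prompt sent to the LLM.
    shots = retrieve(query, pool, k)
    blocks = [f"Text: {ex['text']}\nLabel: {ex['label']}" for ex in shots]
    blocks.append(f"Text: {query}\nLabel:")
    return "\n\n".join(blocks)


# Hypothetical intent-classification pool for demonstration.
pool = [
    {"text": "what is my account balance", "label": "balance"},
    {"text": "transfer money to savings", "label": "transfer"},
    {"text": "check balance on my card", "label": "balance"},
    {"text": "report a lost card", "label": "lost_card"},
]

prompt = build_prompt("how much balance do I have", pool, k=2)
print(prompt)
```

With k=2, only the two balance-related examples are retrieved, so the prompt exposes just one label out of the full set; the LLM would then be asked to complete the final `Label:` line.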


