Revisit Few-shot Intent Classification with PLMs: Direct Fine-tuning vs. Continual Pre-training

06/08/2023
by   Haode Zhang, et al.

We consider the task of few-shot intent detection, which involves training a deep learning model to classify utterances by their underlying intents using only a small amount of labeled data. The prevailing approach to this problem is continual pre-training, i.e., fine-tuning pre-trained language models (PLMs) on external resources (e.g., conversational corpora, public intent detection datasets, or natural language understanding datasets) before using them as utterance encoders for training an intent classifier. In this paper, we show that continual pre-training may not be essential, since the overfitting problem of PLMs on this task may not be as serious as expected. Specifically, we find that directly fine-tuning PLMs on only a handful of labeled examples already yields decent results compared to methods that employ continual pre-training, and the performance gap diminishes rapidly as the amount of labeled data increases. To make the most of the limited available data, we propose a context augmentation method and leverage sequential self-distillation to boost performance. Comprehensive experiments on real-world benchmarks show that given only two or more labeled samples per class, direct fine-tuning outperforms many strong baselines that utilize external data sources for continual pre-training. The code can be found at https://github.com/hdzhang-code/DFTPlus.
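The self-distillation step mentioned above is commonly implemented as a weighted combination of the usual cross-entropy loss and a KL-divergence term that pulls the student's temperature-softened predictions toward those of the previous-generation model (the teacher). The sketch below shows this generic objective in pure Python; the weighting, temperature, and exact loss form are illustrative assumptions, not necessarily the authors' implementation (see the linked repository for that).

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over one example's logits."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Generic self-distillation objective (illustrative, not the paper's exact loss):
        alpha * CE(student, gold labels)
        + (1 - alpha) * T^2 * KL(teacher_soft || student_soft)
    In sequential self-distillation, the teacher is the model trained in the
    previous generation, and this loss trains the next-generation student.
    """
    ce_total, kl_total = 0.0, 0.0
    for s, t, y in zip(student_logits, teacher_logits, labels):
        # Hard-label cross-entropy on the student's (T=1) predictions.
        p_s = softmax(s)
        ce_total += -math.log(p_s[y] + 1e-12)
        # KL divergence between softened teacher and student distributions.
        p_t_soft = softmax(t, T)
        p_s_soft = softmax(s, T)
        kl_total += sum(pt * (math.log(pt + 1e-12) - math.log(ps + 1e-12))
                        for pt, ps in zip(p_t_soft, p_s_soft))
    n = len(labels)
    return alpha * ce_total / n + (1 - alpha) * (T ** 2) * kl_total / n
```

With `alpha=1.0` this reduces to plain cross-entropy, and when teacher and student logits coincide the KL term vanishes, so repeated generations can only refine the soft targets rather than destabilize training.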


