Data Augmentation for Intent Classification with Off-the-shelf Large Language Models

04/05/2022
by Gaurav Sahu, et al.

Data augmentation is a widely employed technique to alleviate the problem of data scarcity. In this work, we propose a prompting-based approach to generate labelled training data for intent classification with off-the-shelf language models (LMs) such as GPT-3. An advantage of this method is that no task-specific LM fine-tuning is required for data generation; hence the method requires no hyper-parameter tuning and is applicable even when the available training data is very scarce. We evaluate the proposed method in a few-shot setting on four diverse intent classification tasks. We find that GPT-generated data significantly boosts the performance of intent classifiers when the intents under consideration are sufficiently distinct from each other. In tasks with semantically close intents, we observe that the generated data is less helpful. Our analysis shows that this is because GPT often generates utterances that belong to a closely related intent instead of the desired one. We present preliminary evidence that a prompting-based GPT classifier could help filter the generated data to enhance its quality.
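As a rough illustration of the two ideas in the abstract (prompting an LM with a few labelled seed utterances, then filtering the generations with a classifier), the following is a minimal sketch. The prompt template, the function names `build_generation_prompt` and `filter_generated`, and the `classify` callable are all assumptions made for illustration; the abstract does not specify the prompt format or the filtering procedure, and no real LM API is called here.

```python
from typing import Callable, List


def build_generation_prompt(intent: str, seed_utterances: List[str]) -> str:
    """Format a few-shot prompt asking an LM for one more utterance of `intent`.

    The template below is a hypothetical one, not the paper's exact prompt.
    """
    lines = [f"The following sentences belong to the same category '{intent}':"]
    for i, utterance in enumerate(seed_utterances, start=1):
        lines.append(f"Example {i}: {utterance}")
    # Leave the next slot open so the LM completes it with a new utterance.
    lines.append(f"Example {len(seed_utterances) + 1}:")
    return "\n".join(lines)


def filter_generated(
    intent: str,
    candidates: List[str],
    classify: Callable[[str], str],
) -> List[str]:
    """Keep only the generated utterances that `classify` assigns to `intent`.

    `classify` stands in for any intent classifier (for example a
    prompting-based one); it is an assumed interface, not a library call.
    """
    return [c for c in candidates if classify(c) == intent]


if __name__ == "__main__":
    prompt = build_generation_prompt(
        "book_flight",
        ["I need a plane ticket to Paris", "Get me on a flight to Tokyo"],
    )
    print(prompt)

    # Toy stand-in classifier, used only to demonstrate the filtering step.
    def toy_classify(text: str) -> str:
        return "book_flight" if "flight" in text.lower() else "cancel_flight_or_other"

    kept = filter_generated(
        "book_flight",
        ["Book a flight to Rome", "Please refund my ticket"],
        toy_classify,
    )
    print(kept)
```

In practice the prompt would be sent to an off-the-shelf LM and its completions collected as candidate utterances; the filter step is the place where utterances that drift into a closely related intent could be discarded.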


research
05/11/2023

Exploring Zero and Few-shot Techniques for Intent Classification

Conversational NLU providers often need to scale to thousands of intent-...
research
05/18/2022

PromptDA: Label-guided Data Augmentation for Prompt-based Few Shot Learners

Recent advances on large pre-trained language models (PLMs) lead impress...
research
04/18/2021

GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation

Large-scale language models such as GPT-3 are excellent few-shot learner...
research
08/25/2023

ChatGPT as Data Augmentation for Compositional Generalization: A Case Study in Open Intent Detection

Open intent detection, a crucial aspect of natural language understandin...
research
09/13/2021

Effectiveness of Pre-training for Few-shot Intent Classification

This paper investigates the effectiveness of pre-training for few-shot i...
research
05/21/2023

Automated Few-shot Classification with Instruction-Finetuned Language Models

A particularly successful class of approaches for few-shot learning comb...
research
02/21/2022

A new data augmentation method for intent classification enhancement and its application on spoken conversation datasets

Intent classifiers are vital to the successful operation of virtual agen...
