C2C-GenDA: Cluster-to-Cluster Generation for Data Augmentation of Slot Filling

12/13/2020
by Yutai Hou, et al.

Slot filling, a fundamental module of spoken language understanding, often suffers from insufficient quantity and diversity of training data. To remedy this, we propose a novel Cluster-to-Cluster generation framework for Data Augmentation (DA), named C2C-GenDA. It enlarges the training set by reconstructing existing utterances into alternative expressions while preserving their semantics. Unlike previous DA works that reconstruct utterances one by one independently, C2C-GenDA jointly encodes multiple existing utterances of the same semantics and simultaneously decodes multiple unseen expressions. Jointly generating multiple new utterances allows the model to consider the relations between generated instances and encourages diversity. In addition, encoding multiple existing utterances gives C2C a wider view of existing expressions, helping to reduce generation that duplicates existing data. Experiments on the ATIS and Snips datasets show that instances augmented by C2C-GenDA improve slot filling by 7.99 (11.9 respectively, when there are only hundreds of training utterances.
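To make the cluster-to-cluster idea concrete, here is a minimal sketch of how training pairs for such a generator might be prepared: utterances sharing the same semantics are grouped into one cluster, and each cluster is split into a source cluster (to be encoded jointly) and a held-out target cluster (to be decoded jointly). The grouping key (intent plus sorted slot types), the function names `group_by_semantics` and `make_cluster_pairs`, and the `n_target` parameter are all assumptions made for illustration; the abstract does not specify how clusters are built, and this is not the authors' implementation.

```python
import random
from collections import defaultdict


def group_by_semantics(examples):
    """Group delexicalized utterances that share the same semantics.

    Assumption: "same semantics" is approximated here by an identical
    intent and an identical set of slot types.
    """
    clusters = defaultdict(list)
    for utterance, intent, slots in examples:
        key = (intent, tuple(sorted(slots)))
        clusters[key].append(utterance)
    return dict(clusters)


def make_cluster_pairs(clusters, n_target=1, seed=0):
    """Split each cluster into (source cluster, target cluster).

    The source utterances would be encoded jointly and the held-out target
    utterances decoded jointly. Clusters too small to split are skipped.
    """
    rng = random.Random(seed)
    pairs = []
    for key, utterances in clusters.items():
        if len(utterances) <= n_target:
            continue  # nothing would be left for the source side
        shuffled = list(utterances)
        rng.shuffle(shuffled)
        targets = shuffled[:n_target]
        sources = shuffled[n_target:]
        pairs.append((key, sources, targets))
    return pairs


# Hypothetical toy data: (delexicalized utterance, intent, slot types).
examples = [
    ("show me flights from <from_city> to <to_city>", "flight", ["from_city", "to_city"]),
    ("i need to fly from <from_city> to <to_city>", "flight", ["to_city", "from_city"]),
    ("what flights go from <from_city> to <to_city>", "flight", ["from_city", "to_city"]),
    ("play <song> by <artist>", "play_music", ["song", "artist"]),
]

clusters = group_by_semantics(examples)
pairs = make_cluster_pairs(clusters, n_target=1)
for key, sources, targets in pairs:
    print(key, "| sources:", len(sources), "| targets:", len(targets))
```

Because the source and target sides are disjoint, a model trained on such pairs is pushed to produce expressions that differ from the ones it was shown, which is the behavior the abstract describes. The single-utterance `play_music` cluster is skipped because it cannot be split.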


research
07/04/2018

Sequence-to-Sequence Data Augmentation for Dialogue Language Understanding

In this paper, we study the problem of data augmentation for language un...
research
08/19/2021

Augmenting Slot Values and Contexts for Spoken Language Understanding with Pretrained Models

Spoken Language Understanding (SLU) is one essential step in building a ...
research
11/04/2018

Elastic CRFs for Open-ontology Slot Filling

Slot filling is a crucial component in task-oriented dialog systems, whi...
research
09/17/2018

Robust Spoken Language Understanding via Paraphrasing

Learning intents and slot labels from user utterances is a fundamental s...
research
10/19/2022

Explainable Slot Type Attentions to Improve Joint Intent Detection and Slot Filling

Joint intent detection and slot filling is a key research topic in natur...
research
02/02/2021

Neural Data Augmentation via Example Extrapolation

In many applications of machine learning, certain categories of examples...
research
09/07/2023

Introducing "Forecast Utterance" for Conversational Data Science

Envision an intelligent agent capable of assisting users in conducting f...
