Entity Aware Syntax Tree Based Data Augmentation for Natural Language Understanding

09/06/2022
by Jiaxing Xu, et al.

Natural language understanding (NLU), which covers understanding users' intentions and recognizing the semantic entities in their sentences, is an upstream step for many natural language processing tasks. One of its main challenges is collecting enough annotated data to train a model. Existing text augmentation research pays little attention to entities and therefore performs poorly on NLU tasks. To solve this problem, we propose a novel NLP data augmentation technique, Entity Aware Data Augmentation (EADA), which represents sentences with a tree structure, the Entity Aware Syntax Tree (EAST), that gives explicit attention to entities. EADA automatically constructs an EAST from a small amount of annotated data and then generates a large number of training instances for intent detection and slot filling. Experimental results on four datasets show that the proposed technique significantly outperforms existing data augmentation methods in terms of both accuracy and generalization ability.
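
To make the general idea of entity-aware augmentation concrete, the following is a minimal sketch, not the EAST construction described in the paper (whose details are not given in this abstract). It assumes BIO-tagged utterances, replaces each entity span with a slot placeholder to obtain a template, and then refills the templates with every entity value observed for that slot. The data in `ANNOTATED` and the function names `delexicalize` and `augment` are hypothetical and chosen only for illustration.

```python
import itertools
from collections import defaultdict

# Hypothetical annotated examples: (intent, tokens, BIO slot tags).
ANNOTATED = [
    ("book_flight",
     ["fly", "to", "new", "york", "on", "monday"],
     ["O", "O", "B-city", "I-city", "O", "B-date"]),
    ("book_flight",
     ["book", "a", "ticket", "to", "boston"],
     ["O", "O", "O", "O", "B-city"]),
]


def delexicalize(tokens, tags):
    """Split an utterance into a template and the entity spans it contained."""
    template, entities = [], []
    i = 0
    while i < len(tokens):
        if tags[i].startswith("B-"):
            slot = tags[i][2:]
            j = i + 1
            # Extend the span over the following I-<slot> tags.
            while j < len(tokens) and tags[j] == "I-" + slot:
                j += 1
            entities.append((slot, tuple(tokens[i:j])))
            template.append(("SLOT", slot))
            i = j
        else:
            template.append(("WORD", tokens[i]))
            i += 1
    return template, entities


def augment(annotated):
    """Refill every template with every known value of each of its slots."""
    values = defaultdict(set)
    templates = []
    for intent, tokens, tags in annotated:
        template, entities = delexicalize(tokens, tags)
        templates.append((intent, template))
        for slot, span in entities:
            values[slot].add(span)

    generated = []
    for intent, template in templates:
        slots = [name for kind, name in template if kind == "SLOT"]
        choices = [sorted(values[name]) for name in slots]
        for combo in itertools.product(*choices):
            fill = iter(combo)
            tokens, tags = [], []
            for kind, item in template:
                if kind == "WORD":
                    tokens.append(item)
                    tags.append("O")
                else:
                    span = next(fill)
                    tokens.extend(span)
                    tags.extend(["B-" + item] + ["I-" + item] * (len(span) - 1))
            generated.append((intent, tokens, tags))
    return generated


if __name__ == "__main__":
    for intent, tokens, tags in augment(ANNOTATED):
        print(intent, " ".join(tokens), tags)
```

The output includes the original utterances alongside the new ones, and slot labels stay aligned with the substituted tokens, which is the property that entity-agnostic augmentation (for example, random word swaps) does not guarantee.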

research
05/12/2022

TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding

Data augmentation is an effective approach to tackle over-fitting. Many ...
research
08/25/2023

ChatGPT as Data Augmentation for Compositional Generalization: A Case Study in Open Intent Detection

Open intent detection, a crucial aspect of natural language understandin...
research
12/04/2020

Delexicalized Paraphrase Generation

We present a neural model for paraphrasing and train it to generate dele...
research
05/26/2023

GDA: Generative Data Augmentation Techniques for Relation Extraction Tasks

Relation extraction (RE) tasks show promising performance in extracting ...
research
08/19/2021

Augmenting Slot Values and Contexts for Spoken Language Understanding with Pretrained Models

Spoken Language Understanding (SLU) is one essential step in building a ...
research
01/07/2022

Semantic-based Data Augmentation for Math Word Problems

It's hard for neural MWP solvers to deal with tiny local variances. In M...
research
09/14/2022

vec2text with Round-Trip Translations

We investigate models that can generate arbitrary natural language text ...