Data Augmentation for Voice-Assistant NLU using BERT-based Interchangeable Rephrase

04/16/2021
by Akhila Yerukola, et al.

We introduce a data augmentation technique based on byte pair encoding and a BERT-like self-attention model to boost performance on spoken language understanding tasks. We compare and evaluate this method against a range of augmentation techniques, encompassing generative models such as VAEs and performance-boosting techniques such as synonym replacement and back-translation. We show our method performs strongly on domain and intent classification tasks for a voice assistant, as well as in a user study focused on utterance naturalness and semantic similarity.
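
The paper's method interchanges byte-pair-encoded spans using a BERT-like self-attention model. As a rough illustration of the underlying mask-and-fill idea, the sketch below uses the Hugging Face fill-mask pipeline to rephrase voice-assistant utterances. The model choice (bert-base-uncased), single-token masking, and sampling strategy are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of mask-and-fill rephrase augmentation with a pretrained
# masked language model. NOT the paper's method, which operates on
# byte-pair-encoded spans; this masks one whitespace token at a time.
import random
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def rephrase(utterance: str, num_variants: int = 3) -> list:
    """Mask one token at a time and let BERT propose in-context replacements."""
    tokens = utterance.split()
    variants = set()
    for _ in range(num_variants * 2):  # oversample, then deduplicate
        i = random.randrange(len(tokens))  # random position to mask
        masked = tokens.copy()
        masked[i] = fill_mask.tokenizer.mask_token
        # Take the first top-k prediction that differs from the original token.
        for pred in fill_mask(" ".join(masked), top_k=5):
            candidate = pred["token_str"].strip()
            if candidate.lower() != tokens[i].lower():
                new = tokens.copy()
                new[i] = candidate
                variants.add(" ".join(new))
                break
        if len(variants) >= num_variants:
            break
    return sorted(variants)

# Example: augmenting a voice-assistant utterance for intent classification.
print(rephrase("play some relaxing music in the kitchen"))
```

In a real augmentation pipeline, generated candidates would additionally be filtered for label preservation (e.g. by a classifier or semantic-similarity threshold) before being added to the domain and intent classification training set.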

Related research

04/29/2020 · Data Augmentation for Spoken Language Understanding via Pretrained Models
The training of spoken language understanding (SLU) models often faces t...

01/31/2019 · EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks
We present EDA: easy data augmentation techniques for boosting performan...

08/11/2020 · Surgical Mask Detection with Convolutional Neural Networks and Data Augmentations on Spectrograms
In many fields of research, labeled datasets are hard to acquire. This i...

08/25/2023 · ChatGPT as Data Augmentation for Compositional Generalization: A Case Study in Open Intent Detection
Open intent detection, a crucial aspect of natural language understandin...

08/28/2019 · Data Augmentation with Atomic Templates for Spoken Language Understanding
Spoken Language Understanding (SLU) converts user utterances into struct...

04/27/2021 · Semi-supervised Interactive Intent Labeling
Building the Natural Language Understanding (NLU) modules of task-orient...

09/20/2023 · AttentionMix: Data augmentation method that relies on BERT attention mechanism
The Mixup method has proven to be a powerful data augmentation technique...
