Data Augmentation for Training Dialog Models Robust to Speech Recognition Errors

06/10/2020
by Longshaokan Wang, et al.

Speech-based virtual assistants, such as Amazon Alexa, Google Assistant, and Apple Siri, typically convert users' audio signals to text through automatic speech recognition (ASR) and feed the text to downstream dialog models for natural language understanding and response generation. The ASR output is error-prone; however, the downstream dialog models are often trained on error-free text, which makes them sensitive to ASR errors at inference time. To bridge this gap and make dialog models more robust to ASR errors, we leverage an ASR error simulator to inject noise into the error-free text data, and subsequently train the dialog models on the augmented data. Compared to other approaches for handling ASR errors, such as using ASR lattices or end-to-end methods, our data augmentation approach requires no modification to the ASR or downstream dialog models, and it introduces no additional latency at inference time. We perform extensive experiments on benchmark data and show that our approach improves the performance of downstream dialog models in the presence of ASR errors, and that it is particularly effective in low-resource situations where model size is constrained or training data is scarce.
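The augmentation idea described in the abstract can be illustrated with a toy sketch. This is not the authors' ASR error simulator: the confusion table `CONFUSIONS`, the filler list `FILLERS`, the function names `simulate_asr_errors` and `augment`, and the uniform choice among substitution, deletion, and insertion are all assumptions made here purely for illustration. The sketch only shows the general pattern of injecting word-level noise into clean text and adding the noisy copies, with their original labels, to the training set.

```python
import random

# Hypothetical confusion table (an assumption for illustration only):
# each word maps to similar-sounding words an ASR system might output.
CONFUSIONS = {
    "weather": ["whether"],
    "to": ["two", "too"],
    "play": ["pay", "lay"],
    "for": ["four"],
}

# Hypothetical filler words used to mimic insertion errors.
FILLERS = ["uh", "a", "the"]


def simulate_asr_errors(text, error_rate=0.2, rng=None):
    """Return a noisy copy of `text` with word-level substitution,
    deletion, and insertion errors applied at roughly `error_rate`."""
    rng = rng or random.Random()
    noisy = []
    for word in text.split():
        # Keep the word unchanged with probability 1 - error_rate.
        if rng.random() >= error_rate:
            noisy.append(word)
            continue
        op = rng.choice(["substitute", "delete", "insert"])
        if op == "substitute" and word in CONFUSIONS:
            noisy.append(rng.choice(CONFUSIONS[word]))
        elif op == "delete":
            continue  # drop the word entirely
        elif op == "insert":
            noisy.append(word)
            noisy.append(rng.choice(FILLERS))
        else:
            noisy.append(word)  # no known confusion: leave the word as is
    return " ".join(noisy)


def augment(examples, copies=1, error_rate=0.2, seed=0):
    """Return the clean (text, label) pairs plus `copies` noisy versions
    of each pair. Labels are carried over unchanged."""
    rng = random.Random(seed)
    augmented = list(examples)
    for text, label in examples:
        for _ in range(copies):
            augmented.append((simulate_asr_errors(text, error_rate, rng), label))
    return augmented


if __name__ == "__main__":
    clean = [
        ("play some jazz for me", "PlayMusic"),
        ("what is the weather today", "GetWeather"),
    ]
    for text, label in augment(clean, copies=2, error_rate=0.3):
        print(label, "|", text)
```

Because the noise is applied only to the training text, a sketch like this leaves both the recognizer and the dialog model architecture untouched, which matches the abstract's point that the approach adds no inference-time latency. A realistic simulator would draw its errors from an acoustic or phonetic confusion model rather than a hand-written table.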

research
12/10/2021

Revisiting the Boundary between ASR and NLU in the Age of Conversational Dialog Systems

As more users across the world are interacting with dialog agents in the...
research
06/08/2023

Speech-to-Text Adapter and Speech-to-Entity Retriever Augmented LLMs for Speech Understanding

Large Language Models (LLMs) have been applied in the speech domain, oft...
research
11/08/2019

Investigation of Error Simulation Techniques for Learning Dialog Policies for Conversational Error Recovery

Training dialog policies for speech-based virtual assistants requires a ...
research
08/18/2020

Are Neural Open-Domain Dialog Systems Robust to Speech Recognition Errors in the Dialog History? An Empirical Study

Large end-to-end neural open-domain chatbots are becoming increasingly p...
research
09/22/2017

Mitigating the Impact of Speech Recognition Errors on Chatbot using Sequence-to-Sequence Model

We apply sequence-to-sequence model to mitigate the impact of speech rec...
research
12/11/2017

Learning Robust Dialog Policies in Noisy Environments

Modern virtual personal assistants provide a convenient interface for co...
research
07/22/2022

ASR Error Detection via Audio-Transcript entailment

Despite improved performances of the latest Automatic Speech Recognition...