Robustness Testing of Language Understanding in Dialog Systems

by Jiexi Liu, et al.

Most language understanding models in dialog systems are trained on a small amount of annotated data and evaluated on a small test set drawn from the same distribution. However, these models can cause system failures or undesirable outputs when exposed to natural perturbations in practice. In this paper, we conduct a comprehensive evaluation and analysis of the robustness of natural language understanding models, and introduce three important aspects of language understanding in real-world dialog systems: language variety, speech characteristics, and noise perturbation. We propose LAUG, a model-agnostic toolkit that approximates natural perturbations for testing robustness in dialog systems. LAUG assembles four data augmentation approaches covering the three aspects and reveals critical robustness issues in state-of-the-art models. The dataset augmented through LAUG can be used to facilitate future research on robustness testing of language understanding in dialog systems.
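As a rough illustration of what model-agnostic robustness testing under noise perturbation can look like, the sketch below compares a model's accuracy on clean utterances with its accuracy on perturbed copies. It is not LAUG's API: `add_typos`, `keyword_intent`, and `accuracy_clean_vs_noisy` are hypothetical names, the adjacent-letter swap is only an assumed stand-in for one kind of noise, and the keyword classifier is a toy model chosen so the example runs on its own.

```python
import random
from typing import Callable, List, Tuple


def add_typos(utterance: str, rate: float, seed: int = 0) -> str:
    """Swap adjacent letters with probability `rate` (assumed noise model)."""
    rng = random.Random(seed)
    chars = list(utterance)
    i = 0
    while i < len(chars) - 1:
        both_letters = chars[i].isalpha() and chars[i + 1].isalpha()
        if both_letters and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the swapped pair
        else:
            i += 1
    return "".join(chars)


def keyword_intent(utterance: str) -> str:
    """Toy intent classifier standing in for a real NLU model."""
    text = utterance.lower()
    if "hotel" in text:
        return "find_hotel"
    if "train" in text:
        return "find_train"
    return "unknown"


def accuracy_clean_vs_noisy(
    model: Callable[[str], str],
    examples: List[Tuple[str, str]],
    perturb: Callable[[str], str],
) -> Tuple[float, float]:
    """Return (accuracy on clean inputs, accuracy on perturbed inputs)."""
    clean = sum(model(u) == y for u, y in examples) / len(examples)
    noisy = sum(model(perturb(u)) == y for u, y in examples) / len(examples)
    return clean, noisy


examples = [("book a hotel", "find_hotel"), ("I need a train", "find_train")]
clean_acc, noisy_acc = accuracy_clean_vs_noisy(
    keyword_intent, examples, lambda u: add_typos(u, rate=1.0)
)
print(clean_acc, noisy_acc)
```

Because the harness only calls `model` as a function from utterance to label, any classifier can be plugged in without changes, which is the sense in which such a test is model-agnostic. The gap between the two accuracies is one simple way to quantify sensitivity to the chosen perturbation.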



