Data balancing for boosting performance of low-frequency classes in Spoken Language Understanding

08/06/2020
by Judith Gaspers, et al.

Although data imbalance is increasingly common in real-world Spoken Language Understanding (SLU) applications, it has not been studied extensively in the literature. To the best of our knowledge, this paper presents the first systematic study on handling data imbalance for SLU. In particular, we discuss the application of existing data balancing techniques to SLU and propose a multi-task SLU model for intent classification and slot filling. To avoid over-fitting, our model leverages data balancing methods indirectly, via an auxiliary task that uses a class-balanced batch generator and (possibly) synthetic data. Our results on a real-world dataset indicate that i) our proposed model can significantly boost performance on low-frequency intents while avoiding a potential performance decrease on the head intents; ii) synthetic data are beneficial for bootstrapping new intents when realistic data are not available; but iii) once a certain amount of realistic data becomes available, using synthetic data only in the auxiliary task yields better performance than adding them to the primary task training data; and iv) in a joint training scenario, balancing the intent distribution individually improves not only intent classification but also slot filling performance.
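
The abstract does not say how the class-balanced batch generator is built, so the following is only a minimal sketch of one common approach: draw the intent uniformly at random first, then draw an utterance of that intent. The function name `class_balanced_batches`, its parameters, and the toy intents `PlayMusic` and `SetAlarm` are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def class_balanced_batches(examples, batch_size, num_batches, seed=0):
    """Yield batches whose intents are sampled uniformly (hypothetical helper).

    `examples` is a list of (utterance, intent) pairs. Rare intents are
    sampled with replacement, so their utterances repeat across batches.
    """
    rng = random.Random(seed)
    by_intent = defaultdict(list)
    for utterance, intent in examples:
        by_intent[intent].append(utterance)
    intents = sorted(by_intent)  # fixed order keeps sampling reproducible
    for _ in range(num_batches):
        batch = []
        for _ in range(batch_size):
            intent = rng.choice(intents)               # uniform over intents
            utterance = rng.choice(by_intent[intent])  # uniform within intent
            batch.append((utterance, intent))
        yield batch


# Toy, made-up data: one head intent (98 examples), one tail intent (2 examples).
toy_examples = [(f"play song {i}", "PlayMusic") for i in range(98)]
toy_examples += [("wake me at six", "SetAlarm"), ("set an alarm", "SetAlarm")]

batches = list(class_balanced_batches(toy_examples, batch_size=20, num_batches=100))
intent_counts = Counter(intent for batch in batches for _, intent in batch)
print(intent_counts)
```

In this sketch the tail intent makes up roughly half of the sampled examples even though it is 2% of the raw data. Because the tail utterances are heavily repeated, feeding such batches directly to the primary task could encourage over-fitting, which is consistent with the abstract's motivation for confining balancing to an auxiliary task.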


research
09/05/2019

A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding

Intent detection and slot filling are two main tasks for building a spok...
research
09/06/2016

Joint Online Spoken Language Understanding and Language Modeling with Recurrent Neural Networks

Speaker intent detection and semantic slot filling are two critical task...
research
05/15/2021

From Masked Language Modeling to Translation: Non-English Auxiliary Tasks Improve Zero-shot Spoken Language Understanding

The lack of publicly available evaluation data for low-resource language...
research
05/18/2023

Generalized Multiple Intent Conditioned Slot Filling

Natural language understanding includes the tasks of intent detection (i...
research
10/07/2022

A Unified Framework for Multi-intent Spoken Language Understanding with prompting

Multi-intent Spoken Language Understanding has great potential for wides...
research
11/09/2021

NATURE: Natural Auxiliary Text Utterances for Realistic Spoken Language Evaluation

Slot-filling and intent detection are the backbone of conversational age...
research
05/15/2018

Marrying up Regular Expressions with Neural Networks: A Case Study for Spoken Language Understanding

The success of many natural language processing (NLP) tasks is bound by ...
