Deep F-measure Maximization for End-to-End Speech Understanding

08/08/2020
by   Leda Sarı, et al.
0

Spoken language understanding (SLU) datasets, like many other machine learning datasets, usually suffer from the label imbalance problem. Label imbalance usually causes the learned model to replicate similar biases at the output which raises the issue of unfairness to the minority classes in the dataset. In this work, we approach the fairness problem by maximizing the F-measure instead of accuracy in neural network model training. We propose a differentiable approximation to the F-measure and train the network with this objective using standard backpropagation. We perform experiments on two standard fairness datasets, Adult, and Communities and Crime, and also on speech-to-intent detection on the ATIS dataset and speech-to-image concept classification on the Speech-COCO dataset. In all four of these tasks, F-measure maximization results in improved micro-F1 scores, with absolute improvements of up to 8 cross-entropy loss function. In the two multi-class SLU tasks, the proposed approach significantly improves class coverage, i.e., the number of classes with positive recall.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/26/2022

Towards Reducing the Need for Speech Training Data To Build Spoken Language Understanding Systems

The lack of speech data annotated with labels required for spoken langua...
research
05/22/2023

Zero-Shot End-to-End Spoken Language Understanding via Cross-Modal Selective Self-Training

End-to-end (E2E) spoken language understanding (SLU) is constrained by t...
research
01/28/2022

Improving End-to-End Models for Set Prediction in Spoken Language Understanding

The goal of spoken language understanding (SLU) systems is to determine ...
research
06/16/2021

End-to-End Spoken Language Understanding for Generalized Voice Assistants

End-to-end (E2E) spoken language understanding (SLU) systems predict utt...
research
11/24/2022

Bidirectional Representations for Low Resource Spoken Language Understanding

Most spoken language understanding systems use a pipeline approach compo...
research
02/21/2023

Advancing Stuttering Detection via Data Augmentation, Class-Balanced Loss and Multi-Contextual Deep Learning

Stuttering is a neuro-developmental speech impairment characterized by u...
research
10/14/2020

Multi-class segmentation under severe class imbalance: A case study in roof damage assessment

The task of roof damage classification and segmentation from overhead im...

Please sign up or login with your details

Forgot password? Click here to reset