Advancing Stuttering Detection via Data Augmentation, Class-Balanced Loss and Multi-Contextual Deep Learning

02/21/2023
by Shakeel A. Sheikh, et al.

Stuttering is a neuro-developmental speech impairment characterized by uncontrolled utterances (interjections) and core behaviors (blocks, repetitions, and prolongations), and is caused by the failure of speech sensorimotors. Due to its complex nature, stuttering detection (SD) is a difficult task. If detected at an early stage, it could help speech therapists observe and correct the speech patterns of persons who stutter (PWS). Stuttered speech from PWS is usually available only in limited amounts and is highly imbalanced across classes. To this end, we address the class imbalance problem in the SD domain via a multi-branching (MB) scheme and by weighting the contribution of classes in the overall loss function, resulting in a large improvement in the stuttering classes on the SEP-28k dataset over the baseline (StutterNet). To tackle data scarcity, we investigate the effectiveness of data augmentation on top of the multi-branched training scheme. The augmented training outperforms the MB StutterNet (clean) by a relative margin of 4.18% in macro F1-score (F1). In addition, we propose a multi-contextual (MC) StutterNet, which exploits different contexts of the stuttered speech, resulting in an overall improvement of 4.48% in F1 over the single-context MB StutterNet. Finally, we show that applying data augmentation in the cross-corpora scenario can improve the overall SD performance by a relative margin of 13.23% in F1 over the clean training.
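To illustrate what weighting the contribution of classes in the loss can look like, here is a minimal Python sketch. It is not the authors' implementation: the inverse-frequency weighting rule, the function names, and the toy labels ("fluent", "block") are all assumptions made for illustration, and the abstract does not specify the exact weighting scheme used.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, larger for rarer classes.

    Assumed rule (not taken from the paper): w_c = N / (K * n_c),
    where N is the number of samples, K the number of classes,
    and n_c the number of samples of class c.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Class-weighted cross-entropy, normalized by the sum of weights.

    probs  : list of dicts mapping class name -> predicted probability
    labels : list of true class names, same length as probs
    weights: dict mapping class name -> weight
    """
    total = 0.0
    weight_sum = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        total += -w * math.log(p[y])
        weight_sum += w
    return total / weight_sum


# Hypothetical, heavily imbalanced toy data: 8 fluent clips, 2 blocks.
toy_labels = ["fluent"] * 8 + ["block"] * 2
toy_weights = inverse_frequency_weights(toy_labels)

# A classifier that is always unsure (0.5 / 0.5) for every clip.
toy_probs = [{"fluent": 0.5, "block": 0.5} for _ in toy_labels]
toy_loss = weighted_cross_entropy(toy_probs, toy_labels, toy_weights)
```

With these weights, each of the two "block" clips contributes four times as much to the loss as a "fluent" clip, so mistakes on the rare class are no longer drowned out by the majority class.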


